Requirements

AWS Services

The following list of AWS services must be available and usable in the targeted AWS Account where a CCME cluster is supposed to be configured and used.

AWS ParallelCluster v3 services

The services required by AWS ParallelCluster v3 are described in the official AWS ParallelCluster online documentation. They are not repeated here; refer to that documentation for the full list.

Additional mandatory AWS services

There are a few additional services and tools that are required for CCME:

  • AWS Simple Notification Service (SNS): service used to send email or SMS notifications about cluster-related events

  • AWS Budgets: service used to configure budgets and send alerts (through AWS SNS) when costs related to AWS resource consumption reach predefined thresholds

  • AWS Cost Explorer: service used to monitor, in near real time, the costs related to AWS resource consumption in the account

  • AWS Cost and Usage Report: service used to generate cost and usage reports and store them automatically in an S3 bucket

  • AWS Price List: service used to provide up-to-date prices of AWS resources at any time to the prediction tools running in the backend of the clusters

  • AWS Directory Service: service used to create and manage user directories to be attached to clusters for user authentication

  • AWS Key Management Service (KMS): service used to deliver encryption keys for all at-rest and in-transit data in the cloud

  • AWS Secrets Manager: service used to securely store critical information such as passwords for privileged accounts

  • AWS Certificate Manager: service used to store and manage security certificates in the cloud

  • AWS Elastic Load Balancing: service used to deliver scalable Load Balancing and proxying capabilities

  • NICE DCV: AWS software used to deliver remote visualization on Linux and Windows desktops

  • NICE EnginFrame: AWS software used to deliver HPC-as-a-Service in a web user interface

Additional optional AWS services

Optional services that UCit strongly recommends enabling in the AWS account where the customer’s HPC clusters should run:

  • AWS Systems Manager (SSM): service that can be used to manage EC2 instances at system level, including for example SSH-like access through HTTPS

  • AWS Simple Queue Service (SQS): service that can be used to build up execution workflows by chaining tasks together, loosely and asynchronously, through open messages

  • Amazon Relational Database Service (RDS): service that can be used to store scalable and highly available databases in the cloud

  • AWS CloudTrail: service that can be used to record and store any API call executed within the AWS account

  • AWS Athena: service that can be used to query and filter CloudTrail logs stored in an S3 bucket

  • AWS Cloud9: service that can be used to build up customer-specific IDEs in the cloud

  • AWS WAF & Shield: services that can be used to improve security at the level of the AWS account’s user access points

  • AWS Backup: service that can be used to create cloud-based backups of critical data stored in the cloud

  • AWS Snow Family: service that can be used to import large amounts of data to the cloud (mainly when live network-based transfers are not possible)

  • AWS Billing: service that can be used to retrieve detailed billing information on the AWS account

AWS Environment

S3 Storage

For CCME storage needs, it is necessary to create 2 dedicated private buckets with the AWS S3 service:

  • One for the CCME solution (scripts, templates and configuration files), defined as follows:

    • Name (example): ucit-ccme-internals-eu-west-1

    • Default properties

  • One for the long-term storage of data, defined as follows:

    • Name (example): ucit-ccme-userdata-eu-west-1

    • Default properties

  • For both of those buckets:

    • Block public access: block all

      • BlockPublicAcls: True

      • BlockPublicPolicy: True

      • IgnorePublicAcls: True

      • RestrictPublicBuckets: True

    • Versioning configuration: Enabled

    • Bucket Encryption: Enable encryption by default using AES256

    • S3 Bucket Policy allowing only requests made over TLS 1.2 or later

      The policies below use EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME as a placeholder for the bucket name.

      • Format as YAML:

        Version: '2012-10-17'
        Id: CCMEBucketPolicy
        Statement:
          - Sid: enforce-tls-12-requests-only
            Effect: Deny
            Principal:
              AWS: '*'
            Action: '*'
            Resource:
              - 'arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME/*'
              - 'arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME'
            Condition:
              NumericLessThan:
                's3:TlsVersion': '1.2'
          - Sid: enforce-tls-requests-only
            Effect: Deny
            Principal:
              AWS: '*'
            Action: '*'
            Resource:
              - 'arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME/*'
              - 'arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME'
            Condition:
              Bool:
                'aws:SecureTransport': 'false'
        
      • Format as JSON:

        {
            "Version": "2012-10-17",
            "Id": "CCMEBucketPolicy",
            "Statement": [
                {
                    "Sid": "enforce-tls-12-requests-only",
                    "Effect": "Deny",
                    "Principal": {
                        "AWS": "*"
                    },
                    "Action": "*",
                    "Resource": [
                        "arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME/*",
                        "arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME"
                    ],
                    "Condition": {
                        "NumericLessThan": {
                            "s3:TlsVersion": "1.2"
                        }
                    }
                },
                {
                    "Sid": "enforce-tls-requests-only",
                    "Effect": "Deny",
                    "Principal": {
                        "AWS": "*"
                    },
                    "Action": "*",
                    "Resource": [
                        "arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME/*",
                        "arn:aws:s3:::EXAMPLE_OF_CCME_CLUSTER_S3_BUCKET_NAME"
                    ],
                    "Condition": {
                        "Bool": {
                            "aws:SecureTransport": "false"
                        }
                    }
                }
            ]
        }
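
The policy above can also be generated per bucket instead of being edited by hand. The following is a minimal sketch, not part of CCME: the function names `build_tls_bucket_policy` and `apply_bucket_settings` are hypothetical, and the second function assumes that boto3 is installed, that credentials are configured, and that the bucket already exists.

```python
import json


def build_tls_bucket_policy(bucket_name: str) -> dict:
    """Return the TLS-enforcing bucket policy shown above for one bucket."""
    resources = [
        f"arn:aws:s3:::{bucket_name}/*",
        f"arn:aws:s3:::{bucket_name}",
    ]

    def deny(sid: str, condition: dict) -> dict:
        # Both statements share everything except their Sid and Condition.
        return {
            "Sid": sid,
            "Effect": "Deny",
            "Principal": {"AWS": "*"},
            "Action": "*",
            "Resource": resources,
            "Condition": condition,
        }

    return {
        "Version": "2012-10-17",
        "Id": "CCMEBucketPolicy",
        "Statement": [
            deny("enforce-tls-12-requests-only",
                 {"NumericLessThan": {"s3:TlsVersion": "1.2"}}),
            deny("enforce-tls-requests-only",
                 {"Bool": {"aws:SecureTransport": "false"}}),
        ],
    }


def apply_bucket_settings(bucket_name: str) -> None:
    """Apply the settings listed above to an existing bucket (hypothetical helper)."""
    import boto3  # imported lazily so the policy builder works without boto3

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    s3.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )
    s3.put_bucket_policy(
        Bucket=bucket_name,
        Policy=json.dumps(build_tls_bucket_policy(bucket_name)),
    )
```

For instance, `print(json.dumps(build_tls_bucket_policy("ucit-ccme-internals-eu-west-1"), indent=4))` prints the JSON form of the policy for the first example bucket.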
        

Virtual Private Cloud

CCME inherits from AWS ParallelCluster the ability to create, on the fly, the network environment in which each cluster will run.

Alternatively, you can create in advance, with the AWS VPC service, a network environment tailored to the needs of your future CCME clusters. In this case, you can create:

  • VPC with the following attributes:

    • Name (ex: VPC-CCME)

    • CIDR block large enough to include subnets (ex: 10.0.0.0/16)

    • CIDR IPv6: false

    • “Tenancy” parameter: Default

    • Options:

      • DNS Resolution: yes

      • DNS Hostnames: yes

    • Also check that there is a set of DHCP options created by default with the following values:

      • domain-name = <REGION>.compute.internal; domain-name-servers = AmazonProvidedDNS;

      where <REGION> is the AWS Region in which the environment is created, e.g. eu-west-1 for the Ireland region

  • One VPC EndPoint with the following attributes:

    • Name (example): CCME-S3Endpoint

    • Type: Gateway

    • Service: com.amazonaws.<REGION>.s3

  • Subnets with the following attributes:

    • Two FrontEnd subnets

      • Name (example): CCME-FrontEnd-subnet-az1 and CCME-FrontEnd-subnet-az2

      • CIDR compatible with the previously created VPC (ex: 10.0.1.0/24 and 10.0.2.0/24)

      • Availability Zone: The two subnets must be in different availability zones.

      • Note: These subnets must be public if the Application Load Balancer should be accessible from the Internet

    • Two BackEnd subnets

      • Name (example): CCME-BackEnd-subnet-az1 and CCME-BackEnd-subnet-az2

      • CIDR compatible with the previously created VPC (ex: 10.0.3.0/24 and 10.0.4.0/24)

      • Availability Zone: The two subnets must be in different availability zones

      • Note: For security reasons these subnets should be private, but they can be public to allow the CMH and the clusters to have public IP addresses
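
Before creating the subnets, the address plan can be checked for consistency. The sketch below is only an illustration using the Python standard library: the function name `check_subnet_plan` is hypothetical, and the CIDR values are the example values from the list above.

```python
import ipaddress


def check_subnet_plan(vpc_cidr: str, subnet_cidrs: dict) -> list:
    """Return a list of problems found in the plan (empty when it is consistent)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = {name: ipaddress.ip_network(cidr) for name, cidr in subnet_cidrs.items()}
    problems = []

    # Every subnet must fit inside the VPC CIDR block.
    for name, net in subnets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside the VPC block {vpc}")

    # No two subnets may share addresses.
    names = sorted(subnets)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if subnets[first].overlaps(subnets[second]):
                problems.append(f"{first} overlaps {second}")

    return problems


EXAMPLE_PLAN = {
    "CCME-FrontEnd-subnet-az1": "10.0.1.0/24",
    "CCME-FrontEnd-subnet-az2": "10.0.2.0/24",
    "CCME-BackEnd-subnet-az1": "10.0.3.0/24",
    "CCME-BackEnd-subnet-az2": "10.0.4.0/24",
}
```

Calling `check_subnet_plan("10.0.0.0/16", EXAMPLE_PLAN)` returns an empty list for the example values. The check does not cover Availability Zone placement, which still has to be verified separately.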

For the network configuration and internet access of the ALB, CMH and clusters there are three possibilities:

  1. Both FrontEnd subnets and BackEnd subnets are public

  • One Internet Gateway (IGW) with the following attributes:

    • Name (example): CCME-IGW

    • Attached to: The previously created VPC

    • Subnet: none; an Internet Gateway is attached to the VPC itself, and the subnets reach it through their route tables

    • After the creation:

      • Update the route table (rt) of the FrontEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the Internet Gateway

      • Update the route table (rt) of the BackEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the Internet Gateway

  2. FrontEnd subnets are public and BackEnd subnets are private

  • One Internet Gateway (IGW) with the following attributes:

    • Name (example): CCME-IGW

    • Attached to: The previously created VPC

    • Subnet: none; an Internet Gateway is attached to the VPC itself, and the FrontEnd subnets reach it through their route table

    • After the creation:

      • Update the route table (rt) of the FrontEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the Internet Gateway

  • One NAT Gateway with the following attributes:

    • Name (example): CCME-NGW

    • Attached to: The previously created VPC

    • Connectivity type: Public

    • Subnet: You need to create that NAT Gateway within one of the previously created FrontEnd subnets

    • After the creation:

      • Update the route table (rt) of the BackEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the NAT Gateway

  3. Both FrontEnd subnets and BackEnd subnets are private

  • One Public subnet

    • Name (example): CCME-Public-subnet-az1

    • CIDR compatible with the previously created VPC (ex: 10.0.5.0/24)

    • Availability Zone: One of the availability zones used for the FrontEnd and BackEnd subnets.

    • Note: Those subnets must be public if the Application Load Balancer should be accessible through Internet

  • One Internet Gateway (IGW) with the following attributes:

    • Name (example): CCME-IGW

    • Attached to: The previously created VPC

    • Subnet: none; an Internet Gateway is attached to the VPC itself, and the Public subnet reaches it through its route table

    • After the creation:

      • Update the route table (rt) of the Public subnet

        • Destination: 0.0.0.0/0

        • Target: ID of the Internet Gateway

  • One NAT Gateway with the following attributes:

    • Name (example): CCME-NGW

    • Attached to: The previously created VPC

    • Connectivity type: Public

    • Subnet: You need to create that NAT Gateway within one of the previously created Public subnets

    • After the creation:

      • Update the route table (rt) of the FrontEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the NAT Gateway

      • Update the route table (rt) of the BackEnd subnets

        • Destination: 0.0.0.0/0

        • Target: ID of the NAT Gateway
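
The three possibilities differ only in where the default route (0.0.0.0/0) of each subnet group points and in which subnet hosts the NAT Gateway. The table below restates them as data; it is a summary sketch, and the names `ROUTING_OPTIONS` and `public_subnet_groups` are hypothetical.

```python
# Default-route target per subnet group for each of the three options.
# "IGW" = Internet Gateway, "NAT" = NAT Gateway.
ROUTING_OPTIONS = {
    1: {
        "default_route": {"FrontEnd": "IGW", "BackEnd": "IGW"},
        "nat_gateway_subnet": None,  # no NAT Gateway in this option
    },
    2: {
        "default_route": {"FrontEnd": "IGW", "BackEnd": "NAT"},
        "nat_gateway_subnet": "FrontEnd",
    },
    3: {
        "default_route": {"Public": "IGW", "FrontEnd": "NAT", "BackEnd": "NAT"},
        "nat_gateway_subnet": "Public",
    },
}


def public_subnet_groups(option: int) -> list:
    """Return the subnet groups whose default route targets the Internet Gateway."""
    routes = ROUTING_OPTIONS[option]["default_route"]
    return sorted(group for group, target in routes.items() if target == "IGW")
```

A subnet group is public exactly when its default route targets the Internet Gateway, so `public_subnet_groups(2)` yields only the FrontEnd group.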

Secrets Manager

To secure future access to the directory service (Active Directory or LDAP, hosted in AWS or elsewhere), it is necessary to create a “secret” with the Secrets Manager service that stores the password of a user with read rights to the directory service.

The created secret must be of type String and written as plaintext.
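
As an illustration, such a secret could be created with boto3 along the following lines. This is a sketch under assumptions: the helper names, the secret name and the description are hypothetical, and the second function requires boto3 and valid AWS credentials.

```python
def build_secret_request(name: str, password: str) -> dict:
    """Build the arguments for a plaintext string secret (no SecretBinary)."""
    return {
        "Name": name,
        "Description": "Password of the read-only directory user",  # hypothetical
        "SecretString": password,  # stored as a plain string, not as JSON key/value
    }


def create_directory_secret(name: str, password: str) -> str:
    """Create the secret and return its ARN (hypothetical helper)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("secretsmanager")
    response = client.create_secret(**build_secret_request(name, password))
    return response["ARN"]
```

The password is passed through `SecretString` as is, which matches the requirement of a String secret written as plaintext.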

SSL Certificates

To secure access to future CCME clusters, it is necessary to create an HTTPS (X.509) certificate signed by a certification authority, or to import a pre-existing certificate, into the Certificate Manager service. This certificate will be associated with the Application Load Balancer (ALB).

Supported Operating Systems

CCME supports the following OS (Image.Os parameter of ParallelCluster configuration file):

  • CentOS 7 x86_64 (centos7)

  • Amazon Linux 2 x86_64 and aarch64 (alinux2)

  • Red Hat Enterprise Linux 8 x86_64 (rhel8)

CCME uses the default ParallelCluster AMIs when launching a cluster. It is possible to use a custom-made AMI by following the documentation: Building a custom AWS ParallelCluster AMI.

The CCME Management Host supports the following OS (management_host_os parameter of deployment configuration file):

  • Amazon Linux 2 x86_64 and aarch64 (alinux2)

  • Red Hat Enterprise Linux 8 x86_64 (rhel8)
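
The two support lists can be turned into a small pre-flight check run before a deployment. The sketch below simply encodes the lists above; the names `CLUSTER_OS`, `MANAGEMENT_HOST_OS` and `is_supported` are hypothetical and are not part of CCME.

```python
# Supported architectures per OS identifier, as listed above.
CLUSTER_OS = {
    "centos7": {"x86_64"},
    "alinux2": {"x86_64", "aarch64"},
    "rhel8": {"x86_64"},
}
MANAGEMENT_HOST_OS = {
    "alinux2": {"x86_64", "aarch64"},
    "rhel8": {"x86_64"},
}


def is_supported(os_name: str, arch: str, management_host: bool = False) -> bool:
    """Tell whether an OS/architecture pair appears in the relevant list."""
    table = MANAGEMENT_HOST_OS if management_host else CLUSTER_OS
    return arch in table.get(os_name, set())
```

For example, `is_supported("centos7", "x86_64")` holds for clusters, while the same pair is rejected when `management_host=True`.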

Packages and dependencies

Multiple third-party software installers are used by CCME. In the case of a private deployment, the following packages must be placed in:

  • CCME/pkgs/deps/

  • management/pkgs/deps

Please refer to the dependencies.yaml file for the versions of the packages. For all dependencies, two variables are available:

  • arch: x86_64 or aarch64

  • os: el7 or el8

Management Host:
  • aws-cfn-bootstrap-py3-latest.tar.gz

  • amazon-ssm-agent.gpg

  • amazon-ssm-agent.rpm

  • amazon-ssm-agent.rpm.sig

Clusters:
  • Multi-arch:

    • NICE-GPG-KEY

    • MariaDB-Server-GPG-KEY

    • amazon-ssm-agent.gpg

    • enginframe-${enginframe[version]}-r${enginframe[revision]}.jar

  • arch = x86_64:

    • amazon-ssm-agent.rpm

    • amazon-ssm-agent.rpm.sig

    • galera-${galera.os.arch}.rpm

    • MariaDB-client-${mariadb.os.arch}.${arch}.rpm

    • MariaDB-common-${mariadb.os.arch}.${arch}.rpm

    • MariaDB-compat-${mariadb.os.arch}.${arch}.rpm

    • MariaDB-server-${mariadb.os.arch}.${arch}.rpm

    • nice-dcv-gl-${dcv.version}.${dcv.gl}.${os}.${arch}.rpm

    • nice-dcv-gltest-${dcv.version}.${dcv.gltest}.${os}.${arch}.rpm

    • nice-dcv-server-${dcv.version}.${dcv.patch}-1.${os}.${arch}.rpm

    • nice-dcv-session-manager-agent-${dcvsm.version}.${dcvsm.agent_patch}.${os}.${arch}.rpm

    • nice-dcv-session-manager-broker-${dcvsm.version}.${dcvsm.broker_patch}.${os}.noarch.rpm

    • nice-dcv-simple-external-authenticator-${dcv.version}.${dcv.sea}.${os}.${arch}.rpm

    • nice-dcv-web-viewer-${dcv.version}.${dcv.patch}-1.${os}.${arch}.rpm

    • nice-xdcv-${dcv.version}.${dcv.xdcv}.${os}.${arch}.rpm

  • arch = aarch64:

    • amazon-ssm-agent.rpm

    • amazon-ssm-agent.rpm.sig

    • galera-${galera.os.arch}.${arch}.rpm

    • MariaDB-client-${mariadb.os.arch}.${os}.centos.${arch}.rpm

    • MariaDB-common-${mariadb.os.arch}.${os}.centos.${arch}.rpm

    • MariaDB-compat-${mariadb.os.arch}.${os}.centos.${arch}.rpm

    • MariaDB-server-${mariadb.os.arch}.${os}.centos.${arch}.rpm

    • nice-dcv-server-${dcv.version}.${dcv.patch}-1.${os}.${arch}.rpm

    • nice-dcv-session-manager-agent-${dcvsm.version}.${dcvsm.agent_patch}.${os}.${arch}.rpm

    • nice-dcv-session-manager-broker-${dcvsm.version}.${dcvsm.broker_patch}.${os}.noarch.rpm

    • nice-dcv-simple-external-authenticator-${dcv.version}.${dcv.sea}.${os}.${arch}.rpm

    • nice-dcv-web-viewer-${dcv.version}.${dcv.patch}-1.${os}.${arch}.rpm

    • nice-xdcv-${dcv.version}.${dcv.xdcv}.${os}.${arch}.rpm

    • s3fs-fuse-${s3fsfuse.arch}.${os}.${arch}.rpm
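
To check which files are expected for a given platform, the `${...}` placeholders in the names above can be expanded from a set of values. The sketch below is an assumption-laden illustration: the `render` helper is hypothetical, it assumes that the values are available as a nested mapping, it handles only the dotted form (`${dcv.version}`, not `${enginframe[version]}`), and the version strings are dummy placeholders rather than real releases.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render(template: str, values: dict) -> str:
    """Replace each ${a.b} placeholder with values["a"]["b"]."""

    def lookup(match: re.Match) -> str:
        node = values
        for key in match.group(1).split("."):
            node = node[key]  # raises KeyError when a value is missing
        return str(node)

    return _PLACEHOLDER.sub(lookup, template)


# Dummy values for illustration only; the real ones come from dependencies.yaml.
EXAMPLE_VALUES = {
    "arch": "x86_64",
    "os": "el8",
    "dcv": {"version": "X.Y", "patch": "NNNN"},
}

EXAMPLE_NAME = render(
    "nice-dcv-server-${dcv.version}.${dcv.patch}-1.${os}.${arch}.rpm",
    EXAMPLE_VALUES,
)
```

With the dummy values, `EXAMPLE_NAME` is `nice-dcv-server-X.Y.NNNN-1.el8.x86_64.rpm`. A missing value raises `KeyError`, which makes an incomplete set of values visible early.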