Releases

Release 5.0.0 - October 02, 2023

Updated EULA to version 2.3.

NEW

CCME Roles Stack (CRS)

  • Allow the HeadNode and ComputeNodes to perform the following actions on all ALBs

    • elasticloadbalancing:DescribeLoadBalancerAttributes

    • elasticloadbalancing:DescribeListeners

    • elasticloadbalancing:DescribeRules

    • elasticloadbalancing:DescribeTags

  • Allow the Management Host Stack to perform the following actions

    • elasticloadbalancing:AddTags on the ALB created by CCME

    • ec2:CreateTags on all network-interface resources

    • cloudformation:CreateChangeSet
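
In IAM terms, the permissions above can be sketched as policy statements. A minimal illustration (the ARNs, account ID, and region are hypothetical placeholders, not values shipped with CCME):

```python
import json

# Sketch of the permissions described above; account ID, region, and
# resource names are hypothetical placeholders, not CCME defaults.
ALB_ARN = ("arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
           "loadbalancer/app/ccme-alb/1234567890abcdef")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # HeadNode / ComputeNodes: read-only ALB introspection
            "Effect": "Allow",
            "Action": [
                "elasticloadbalancing:DescribeLoadBalancerAttributes",
                "elasticloadbalancing:DescribeListeners",
                "elasticloadbalancing:DescribeRules",
                "elasticloadbalancing:DescribeTags",
            ],
            "Resource": "*",
        },
        {   # Management Host: tag the ALB created by CCME
            "Effect": "Allow",
            "Action": "elasticloadbalancing:AddTags",
            "Resource": ALB_ARN,
        },
        {   # Management Host: tag network interfaces
            "Effect": "Allow",
            "Action": "ec2:CreateTags",
            "Resource": "arn:aws:ec2:*:123456789012:network-interface/*",
        },
        {   # Management Host: create CloudFormation change sets
            "Effect": "Allow",
            "Action": "cloudformation:CreateChangeSet",
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```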

CCME Management Host (CMH)

  • Support Red Hat Enterprise Linux 8 (RHEL8) for the CCME Management Host

  • Add optional proxy, no_proxy, yum repository, and pip repository variables for the CCME Management Host (CMH) and the clusters

  • Add optional custom AMI parameter

  • Add optional security-group parameter

  • Add the possibility to generate custom ParallelCluster configuration files from Jinja2 templates
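
As an illustration of template-driven configuration generation: CCME itself uses Jinja2, but the same substitution idea can be sketched with the standard library's string.Template (the configuration keys and values below are hypothetical, not actual CCME templates):

```python
from string import Template

# Hypothetical ParallelCluster-style snippet; CCME uses Jinja2 templates,
# but string.Template illustrates the same variable-substitution idea.
template = Template("""\
Region: $region
HeadNode:
  InstanceType: $head_instance
  Networking:
    SubnetId: $subnet_id
""")

# Render the template with concrete (placeholder) values.
config = template.substitute(
    region="eu-west-1",
    head_instance="c5.xlarge",
    subnet_id="subnet-0123456789abcdef0",
)
print(config)
```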

Clusters

  • Support Red Hat Enterprise Linux 8 (RHEL8) for Head nodes and Compute nodes

  • CCME Ansible playbooks have been refactored into an Ansible role

  • New Windows visualization fleet, including:

    • A Windows fleet launch template deployed by the CCME Management Host (CMH)

    • New variables prefixed with CCME_WIN_ available for CCME clusters to configure the Windows fleet (AMI, instance type, configuration files…)

  • EnginFrame

    • Dynamically generate an EnginFrame service for remote visualization for:

      • the headnode

      • each dcv* queue in Slurm

      • Windows fleet

    • Renamed the DCVSM cluster name in EnginFrame from headnode to dcvsm, and made its hosts visible in the EnginFrame Hosts service

    • Add the possibility to automatically register all users belonging to the CCME_EF_ADMIN_GROUP OS group as administrators of EnginFrame

    • Add CCME_EF_ADMIN_PASSWORD as an AWS Secret ARN parameter to store the EnginFrame admin (efadmin) password for clusters

  • Add the possibility to encrypt all storage at deployment using multiple KMS keys, through variables prefixed with ccme_kms_

  • Add a custom playbook, example.install-fix-nvidiacve-all.yaml, to fix an Nvidia driver CVE

    • The Nvidia driver version is defined in dependencies.yaml through the parameter nvidia_version (set to 515.48.07)

    • Verify the presence of the Nvidia drivers in the CCME sources for CCME clusters, and download and install them if not present

  • Add default encryption of cluster root volumes in the cluster configuration files

CCME logs are now sent to CloudWatch

  • For the CMH, a new log group named ccme-cmh-<stackID> is created. The following logs are available:

    • CCME logs: /var/log/ccme.*.log

    • Cloud-init and cfn-init logs: /var/log/cloud-init.log, cloud-init-output.log, cfn-init.log

    • System logs: /var/log/messages and syslog

    • SSSD logs: /var/log/sssd/sssd.log and sssd_default.log

  • For each cluster, the logs are sent to the same log group as the cluster's other logs (see Amazon CloudWatch Logs cluster logs). The following logs are now available for each cluster:

    • On Head and Compute nodes:

      • All CCME pre/post/update Bash and Ansible scripts logs, including custom scripts

      • DCV logs (for Compute nodes belonging to a dcv* Slurm partition)

    • On the Headnode:

      • EnginFrame logs

      • DCVSM broker and agent logs

ENHANCEMENTS

Documentation

  • Add an Active Directory user troubleshooting section to the CCME documentation

  • The documentation requirements relating to the AWS network environment have been updated

    • Information relating to the subnet requirements is more explicit

    • Add specifications for the Internet Gateway and Network Gateway in cases with multiple networks

  • The help function of the deployCCME.sh script is more verbose

CCME Management Host (CMH)

  • The resources created by the CMH stack base their names on the CMH stack ID instead of the CMH stack name

  • Upgrading AWS ParallelCluster to version 3.6.0

  • The parameters ccme_bucket and ccme_bucket_subfolder are merged into a new parameter, ccme_bucket_path

Clusters

  • Gnome initial setup is disabled on HeadNode and DCV nodes

  • The ALB URL can be replaced by a custom DNS name using the new CCME_DNS variable. This can be used to redirect both EnginFrame and DCV URLs through CCME_DNS.

  • Improve the robustness and idempotency of the Ansible tasks

  • Upgrading EnginFrame to version 2021.0-r1667

  • Using the native ParallelCluster Nvidia drivers (version 470.141.03) to reduce cluster deployment time by approximately 2 minutes

  • The name of the CCME configuration file available in /opt/CCME/conf/ is now based on the CMH name (e.g., CMH-Myfirstcmh.ccme.conf)

Security

  • Improve security by adding restrictions on CloudFormation usage based on stack tags

  • Improve CCME security by using IMDSv2 tokens to retrieve EC2 metadata
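
IMDSv2 replaces the open metadata endpoint with a short-lived session token obtained through a PUT request, which each subsequent metadata GET must carry in a header. A minimal sketch of the two-step exchange (the helper functions are illustrative, not CCME code):

```python
import urllib.request

IMDS = "http://169.254.169.254"

def imdsv2_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that fetches an IMDSv2 session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )

def metadata_request(token: str, path: str = "instance-id") -> urllib.request.Request:
    """Build a metadata GET request authenticated with the session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )

# On an actual EC2 instance the two requests would be sent in order:
# token = urllib.request.urlopen(imdsv2_token_request(), timeout=2).read().decode()
# instance_id = urllib.request.urlopen(metadata_request(token), timeout=2).read().decode()
```

Because the token is only obtainable from inside the instance via PUT, requests proxied or forged from outside (a common SSRF vector against IMDSv1) fail.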

Other

  • CRS CloudFormation template has been split into several templates to fit into template size restrictions for CloudFormation

  • CMH CloudFormation template has been split into several templates to fit into template size restrictions for CloudFormation

  • CMH stack tags are now propagated to:

    • CMH EBS

    • CMH ActiveDirectory (if created by the CMH stack)

BUG FIXES

CCME Roles Stack (CRS)

  • Fixed iam:PassRole with the parameter CustomIamPathPrefix in the CCME Roles Stack (CRS)

  • Fixed missing optional AWS Route53 policies in the CCME Roles Stack

  • Fixed ec2:RunInstances authorization for compute node deployment in a placement group

  • Fixed tags associated with the CCME Roles Stack (CRS) and CCME Management Host (CMH) deployed with the deployCCME.sh script

  • Fixed the CCME Management Host (CMH) ALB policy to allow updating the Application Load Balancer (ALB) certificate with elasticloadbalancing:ModifyListener

CCME Management Host (CMH)

  • Fixed tag Name for the CCME Management Host (CMH)

  • Fixed the inability to attach multiple EBS volumes to the CMH

  • Fixed optional SNS notification at cluster deployment

    • The CCME Management Host (CMH) parameter CCMESnsALB becomes CCMEAdminSnsTopic

    • The default value is now NONE instead of *

  • On the Management Host, /var/log/ccme.ccme-start.log now correctly displays the logs on individual lines

  • Fixed FSx policies (fsx:Describe*) for deployment with FSxOnTap storage

Clusters

  • Fixed JWT headers decoding when using OIDC authentication

  • Fixed management_stack_role variable description in the deployment configuration file of CCME

  • Fixed dependencies downloading when external repositories take time to respond

  • Fixed url redirection of the EnginFrame logo

  • Fixed installation of the latest S3FS-Fuse version:

    • x86_64 (1.9*)

    • aarch64 (== 1.93-1)

  • Fixed S3FS mount point with aarch64 architecture and IMDSv2

  • Fixed configuration file for ARM clusters

  • Fixed AWS SSM installation when a previous version is already installed

  • Fixed EFA usage on compute nodes with the compute security group deployed by the CCME Management Host (CMH) stack

  • Fixed classic cluster configuration file

  • Fixed /opt/CCME NFS export: compute nodes can now be in different networks and AZs than the headnode

  • Fixed mode for the CCME env file

  • Fixed MariaDB installation with aarch64 for the ALinux 2 OS with the following packages:

    • MariaDB-Server: 10.6.9-1

    • Galera: 4-26.4.12-1

  • Fixed the CCME ALB Lambda policy, explicitly allowing the Lambda to perform:

    • logs:CreateLogStream

    • logs:PutLogEvents

    • elasticloadbalancing:AddTags

  • Fixed compute egress security group

  • Fixed versions for pip packages

  • Fixed retrieval of pricing data through Slurm epilog script: use IMDSv2 to retrieve metadata

  • Fixed host status in EnginFrame when hosts are in the IDLE+CLOUD state in Slurm

Release 4.2.0 - May 17, 2023

NEW

  • CCME is now supported in the AWS Stockholm region (eu-north-1)

  • AWS IAM Roles support for CCME management, Lambdas, and clusters

  • Automated deployment of the CCME Roles Stack (CRS) with deployCCME.sh

ENHANCEMENTS

  • CCME dependency packages are no longer required

  • Upgrading DCV to 2023.0 (15065)

  • Upgrading DCVSM to 2023.0

    • Broker: 392-1

    • Agent: 675-1

BUG FIXES

  • Add fix to anticipate the resolution of a colord profile dialog box issue in virtual DCV sessions

  • Add fix to remove screenlock in Gnome screensaver settings

  • Add fix to force the minimize and maximize buttons to appear in the top right corner of windows in Gnome-based DCV sessions

Release 4.1.0 - April 21, 2023

NEW

  • Amazon SSM is now installed on the CCME Management Host (CMH)

ENHANCEMENTS

  • Separation of public and private subnets for improved security

    • The Application Load Balancer component is created in public subnets, separated from the other components

    • The Active Directory and Management Host components, and the clusters, are now in private subnets

  • Upgrading PCluster to 3.5.0

  • Upgrading Ansible to 4.10.0

  • Upgrading Pip to 23.0.1

  • Support usage of IMDSv2 for CCME clusters

BUG FIXES

  • xorg.conf is now configured correctly for DCV on GPU-equipped instances, with the HardDPMS option set to false (and the UseDisplayDevice option removed)

  • Fixed Amazon SSM usage on clusters, including HeadNodes and ComputeNodes

  • Cluster time-outs are now configurable, including separate variables for HeadNode and ComputeNode

  • Cluster update is now working correctly

  • The first visualization session / first job starting a dynamic node is now executed correctly after node configuration completes

  • Fixed ALB rules creation for DCV nodes when many nodes are deployed at the same time

  • EnginFrame services are no longer reset when the cluster is updated

Release 4.0.0 - January 24, 2023

NEW

  • Multiple S3 Buckets can now be mounted through S3FS

  • Clusters can be deployed in VPCs different from the CCME Management Host's VPC

  • Support pre-existing AWS Application Load Balancer (ALB)

  • Support pre-existing Active Directory internal/external to AWS using LDAP

  • Support list of key:value tags for CCME Management Host and Clusters

  • Integration of custom script execution

  • Support optional authentication with OIDC to the EnginFrame portal

  • Support mounted file systems as the mount point for user homes by setting the fallback_homedir option in sssd

  • Support timezone configuration for CMH and cluster instances
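
The fallback_homedir option mentioned above is a standard sssd domain setting, used when a directory entry carries no home directory attribute. An illustrative fragment (the domain name and path are hypothetical, not CCME defaults):

```ini
# /etc/sssd/sssd.conf (excerpt) -- hypothetical domain name and path
[domain/example.local]
# %u expands to the user name, so each user's home lands on the
# mounted file system even without a homeDirectory attribute in AD.
fallback_homedir = /shared/home/%u
```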

ENHANCEMENTS

  • Enforce TLS requirements in CCME S3 policies

  • Upgrading PCluster to 3.2.0

  • Upgrading Slurm to 21.08.8-2

  • Upgrading DCV to 2022.1 (13067)

  • Upgrading DCVSM to 2022.1

    • Broker: 355-1

    • Agent: 592-1

  • Upgrading EnginFrame to 2021.0-r1646

  • Upgrading Nvidia drivers to 515.48.07

  • Deploy DCV on compute nodes depending on the presence of dcv in the partition name(s)

  • No configuration action is required to start a first pre-configured cluster

  • The cluster policies are now generated by the ManagementHost

  • The cluster component named master has been renamed to headnode

  • Possibility to specify the authorized CIDR for the frontend ALB

  • Automated creation of private S3 bucket to use as the AWS ParallelCluster CustomS3Bucket configuration

  • Management Host public IP configuration can be set to NONE

DEPRECATED

  • Removing CCME Command Line Interface (CCME-CLI) support

  • Removing Ganglia support

Release 3.0.0 - March 23, 2021

NEW

  • Adding a common secured, load-balanced HTTPS entry point for:

    • EnginFrame portal

    • Ganglia

    • DCV sessions

  • Adding dedicated stack to deploy Management Host

  • Adding centralized user authentication through directory services (AD)

    • Secure access to cluster through selected groups in the AD

    • Secure access to Management Host through selected groups in the AD

  • Adding a CCME Command Line Interface for Management Host

    • Start, Stop, Update, Delete a cluster

    • Possibility to set a time-to-live for a cluster

  • Updating HeadNode so that it uses DCVSessionManager as its session viewer

  • Adding documentation

ENHANCEMENTS

  • Upgrading to AWS ParallelCluster 2.10.1

  • Updating Slurm to 20.02.4

  • Upgrading DCV to 2020.2 (9508)

  • Upgrading EnginFrame to 2020.0-r58

  • Adding option to specify on which partition(s) DCV should be deployed

DEPRECATED

  • Removing BeeGFS support

BUG FIXES

  • Fixing S3 Bucket policies