Releases
Release 5.7.0 - August 02, 2024
NEW
Clusters
CCME_SHARED_DIR must now be defined to specify where CCME will store its shared files (e.g., EnginFrame spoolers and sessions directories). In CCME default templates, this value is set to shared, which is the mountpoint of an EFS.
ENHANCEMENTS
Clusters
Password files and keystores are now available only on the HeadNode in /opt/CCME_internal.
Updated troubleshooting section in the documentation.
Added option CCME_EF_MAX_UPLOAD_SIZE to set the maximum size allowed for uploaded files, in bytes (default is 1024 * 1024 * 1024 * 40 (40GB): 42949672960).
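Because CCME_EF_MAX_UPLOAD_SIZE is expressed in bytes, it can be safer to compute the value than to type it by hand. The following is a minimal sketch; the helper name upload_limit_bytes is hypothetical and not part of CCME.

```python
GIB = 1024 ** 3  # number of bytes in one gibibyte


def upload_limit_bytes(size_gib: int) -> int:
    """Convert a size in GiB to the byte count expected by a bytes-based option."""
    if size_gib <= 0:
        raise ValueError("size_gib must be a positive integer")
    return size_gib * GIB


# Value to use for a 40 GiB upload limit.
print(upload_limit_bytes(40))  # 42949672960
```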
BUG FIXES
Clusters
Fixed the startup of slurmctld during a stop/restart of HeadNodes
Fixed AWS SSM gpg key import for Rocky Linux 9 (Rocky9) and RedHat Enterprise Linux 9 (RHEL9)
CCME Management Host (CMH)
Fixed LDAPS URI for CCME Management Host, replacing CCMEAdIPs with CCMEAdUri
Warning
CCME_SHARED_DIR is a new, mandatory parameter. It must be set to a valid mountpoint specified in one of the AWS ParallelCluster SharedStorage entries (see this link for more details).
We recommend that you use a file system that can sustain your needs. EFS can be a good choice to have “unlimited” storage that will automatically grow to your needs.
CCME_SHARED_DIR must not be changed during a cluster update. Updating it may break the EnginFrame setup.
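The mountpoint requirement on CCME_SHARED_DIR can be checked before deployment. The sketch below assumes the ParallelCluster configuration has already been loaded into a dictionary and that each SharedStorage entry carries a MountDir key. The function name is hypothetical, and treating shared and /shared as equivalent is an assumption, not documented CCME behaviour.

```python
def is_valid_shared_dir(shared_dir: str, cluster_config: dict) -> bool:
    """Return True if shared_dir matches the MountDir of a SharedStorage entry."""

    def normalize(path: str) -> str:
        # Assumption: leading and trailing slashes are not significant.
        return path.strip("/")

    mount_dirs = {
        normalize(storage["MountDir"])
        for storage in cluster_config.get("SharedStorage", [])
        if storage.get("MountDir")
    }
    candidate = normalize(shared_dir)
    return bool(candidate) and candidate in mount_dirs


# Example configuration fragment (values are illustrative only).
config = {
    "SharedStorage": [
        {"Name": "shared-efs", "StorageType": "Efs", "MountDir": "/shared"},
    ]
}

print(is_valid_shared_dir("/shared", config))   # True
print(is_valid_shared_dir("/scratch", config))  # False
```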
Warning
The parameter CCMEAdIPs no longer exists in the CCME Management Host stack and deployment configuration files.
You must use CCMEAdUri for any new deployment.
Release 5.6.3 - July 10, 2024
BUG FIXES
CCME Management Host (CMH)
Updated AWS ParallelCluster to version
3.9.3
Release 5.6.2 - July 05, 2024
BUG FIXES
Added lambda:ListTags permissions to ALB Lambda. Previously, permissions on ListTags were required only when using the ListTags API explicitly. However, principals with GetFunction API permissions could still access tag information outputted by the GetFunction call. Beginning July 27, 2024, Lambda will return tags data only when the principal calling GetFunction API has a policy with explicit allow permission on ListTags API.
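As an illustration of the permission change described above, the sketch below builds an IAM policy statement that explicitly allows both lambda:GetFunction and lambda:ListTags. The function ARN is a placeholder and the helper name is hypothetical; the actual policy shipped with CCME is not shown in these notes.

```python
import json


def alb_lambda_tag_statement(function_arn: str) -> dict:
    """Build a statement allowing GetFunction to keep returning tag data."""
    return {
        "Effect": "Allow",
        "Action": ["lambda:GetFunction", "lambda:ListTags"],
        "Resource": function_arn,
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [
        # Placeholder ARN, replace with the ARN of the relevant function.
        alb_lambda_tag_statement(
            "arn:aws:lambda:eu-west-1:123456789012:function:example"
        )
    ],
}

print(json.dumps(policy, indent=2))
```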
Release 5.6.1 - July 04, 2024
ENHANCEMENTS
License Management: print error message if any when calling lmstat.
BUG FIXES
When building a CCME AMI, Ansible facts were not cleaned up, leading to issues if the AMI was used within 2 hours after its creation (due to Ansible facts cache timeout).
Fixed failed condition for sacctmgr with default and efadmin accounts creation
Release 5.6.0 - June 12, 2024
NEW
CCME Roles Stack (CRS)
Allow the Management Host Stack to perform the following actions on the CCME Management Host stack resources:
ec2:DeleteTags
CCME Management Host (CMH)
Support RedHat Enterprise Linux 9 (RHEL9) for the CCME Management Host
Add optional MgtHostVolumeSize for custom CCME Management Host volume size
Clusters
Add sample custom scripts in CCME/custom/:
example.install-ldapfilters-compute.yaml
example.install-paraview-head.yaml
example.install-singularity-all.yaml
example.install-updatenvidiadrivers-all.yaml
Added a script (CCME/sbin/prepare-ami.sh) to create a CCME AMI from an AWS ParallelCluster AMI. The process downloads external packages and preinstalls some of them. See this section for more information.
Support RedHat Enterprise Linux 9 (RHEL9) for Headnodes and Compute nodes with x86_64 and aarch64 architectures
Support Rocky Linux 9 (Rocky9) for Headnodes and Compute nodes with x86_64 and aarch64 architectures
Warning
CCME does not yet support DCV on instances with Nvidia GPU on Rocky Linux 9.
ENHANCEMENTS
CCME Management Host (CMH)
Updated AWS ParallelCluster to version
3.9.2
Clusters
Updated Slurm to version
23.11.7
Updated S3FS-Fuse to version 1.94-1 for aarch64
BUG FIXES
CCME Roles Stack (CRS)
Fix the documentation related to ccme_secret_prefix as CCME Roles Stack parameter
CCME Management Host (CMH)
Fixed secrets usage by removing the secrets_manager_prefix parameter from the deployment.ccme.conf configuration file
Fixed the ParallelCluster configuration file template with a ResourcePrefix depending on the CCME Roles Stack
Fix the authentication on the CCME Management Host with Active Directory users
Fix the home directory on the CCME Management Host with Active Directory users
Clusters
Fixed MariaDB version to 10.6.18 for x86_64
Fixed Fuse3 version to 3.3.0-18 for Rocky
Fixed authentication on EnginFrame portal with aarch64 architecture
Release 5.5.1 - April 30, 2024
NEW
CCME Roles Stack (CRS)
Add a tag named ccme:version to the CCME Roles Stack when deployed with the deployCCME script
CCME Management Host (CMH)
Add a tag named ccme:version to the CCME Management Host stack when deployed with the deployCCME script
Add a tag named ccme:version to the cluster templates
Clusters
Add a variable named CCME_VERSION to the CCME_ENV_FILE and CCME_ENV_SH_FILE
ENHANCEMENTS
CCME tarball is now created without extended file attributes, as it can cause errors with old versions of tar when extracting the archive.
BUG FIXES
CCME Management Host (CMH)
Fixed ccmeLambdaALBLogGroup path for ccmeLambdaALB in CMH template
Release 5.5.0 - April 02, 2024
NEW
Clusters
License management with Slurm: you can now specify a list of licenses hosted by FlexLM license servers that will be added as resources in Slurm and periodically updated by querying the license servers. This allows jobs to be held in the queue until licenses are available.
Added the possibility to monitor inactive Linux DCV sessions, and automatically terminate them after CCME_LIN_INACTIVE_SESSION_TIME seconds. CCME_LIN_INACTIVE_SESSION_TIME can be configured in the ParallelCluster configuration file or directly in EnginFrame services.
CCME Management Host (CMH)
Add ccme_logs_retention_in_days as CCME Management Host deployment parameter to configure how long logs should be retained in CloudWatch.
Enhanced security:
HTTP headers with invalid header fields are now removed by the load balancer (drop_invalid_header_fields is enabled),
Ensure that Load Balancer Listener is using at least TLS v1.2,
Limit the number of concurrent executions of CMH Lambda to 10.
Documentation
Added ActiveDirectory LDAPS documentation.
Added Simple AD management documentation.
ENHANCEMENTS
CCME Management Host (CMH)
Add the Arn of the resources created by the CCME Management Host Stack to the stack output.
Clusters
Use the OnNodeUpdated args instead of OnNodeStart parameters for CCME cluster deployments. This allows updating “any” parameter of CCME during a cluster update, without having to duplicate them.
Updated DCV Server version to 2023.1 and patch to 16388.
Updated DCVSM version to 2023.1 including:
DCVSM Broker to 410-1
DCVSM Agent to 732-1
Added information about NTFS Unix Security Options for ONTAP.
Refresh the Active Directory ReadOnlyUser password when the cluster is updated.
Add multiple CCME environment variable files:
CCME_ENV_FILE: Path to /etc/ccme/ccme.env.yaml
CCME_ENV_SH_FILE: Path to /etc/ccme/ccme.env.sh
Add the following environment variables to the CCME environment variable files (CCME_ENV_FILE and CCME_ENV_SH_FILE):
CCME_NODE_DCV
CCME_NODE_PARTITION
Added examples in CCME/custom/winec2_config.ps1 to:
join an Active Directory,
at user logon, mount ONTAP volumes and export specific variables.
Added the possibility to have a specific CCME/custom/CLUSTERNAME.ef-services.zip (with CLUSTERNAME the name of the cluster) to specify a list of EnginFrame services to publish specifically for this cluster. The CCME/custom/ef-services.zip can still exist, but is ignored if CLUSTERNAME.ef-services.zip is present.
Add two variables to the OpenID (OIDC) cluster configuration:
UsernameAttribute is optional and allows selecting the user attribute used as username,
UserMapping is optional and allows mapping every user to the same portal service account.
Add custom decode_jwt_headers python script in CCME/custom/ for OpenID (OIDC).
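The precedence between the cluster-specific archive and the generic ef-services.zip described in this release can be sketched as follows. This is only an illustration of the documented rule, assuming the list of file names present in CCME/custom/ is already known; the function name is hypothetical and this is not the CCME implementation.

```python
from typing import Iterable, Optional

GENERIC_ARCHIVE = "ef-services.zip"


def select_ef_services_archive(
    files_in_custom: Iterable[str], cluster_name: str
) -> Optional[str]:
    """Pick the EnginFrame services archive to publish for a given cluster."""
    available = set(files_in_custom)
    specific = f"{cluster_name}.ef-services.zip"
    if specific in available:
        # The cluster-specific archive wins; the generic one is ignored.
        return specific
    if GENERIC_ARCHIVE in available:
        return GENERIC_ARCHIVE
    return None


print(select_ef_services_archive(["ef-services.zip", "prod.ef-services.zip"], "prod"))
print(select_ef_services_archive(["ef-services.zip"], "prod"))
```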
Documentation
Updated the ActiveDirectory documentation:
adcli commands
ldapsearch and ldapmodify
ldif files
BUG FIXES
CCME Management Host (CMH)
Fixed download of large files through EnginFrame: increased the ApplicationLoadBalancer idle timeout to 300s.
Fixed usage of the latest version of the launch template for Windows DCV instances.
Automatically clean Windows sessions that have been closed on the instance, but were still visible in EnginFrame in an unknown state.
Clusters
Fixed buffer_pool_size and lock_wait_timeout errors from MariaDB server.
Do not download Slurm tarball when the current version is already the correct one.
Force restart NFS server on HeadNode.
Warning
All CCME variables in AWS ParallelCluster configuration file have been migrated from OnNodeStart.Args to OnNodeUpdated.Args. You should now specify them only in OnNodeUpdated.Args.
Warning
The following variables have been renamed for consistency. If you used any of them in your configurations, please update them:
CCME_EF_ADMIN_PASSWORD has been renamed to CCME_EFADMIN_PASSWORD: ARN of an AWS Secret that stores the password of efadmin,
CCME_EF_ADMIN_SUDOER has been renamed to CCME_EFADMIN_SUDOER: whether efadmin is a sudoer or not,
CCME_EF_ADMIN_USER has been renamed to CCME_EFADMIN_USER: username of efadmin (default efadmin)
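When updating existing configurations, the renames can be applied mechanically. The sketch below assumes the CCME variables are available as a plain name-to-value dictionary and covers only the password and sudoer renames listed above; the function name and sample values are hypothetical.

```python
# Old name -> new name, taken from the list of renamed variables.
RENAMED = {
    "CCME_EF_ADMIN_PASSWORD": "CCME_EFADMIN_PASSWORD",
    "CCME_EF_ADMIN_SUDOER": "CCME_EFADMIN_SUDOER",
}


def migrate_variables(variables: dict) -> dict:
    """Return a copy of variables with the old names replaced by the new ones."""
    migrated = {
        name: value for name, value in variables.items() if name not in RENAMED
    }
    for old_name, new_name in RENAMED.items():
        if old_name in variables:
            # Keep an already-present new name rather than overwriting it.
            migrated.setdefault(new_name, variables[old_name])
    return migrated


old = {"CCME_EF_ADMIN_SUDOER": "true", "CCME_DNS": "hpc.example.com"}
print(migrate_variables(old))
```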
Release 5.4.1 - February 19, 2024
BUG FIXES
Clusters
Fixed EnginFrame publish services when OIDC feature is enabled
Add PrivateData in Scheduling.SlurmSettings.CustomSlurmSettings. It is not sufficient to specify it in CCME_CUSTOM_SLURMDBD_SETTINGS.
Fixed exportfs from HeadNodes to their VPC when they have more than one CIDR
Release 5.4.0 - February 02, 2024
NEW
CCME Roles Stack (CRS)
Allow the Management Host Stack to perform the following actions on the CCME Management Host stack resources, they are needed during an update:
ec2:StopInstances
ec2:ModifyInstanceAttribute
ec2:StartInstances
ec2:ReplaceIamInstanceProfileAssociation
elasticloadbalancing:SetSubnets
lambda:UpdateFunctionCode
lambda:UpdateFunctionConfiguration
lambda:ListTags
sns:SetTopicAttributes
sns:ListSubscriptionsByTopic
CCME Management Host (CMH)
Updated AWS ParallelCluster to version
3.8.0
The CCME Management Host instance type can now be set to any x86_64 instance instead of a limited EC2 list
Added the possibility to update the CCME Management Host stack:
Update policies integration to deployment.ccme.conf.
CCME Management Host update policies documentation in Management.
Clusters
Load EnginFrame services from the CCME/custom/ef-services.zip instead of CCME/pkgs/efservices-slurm.tgz. These services are now automatically published to all users, only if CCME_OIDC is not used.
Add support for the EC2 G5 instance family for DCV partitions.
Add CCME_EFADMIN_SUDOER parameter to configure the efadmin user as a sudoer.
Add CCME_AD_GROUP_SUDOER parameter to configure users of an AD group as sudoers.
Add update policy to each CCME cluster parameter. Most of the CCME_ parameters can now be updated with OnNodeUpdated/Args.
Use generated passwords for the database users instead of hardcoded credentials
Add Rocky Linux 8 support for CCME Clusters. Only with a custom AMI.
ENHANCEMENTS
CCME Roles Stack (CRS)
Allow the Management Host Stack to perform the logs: actions on /ccme/ccme* instead of /aws/lambda/ccme*.
Logs of custom playbooks are now also uploaded in AWS CloudWatch log group /ccme/ccme-cmh-*.
CCME Management Host (CMH)
Updated CMH Amazon Linux 2023 AMI to version
al2023-ami-2023.3.20240131.0-kernel-6.1-x86_64
The CCME Application Load Balancer lambda log group path is changed from /aws/lambda/ccmeLambdaALB-${StackId} to /ccme/ccme-lambda-alb-${StackId}.
The Windows instances use the latest version of their CMH Windows launch template
The CCME Management Host instance uses the latest version of its launch template
The CCME Management Host can no longer be used in (insecure) privileged mode. CRS roles are now mandatory.
The CCME Management Host stack no longer creates policies complementary to those dynamically created by AWS ParallelCluster
The following settings are now mandatory in the CCME management host configuration:
ccme_cluster_lambda_role
ccme_cluster_headnode_instance_profile
ccme_cluster_compute_instance_profile
Clusters
The OIDC prerequisites have been updated to the following versions:
decode:
2.4.0
pyjwt:
2.8.0
requests:
2.31.0
Documentation
Added more details about how to configure FSx for NetApp ONTAP
Added the documentation related to the CCME cluster parameter update policies
Added a link to an AWS Calculator to show an example of the cost of using CCME
Updated documentation related to the Application Load Balancer lambda log
BUG FIXES
CCME Roles Stack (CRS)
Allow the Management Host Stack to perform
secretsmanager:DescribeSecret
for cluster deployment with ActiveDirectory
CCME Management Host (CMH)
The policy
fsx:CreateDataRepositoryAssociation
is now correctly applied
Clusters
Add retries when registering EnginFrame as a client to DCVSM
Set the default value to NONE for all variables set in /etc/ccme/ccme.env.sh.
Ensure that {{ CCME_DEPS_DIR }}/{{ ansible_architecture }} is created when running CCME ansible role
Release 5.3.5 - January 25, 2024
BUG FIXES
CCME Management Host (CMH)
Fixing some bugs during CMH startup on RHEL8
Release 5.3.4 - January 24, 2024
BUG FIXES
CCME Management Host (CMH)
Fixing some bugs during CMH startup on RHEL8
Clusters
Fixed usage of CCME_WIN_INACTIVE_SESSION_TIME, CCME_WIN_NO_SESSION_TIME and CCME_WIN_NO_BROKER_COMMUNICATION_TIME for Windows DCV sessions: a value of 0 didn’t work to deactivate the timers.
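The intended semantics of these timers, where 0 disables the check, can be sketched as below. This is an illustration only: the function name is hypothetical, and expressing the limits in seconds is an assumption based on CCME_LIN_INACTIVE_SESSION_TIME, not something these notes state for the Windows variables.

```python
def timer_expired(elapsed_seconds: float, limit_seconds: int) -> bool:
    """Return True when a session timer has run out.

    A limit of 0 means the timer is deactivated and never expires.
    """
    if limit_seconds < 0:
        raise ValueError("limit_seconds must be zero or positive")
    if limit_seconds == 0:
        return False
    return elapsed_seconds >= limit_seconds


print(timer_expired(3600, 0))    # False: timer deactivated
print(timer_expired(3600, 600))  # True: limit exceeded
```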
Release 5.3.3 - January 23, 2024
BUG FIXES
CCME Management Host (CMH)
Fixing detection of Python 3.9
Release 5.3.2 - January 19, 2024
BUG FIXES
CCME Roles Stack (CRS)
Added parameter CCMEKmsAdditionalKey to specify an additional KMS key that will be accessible from the CMH, HeadNode and Compute nodes (e.g., KMS key used to encrypt AMIs).
Release 5.3.1 - December 14, 2023
BUG FIXES
Updated Slurm to version 23.02.7 to fix CVE-2023-49933 to CVE-2023-49938:
Slurmd Message Integrity Bypass. CVE-2023-49935. Permits an attacker to reuse root-level authentication tokens when interacting with the slurmd process, bypassing the RPC message hashes which protect against malicious MUNGE credential reuse.
Slurm Arbitrary File Overwrite. CVE-2023-49938. Permits an attacker to modify their extended group list used with the sbcast subsystem, and open files with an incorrect set of extended groups.
Slurm NULL Pointer Dereference. CVE-2023-49936. Denial of service.
Slurm Protocol Double Free. CVE-2023-49937. Denial of service, potential for arbitrary code execution.
Slurm Protocol Message Extension. CVE-2023-49933. Allows for malicious modification of RPC traffic that bypasses the message hash checks.
Release 5.3.0 - December 13, 2023
NEW
CCME Management Host (CMH)
The parameter
MgtHostKeyName
is now optional
Clusters
Added options CCME_EFADMIN_ID and CCME_EFNOBODY_ID to the clusters.
Added options CCME_WIN_LAUNCH_TRIGGER_DELAY and CCME_WIN_LAUNCH_TRIGGER_MAX_ROUNDS to configure Windows DCV sessions startup monitoring processes.
Added an example on how to delay DCV and DCVSM services startup on Windows DCV instances to account for custom bootstrap processes.
EnginFrame logo can now be customized by adding a logo.png file in CCME/custom directory.
Configure /etc/idmapd.conf to specify the domain name (when joined to an Active Directory) to allow NFSv4 mounts.
ENHANCEMENTS
Updated documentation on troubleshooting FSx for NetApp ONTAP.
BUG FIXES
Clusters
Fixed usage of CCME_WIN_CUSTOM_CONF_REBOOT.
Fixed Windows DCV sessions when CCME was stored in a subfolder in a bucket.
dcv2 feature was defined twice on dcv* partitions.
Fixed ALB Lambda for instances concurrency at deployment.
Fixed high number of RPC calls from EnginFrame to Slurm, which caused Slurm to sometimes crash.
Release 5.2.0 - November 08, 2023
NEW
CCME Management Host (CMH)
Added option application_load_balancer_ingress_cidr to deployment.ccme.conf to specify a CIDR that is allowed to connect to the ALB.
Other
Added documentation on how to use FSx for NetApp ONTAP to share files between Linux and Windows instances.
ENHANCEMENTS
Clusters
Updated AWS ParallelCluster to version 3.6.1
Retrieve OnNodeUpdated arguments when executing pre-install.sh script (OnNodeStart) in order to allow parameter overriding when updating a cluster.
BUG FIXES
Fixed INTERACTIVE_CLASS name in DCV services, must contain only alphanumeric chars or -_
Fixed health check of target group associated with EnginFrame rules
Fixed cluster-update command: the update-install.sh script ended in error
Removed debug level on SSSD on the Management Host
Note
This version updates the following dependencies for the management host. If you are on a private network without Internet connectivity, you must download the following packages and put them in management/pkgs:
{{ ansible_architecture }}/pcluster-installer-bundle-3.6.1.209-node-v16.19.0-Linux_{{ ansible_architecture }}-signed.zip
Release 5.1.0 - October 20, 2023
NEW
CCME Roles Stack (CRS)
Allow the Management Host Stack to perform action
elasticloadbalancing:ModifyLoadBalancerAttributes
on the ALB created by CCME
CCME Management Host (CMH)
CMH can now use either RHEL8 or Amazon Linux 2023. Amazon Linux 2 has been removed.
Clusters
DCV sessions are now limited by default to 1 session per session type (partition) for each user. This is set for DCV sessions running on the Headnode, any dcv* partition, and Windows instances. This configuration can be updated in the associated EnginFrame services.
New configuration variables:
CCME_CUSTOM_SLURMDBD_SETTINGS to specify additional parameters for SlurmDBD (e.g., security variables such as PrivateData).
CCME_WIN_TAGS to specify additional tags to be set on the instances of the Windows fleet.
Other
Ansible, boto3, AWS CLI have been updated, and Python 3.9 is used on both CMH and clusters for executing Ansible playbooks. This reduces the deployment time by ~100-200s per node.
Application Load Balancer can now store its logs within an S3 bucket named with ccme-alb-logs-${StackId} as prefix
ENHANCEMENTS
Cluster
Removed builtin services in EnginFrame that couldn’t be used with a CCME cluster.
Enable update of Slurm when the version deployed by AWS ParallelCluster is different from the one in
deployment.yaml
The bucket ccme-s3bucket created for clusters is renamed ccme-cluster-${StackId}
Updated DCV to version
2023.15487
BUG FIXES
Fixed cluster deployment on aarch64 architecture
Fixed Slurm CVE-2023-41914 by updating Slurm to version
23.02.6
.
Warning
Fixing Slurm CVE-2023-41914 by updating Slurm to version 23.02.6 requires re-compiling Slurm on the Headnode at deployment time. You need to ensure that the timeout set on the deployment time of your Clusters is high enough to allow this. Update the AWS ParallelCluster configuration files accordingly, such as:
DevSettings:
  Timeouts:
    HeadNodeBootstrapTimeout: 2400
    ComputeNodeBootstrapTimeout: 1800
Release 5.0.0 - October 02, 2023
Updated EULA to version 2.3.
NEW
CCME Roles Stack (CRS)
Allow the HeadNode and ComputeNodes to perform the following actions on all ALB
elasticloadbalancing:DescribeLoadBalancerAttributes
elasticloadbalancing:DescribeListeners
elasticloadbalancing:DescribeRules
elasticloadbalancing:DescribeTags
Allow the Management Host Stack to perform the following actions:
elasticloadbalancing:AddTags on the ALB created by CCME
ec2:CreateTags on all network-interface resources
cloudformation:CreateChangeSet
CCME Management Host (CMH)
Support RedHat Enterprise Linux 8 (RHEL8) for the CCME Management Host
Add optional proxy, no_proxy and pip repository as variables for the CCME Management Host (CMH) and the clusters
Add optional custom AMI parameter
Add optional security-group parameter
Added the possibility to generate custom ParallelCluster configuration files from Jinja2 templates
Clusters
Support RedHat Enterprise Linux 8 (RHEL8) for Headnodes and Compute nodes
CCME Ansible playbooks have been refactored in an Ansible role
New visualization Windows fleet including:
A Windows fleet launch template deployed by the CCME Management Host (CMH)
New variables CCME_WIN_ available for CCME clusters to configure the Windows fleet (AMI, instance type, configuration files…)
EnginFrame
Dynamically generate an EnginFrame service for remote visualization for:
the headnode
each dcv* queue in Slurm
Windows fleet
Renamed the DCVSM cluster name in EnginFrame from headnode to dcvsm, and made its hosts visible in the EnginFrame Hosts service
Add the possibility to automatically register all users belonging to the CCME_EF_ADMIN_GROUP OS group as administrators of EnginFrame
Add CCME_EFADMIN_PASSWORD as AWS Secret arn parameter to store the EnginFrame admin (efadmin) password for clusters
Add the possibility to encrypt all storages using multiple KMS keys at deployment with variables like ccme_kms_
Add a custom playbook to fix Nvidia drivers CVE named example.install-fix-nvidiacve-all.yaml
The Nvidia driver version is defined in dependencies.yaml through the parameter nvidia_version (set to 515.48.07)
)Verify Nvidia Drivers presence for CCME clusters in CCME sources, download and install if not present
Add default encryption for clusters root volumes for the cluster configuration files
CCME logs are now sent to CloudWatch
For the CMH, a new log group named ccme-cmh-<stackID> is created. The following logs are available:
CCME logs: /var/log/ccme.*.log
Cloud init and cfn init logs: /var/log/cloud-init.log, cloud-init-output.log, cfn-init.log
System logs: /var/log/messages and syslog
SSSD logs: /var/log/sssd/sssd.log and sssd_default.log
For each cluster, the logs are sent in the same subgroup as the log group of the cluster (see Amazon CloudWatch Logs cluster logs). The following logs are now available for each cluster:
On Head and Compute nodes:
All CCME pre/post/update Bash and Ansible scripts logs, including custom scripts
DCV logs (for Compute nodes belonging to a dcv* Slurm partition)
On the Headnode:
EnginFrame logs
DCVSM broker and agent logs
ENHANCEMENTS
Documentation
Add Active Directory users troubleshoot section to the CCME documentation
The documentation requirements relating to the AWS network environment have been updated
Information relating to the subnets requirements is more explicit
Add specification for Internet Gateway and Network Gateway depending on multiple networks cases
The help function of the deployCCME.sh script is more verbose
CCME Management Host (CMH)
The resources created by the CMH stack base their name on the CMH stack id instead of CMH stack name
Upgrading AWS ParallelCluster to version
3.6.0
The parameters ccme_bucket and ccme_bucket_subfolder are merged into a new parameter ccme_bucket_path
Clusters
Gnome initial setup is disabled on HeadNode and DCV nodes
The ALB URL can be replaced by a custom DNS name using the new CCME_DNS variable. This can be used to redirect both EnginFrame and DCV URLs through CCME_DNS.
Improve the robustness and idempotency of the Ansible tasks
Upgrading EnginFrame to version 2021.0-r1667
Using native Nvidia Parallelcluster drivers (version 470.141.03) to reduce clusters deployment by approximately 2 minutes
The name of the uploaded CCME configuration file available in /opt/CCME/conf/ is now based on its CMH name (e.g., CMH-Myfirstcmh.ccme.conf)
Security
Improve the security by adding restriction on CloudFormation usage based on stack tags
Improve the CCME security by using IMDS v2 token to retrieve EC2 metadata
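For readers unfamiliar with IMDSv2, metadata retrieval is a two-step exchange: a PUT request obtains a session token, which is then sent as a header on each metadata GET. The sketch below shows that exchange with the standard library; the function names are hypothetical and this is not the code used by CCME. The read_metadata function only works from inside an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build the GET request for a metadata path, authenticated by the token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def read_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value using the token-based (IMDSv2) flow."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as response:
        token = response.read().decode()
    with urllib.request.urlopen(metadata_request(path, token), timeout=timeout) as response:
        return response.read().decode()
```

On an instance, a call such as read_metadata("instance-id") would return the identifier of the current instance.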
Other
CRS CloudFormation template has been split into several templates to fit into template size restrictions for CloudFormation
CMH CloudFormation template has been split into several templates to fit into template size restrictions for CloudFormation
CMH stack tags are now propagated to:
CMH EBS
CMH ActiveDirectory (if created by the CMH stack)
BUG FIXES
CCME Roles Stack (CRS)
Fixed iam:PassRole with the parameter CustomIamPathPrefix in the CCME Roles Stack (CRS)
Fixed missing optional AWS Route53 policies in the CCME Roles Stack
Fixed ec2:RunInstances authorization for the computes deployment in placement-group
Fixed tags associated to the CCME Roles Stack (CRS) and CCME Management Host (CMH) deployed with the deployCCME.sh script
Fixed CCME Management Host (CMH) ALB policy allowing to update the Application Load Balancer (ALB) certificate with elasticloadbalancing:ModifyListener
CCME Management Host (CMH)
Fixed tag Name for the CCME Management Host (CMH)
Fixed no multiple EBS storage on CMH
Fixed optional SNS notification at cluster deployment
The CCME Management Host (CMH) parameter CCMESnsALB becomes CCMEAdminSnsTopic
The default value is now NONE instead of *
On the Management Host /var/log/ccme.ccme-start.log now correctly displays the logs on individual lines
Fixed FSx policies (fsx:Describe*) for deployment with FSxOnTap storage
Clusters
Fixed JWT headers decoding when using OIDC authentication
Fixed management_stack_role variable description in the deployment configuration file of CCME
Fixed dependencies downloading when external repositories take time to respond
Fixed url redirection of the EnginFrame logo
Fixed install of the S3FS-Fuse latest version
x86_64 (1.9*)
aarch64 (== 1.93-1)
Fixed S3FS mount point with aarch64 architecture and IMDSv2
Fixed configuration file for ARM clusters
Fixed AWS SSM installation when a previous version is already installed
Fixed EFA usage on computes with the compute security group deployed with the CCME Management Host (CMH) stack
Fixed classic cluster configuration file
Fixed /opt/CCME NFS export: compute nodes can now be in different networks and AZs than the headnode
Fixed mode for the CCME env file
Fixed MariaDB installation with aarch64 for ALinux 2 OS with the following packages:
MariaDB-Server: 10.6.9-1
Galera: 4-26.4.12-1
Fixed CCME ALB Lambda policy, explicitly allowing the lambda to:
logs:CreateLogStream
logs:PutLogEvents
elasticloadbalancing:AddTags
Fixed compute egress security group
Fixed versions for pip packages
Fixed retrieval of pricing data through Slurm epilog script: use IMDSv2 to retrieve metadata
Fixed hosts status in EnginFrame when they are in IDLE+CLOUD status in Slurm
Release 4.2.0 - May 17, 2023
NEW
CCME now supports the AWS region Stockholm (eu-north-1)
AWS IAM Roles support for CCME management, lambdas and clusters
Automated deployment of the CCME Roles Stack (CRS) with deployCCME.sh
ENHANCEMENTS
CCME dependencies packages are not required anymore
Upgrading DCV to
2023.0 (15065)
Upgrading DCVSM to
2023.0
Broker:
392-1
Agent:
675-1
BUG FIXES
Add fix to anticipate the resolution of a colord profile dialog box issue in virtual DCV sessions
Add fix to remove screenlock in Gnome screensaver settings
Add fix to force the minimize and maximize buttons to appear in the top right corner of the Windows in Gnome-based DCV sessions
Release 4.1.0 - April 21, 2023
NEW
Amazon SSM is now installed on the CCME Management Host (CMH)
ENHANCEMENTS
Separation of public and private subnets for security improvement
The component Application Load Balancer is created in public subnets separated from other components
The components Active Directory, Management Host and the clusters are now in private subnets
Upgrading PCluster to 3.5.0
Upgrading Ansible to 4.10.0
Upgrading Pip to 23.0.1
Support usage of IMDSv2 for CCME clusters
BUG FIXES
xorg.conf is now configured correctly for DCV on instances equipped with GPUs with HardDPMS option set to false (and option UseDisplayDevice removed)
Amazon SSM usage on clusters, including HeadNodes and ComputeNodes
Cluster timeouts, including separate variables for HeadNode and ComputeNode, are now configurable
Cluster update is now working correctly
The first visualization session / the first job starting dynamic node is now executed correctly after the end of node configuration
Fixed ALB rules creation for DCV nodes when lots of nodes are deployed at the same time.
EnginFrame services are not reset when the cluster is updated
Release 4.0.0 - January 24, 2023
NEW
Multiple S3 Buckets can now be mounted through S3FS
Cluster can be deployed in VPCs different than the CCME Management’s VPC
Support pre-existing AWS Application Load Balancer (ALB)
Support pre-existing Active Directory internal/external to AWS using LDAP
Support list of key:value tags for CCME Management Host and Clusters
Integration of custom scripts execution
Support optional authentication with OIDC to the EnginFrame portal
Support mounted file systems as mountpoint for user home by setting the fallback_homedir option in sssd
Support timezone configuration for CMH and cluster instances
ENHANCEMENTS
Enforce TLS requirements in CCME S3 policies
Upgrading PCluster to
3.2.0
Upgrading Slurm to
21.08.8-2
Upgrading DCV to
2022.1 (13067)
Upgrading DCVSM to
2022.1
Broker:
355-1
Agent:
592-1
Upgrading EnginFrame to
2021.0-r1646
Upgrading Nvidia Drivers to 515.48.07
Deploy DCV on compute nodes depending on the presence of dcv in partition(s) name(s)
No configuration action is required to start a first pre-configured cluster
The cluster policies are now generated by the ManagementHost
The cluster component named master has been renamed to headnode
Possibility to specify the authorized CIDR for the frontend ALB
Automated creation of private S3 bucket to use as the AWS ParallelCluster CustomS3Bucket configuration
Management Host public IP configuration can be set to NONE
DEPRECATED
Removing CCME Command Line Interface (CCME-CLI) support
Removing Ganglia support
Release 3.0.0 - March 23, 2021
NEW
Adding a common secured balanced https entry point
EnginFrame portal
Ganglia
DCV sessions
Adding dedicated stack to deploy Management Host
Adding user centralized authentication through directory services (AD)
Secure access to cluster through selected groups in the AD
Secure access to Management Host through selected groups in the AD
Adding a CCME Command Line Interface for Management Host
Start, Stop, Update, Delete a cluster
Possibility to set a time-to-live to a cluster
Updating HeadNode so that it uses DCVSessionManager as its session viewer
Adding documentation
ENHANCEMENTS
Upgrading to AWS ParallelCluster
2.10.1
Updating Slurm to
20.02.4
Upgrading DCV to
2020.2 (9508)
Upgrading EnginFrame to
2020.0-r58
Adding option to specify on which partition(s) DCV should be deployed
DEPRECATED
Removing BeeGFS support
BUG FIXES
Fixing S3 Bucket policies