AWS ParallelCluster v2.5.0
We're excited to announce the release of AWS ParallelCluster 2.5.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
Enhancements
- Add support for new OS: Ubuntu 18.04
- Add support for AWS Batch scheduler in China partition and in `eu-north-1`.
- Revamped `pcluster configure` command, which now supports automated networking configuration.
- Add support for NICE DCV on CentOS 7 to set up a graphical remote desktop session on the Master node.
- Add support for new EFA supported instances: `c5n.metal`, `m5dn.24xlarge`, `m5n.24xlarge`, `r5dn.24xlarge`, `r5n.24xlarge`
- Add support for scheduling with GPU options in Slurm. Currently supports the following GPU-related options: `-G/--gpus, --gpus-per-task, --gpus-per-node, --gres=gpu, --cpus-per-gpu`.
  Integrated GPU requirements into scaling logic: the cluster will scale automatically to satisfy GPU/CPU requirements for pending jobs. When submitting GPU jobs, CPU/node/task information is not required but preferred in order to avoid ambiguity. If only GPU requirements are specified, the cluster will scale up to the minimum number of nodes required to satisfy all GPU requirements.
- Add new cluster configuration option to automatically disable Hyperthreading (`disable_hyperthreading = true`).
- Install Intel Parallel Studio 2019.5 Runtime in CentOS 7 when `enable_intel_hpc_platform = true` and share `/opt/intel` over NFS.
- Additional EC2 IAM Policies can now be added to the role ParallelCluster automatically creates for cluster nodes by specifying `additional_iam_policies` in the cluster config.
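
To make the "minimum number of nodes" rule for GPU-only jobs concrete, here is a minimal arithmetic sketch. It is not ParallelCluster's actual scaling code: the function name `min_nodes_for_gpus` is hypothetical, and it assumes every compute node exposes the same number of GPUs and that pending GPU requests can be packed freely across nodes.

```python
def min_nodes_for_gpus(pending_gpu_requests, gpus_per_node):
    """Return the fewest nodes whose combined GPUs cover all pending requests.

    Hypothetical helper for illustration only; assumes homogeneous nodes
    and that GPU requests can be packed freely across nodes.
    """
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    total_gpus = sum(pending_gpu_requests)
    # Integer ceiling division: round up to whole nodes.
    return -(-total_gpus // gpus_per_node)


# Three pending jobs asking for 4, 2 and 3 GPUs on nodes with 4 GPUs each.
nodes_needed = min_nodes_for_gpus([4, 2, 3], gpus_per_node=4)
```

Under these assumptions, 9 requested GPUs on 4-GPU nodes need 3 nodes. Supplying CPU/node/task information with the job, as recommended above, removes the ambiguity that this simplified packing ignores.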
Changes
- Ubuntu 14.04 is no longer supported
- Upgrade Intel MPI to version U5.
- Upgrade EFA Installer to version 1.7.0. This also upgrades Open MPI to 4.0.2.
- Upgrade NVIDIA driver to Tesla version 418.87.
- Upgrade CUDA library to version 10.1.
- Upgrade Slurm to version 19.05.3-2.
- Install EFA in China AMIs.
- Increase default EBS volume size from 17GB to 25GB
- FSx Lustre now supports new storage_capacity options 1,200 and 2,400 GiB
- Enable `flock user_xattr noatime` Lustre mount options by default everywhere and `x-systemd.automount x-systemd.requires=lnet.service` for systemd based systems.
- Increase the number of hosts that can be processed by scaling daemons in a single batch from 50 to 200. This improves the scaling time especially with increased ASG launch rates.
- Change default sshd config in order to disable X11 forwarding and update the list of supported ciphers.
- Increase faulty node termination timeout from 1 minute to 5 in order to give some additional time to the scheduler to recover when under heavy load.
- Extended `pcluster createami` command to specify the VPC and network settings when building the AMI.
- Support inline comments in config file
- Support Python 3.8 in pcluster CLI.
- Deprecate Python 2.6 support
- Add `ClusterName` tag to EC2 instances.
- Search for new available version only at `pcluster create` action.
- Enable `sanity_check` by default.
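
As an illustration of inline comments in a config file, the sketch below parses a small fragment with Python's standard `configparser`. The section name `default`, the policy ARN, and the comma-separated form of `additional_iam_policies` are assumptions made for the example, and whether the pcluster CLI treats comments exactly this way internally is also an assumption.

```python
import configparser

# Hypothetical config fragment: the section name and policy ARN are illustrative.
SAMPLE_CONFIG = """
[cluster default]
base_os = centos7                    # Intel runtime install is described for CentOS 7
disable_hyperthreading = true        # option introduced in this release
enable_intel_hpc_platform = true     # also shares /opt/intel over NFS
additional_iam_policies = arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
"""

# inline_comment_prefixes makes the parser drop trailing "# ..." comments.
parser = configparser.ConfigParser(inline_comment_prefixes=("#", ";"))
parser.read_string(SAMPLE_CONFIG)

cluster = parser["cluster default"]
disable_ht = cluster.getboolean("disable_hyperthreading")
# Assumed here: multiple policies would be separated by commas.
policies = [p.strip() for p in cluster["additional_iam_policies"].split(",")]
```

With inline comment prefixes enabled, `cluster["base_os"]` comes back as `centos7` with the trailing comment removed. Without them, the comment text would become part of the value.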
Bug Fixes
- Fix sanity check for custom ec2 role. Fixes #1241.
- Fix bug when using same subnet for both master and compute.
- Fix bug so that Ganglia URLs are shown when Ganglia is enabled. Fixes #1322.
- Fix bug with `awsbatch` scheduler that prevented Multi-node jobs from running.
- Fix jobwatcher behaviour that was marking nodes locked by the nodewatcher as busy even if they had been removed already from the ASG Desired count. This was causing, in rare circumstances, cluster overscaling.
- Fix bug that was causing failures in sqswatcher when ADD and REMOVE events for the same host are fetched together.
- Fix bug that was preventing nodes from mounting partitioned EBS volumes.
- Implement paginated calls in `pcluster list`.
- Fix bug when creating `awsbatch` cluster with name longer than 31 chars
- Fix a bug that led to ssh not working after ssh'ing into a compute node by ip address.
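
For readers unfamiliar with paginated calls, the following sketch shows the general token-following pattern. It is not the code used by `pcluster list`: `collect_all_pages` and `fake_fetch` are hypothetical names, and the `Stacks` / `NextToken` keys are assumed to mirror the shape of a CloudFormation stack listing response. A fake two-page source stands in for a real AWS call so the example runs offline.

```python
def collect_all_pages(fetch_page):
    """Gather items from every page of a token-paginated API.

    fetch_page(token) must return a dict with a "Stacks" list and,
    when more pages remain, a "NextToken" string (assumed response shape).
    """
    items = []
    token = None
    while True:
        page = fetch_page(token)
        items.extend(page.get("Stacks", []))
        token = page.get("NextToken")
        if not token:
            # No token means this was the last page.
            return items


# Fake two-page response standing in for a real service call.
_FAKE_PAGES = {
    None: {"Stacks": ["parallelcluster-a", "parallelcluster-b"], "NextToken": "t1"},
    "t1": {"Stacks": ["parallelcluster-c"]},
}


def fake_fetch(token):
    return _FAKE_PAGES[token]


cluster_names = collect_all_pages(fake_fetch)
```

A caller that reads only the first page would miss `parallelcluster-c` here, which is the kind of truncated listing that pagination avoids.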
Support
Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster Issues tracker on GitHub: https://github.com/aws/aws-parallelcluster
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192