Releases: oracle-quickstart/oci-hpc
v2.10.3
What's Changed
- Support for OL8 on bastion
- Support for compute Clusters
- Add GPU monitoring
- Support for hyperthreading on nodes with 256 or more threads in Slurm
- Add IB Write tests
- Mount multiple disks as one (with or without redundancy)
- Bug Fixes and improvements
v2.10.2.1
What's Changed
- Ubuntu support for PAM
- Update oci-cn-auth when the image ships an outdated version
- Update some default variables.
v2.10.2
What's Changed
- Updated to Slurm 23.02 (which removes the need for node ordering in large GPU clusters)
- Updated marketplace images for OL7, OL8, and GPU with version 2.1.4 of the OCI authentication packages (needed for better performance on GPU clusters)
- Fixed LDAP on Ubuntu
- Added the option to mount all NVMe drives as separate namespaces or as one logical volume (with or without redundancy)
- Added Hyperthreading for Ubuntu BMs
- Support for PMIx in Slurm
- Fix a Slurm bug due to long Rack IDs
- Other Small bug fixes
v2.10.1.1
What's Changed
- Fix a bug affecting Flex shapes on the bastion and login node
v2.10.1
What's Changed
- Slurm User limits and PAM
- Updated marketplace images for OL7, OL8 and GPU with latest drivers.
- Support for the upcoming E5, A10 VMs and Dense.E4.Flex
- Add the ability to run a login node separate from the bastion.
- OCI provider version to 4.112.0
- Other Small bug fixes
v2.10.0
- Add support for private endpoints for deployment without public IP
- Support for Ubuntu (Bastion and compute)
- New marketplace images
- Fix hyperthreading flag not honored on the initial cluster.
- Slurm v22.04
- Better support for unreachable nodes
- OCI provider version to 4.99.0
v2.9.2
- Better support for resizing of clusters. Change hostnames in Slurm
- Small jobs can run on a large cluster and resize the cluster down
- Generate a privilege group
- Pyxis and Enroot containers
- Allow the home directory on FSS
- RDMA latency check during autoscaling
- OCI provider version to 4.79.0
v2.8.0.5
- Pin ansible.netcommon to 2.5.1 due to a bug introduced in ansible.utils
v2.8.0.2
What's Changed
- Fix a cloud agent issue in autoscaling by @arnaudfroidmont
v2.8.0.1
Changes since 2.8.0
- Fixed the default for the API key variable to be an empty string.
Changes since 2.7.2
- Major update to the autoscaling system. Please refer to the Autoscaling section of README.md.
- Updated the schema to avoid missing required fields.
- The API key can now be uploaded instead of pasted.
- The RDMA subnet address can be changed from the stack menu (make sure it is the same size as the private/cluster subnet).
- Slurm monitoring can be used with a local database on the bastion or with the MySQL service.
- The bastion now uses HPC images by default.
- Hyperthreading settings should now persist across reboots.
- Fixed an issue with block volume attachment when the initial cluster size is 0.
- Corrected freeform tags.
- Autoscaling now tags resources.
- ARM shapes are now allowed (not supported by the HPC image; please use a platform image).
- Updated the image list.
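
The RDMA subnet note above asks that the RDMA subnet be the same size as the private/cluster subnet. A minimal sketch of how that could be checked before editing the stack, using Python's standard `ipaddress` module. The function name `same_size` and the CIDR values are hypothetical illustrations, not defaults or code taken from the stack:

```python
import ipaddress


def same_size(rdma_cidr: str, private_cidr: str) -> bool:
    """Return True when both CIDR blocks contain the same number of addresses."""
    # ip_network raises ValueError if the string is not a valid network
    # (for example, when host bits are set).
    rdma = ipaddress.ip_network(rdma_cidr)
    private = ipaddress.ip_network(private_cidr)
    return rdma.num_addresses == private.num_addresses


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    print(same_size("192.168.0.0/16", "172.16.0.0/16"))
    print(same_size("192.168.0.0/16", "172.16.0.0/21"))
```

This only compares sizes; it does not check for overlap with other subnets in the VCN.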