
Important: Changes to shared and non-shared partitions
Summary of change
On January 22, after the clusters are brought online following the power outage (see the January newsletter), we will update all clusters to enable node sharing by default and disable the partitions ending in "-shared." This consolidation will lower system complexity by greatly reducing the number of partitions and making all clusters consistent with the behavior of Granite. It may require updates to your job submission parameters, particularly if you currently use non-shared partitions (e.g., "notchpeak" rather than "notchpeak-shared") without explicit CPU core and memory requests.
If you need to use a full node, specify "--mem=0 --exclusive" in your job submission parameters (e.g., "#SBATCH --mem=0" and "#SBATCH --exclusive" in Slurm scripts).
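As an illustration, a minimal batch script requesting an entire node under the new default might look like the following sketch; the partition, job name, time limit, and application name are placeholders to adapt to your own job:

```shell
#!/bin/bash
#SBATCH --partition=notchpeak   # consolidated partition name (no "-shared" suffix)
#SBATCH --exclusive             # reserve the whole node for this job
#SBATCH --mem=0                 # request all of the node's memory
#SBATCH --time=01:00:00         # placeholder time limit
#SBATCH --job-name=fullnode     # placeholder job name

./my_program                    # placeholder application
```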
Details of change
For historical reasons, dating to an age when there were relatively few CPU cores per node, older General Environment (GE) clusters (Lonepeak, Kingspeak, and Notchpeak) and the Protected Environment (PE) cluster (Redwood) have two sets of partitions. The default partition—e.g., "notchpeak" or "notchpeak-guest"—allocates the whole node to the job, regardless of the quantity of resources requested, while the shared partition—e.g., "notchpeak-shared" or "notchpeak-shared-guest"—allocates only the resources requested, allowing node sharing among jobs on a single node.
This setup is unnecessarily complex and inefficient in the current age of many-core CPUs, which is why we departed from this model for the Granite cluster, where node sharing is the default behavior. We are planning to change the default behavior of all clusters to enable node sharing by default and consolidate partitions. This will bring several advantages, including a shorter and simpler cluster partition list and a reduction in resource underutilization (non-shared jobs often do not use all resources on a node).
On January 22, after the clusters are brought online following the power outage (see the January newsletter), we will restart Slurm and change the default partition from non-shared to shared. Any jobs submitted after this change will need to use the default partition (e.g., "notchpeak" or "notchpeak-guest"). The previous shared partitions (e.g., "notchpeak-shared" or "notchpeak-shared-guest") will be drained; they will allow existing jobs to run and finish but will not accept new jobs. This change will not require downtime; there will be only a short delay in job submission while Slurm is restarted.
Following this change, users of the existing shared partitions will need to modify their scripts, including workflows and automated job submission/monitoring scripts, to remove "-shared" from the partition name (e.g., replace "notchpeak-shared" with "notchpeak").
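A quick, hedged sketch of how the rename could be applied to an existing script; "myjob.slurm" is a placeholder filename (the first line only creates a sample file for illustration), and you should substitute your own script and cluster name:

```shell
# Create a sample Slurm script line for illustration only.
printf '#SBATCH --partition=notchpeak-shared\n' > myjob.slurm

# Strip the "-shared" suffix in place; this also turns
# "notchpeak-shared-guest" into "notchpeak-guest", as intended.
sed -i 's/notchpeak-shared/notchpeak/g' myjob.slurm

grep -- '--partition=' myjob.slurm   # now shows --partition=notchpeak
```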
Users who run jobs in the existing non-shared partitions will need to ensure that their resource requests accurately reflect their jobs' requirements. At present, jobs submitted to non-shared partitions are allocated whole nodes, so programs may use more resources than the Slurm script requests (e.g., if a script requests only one core but specifies a non-shared partition, a node with many more cores is allocated). Under the new model this will no longer be the case: resource requests (e.g., CPU core counts and memory) must be accurate.

We encourage users to analyze their jobs' actual resource usage via the Recent Slurm jobs page of the CHPC Portal and to request accurate resources with "--ntasks" for the number of CPU cores and "--mem" for the quantity of memory when submitting jobs to the scheduler, Slurm. Requesting only the cores and memory your job needs improves the utilization of our resources. To request an entire node, use "--exclusive --mem=0", which allocates all resources on the node, equivalent to the current non-shared setup. If you have any questions or concerns about how this change will affect your work, please contact us at [email protected].
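For example, a job that needs four CPU cores and 16 GB of memory could request exactly that; the partition, time limit, and application name below are illustrative placeholders:

```shell
#!/bin/bash
#SBATCH --partition=notchpeak   # consolidated partition name (no "-shared" suffix)
#SBATCH --ntasks=4              # request exactly 4 CPU cores
#SBATCH --mem=16G               # request exactly 16 GB of memory
#SBATCH --time=02:00:00         # placeholder time limit

srun ./my_program               # placeholder application
```

With node sharing enabled, only these requested resources are allocated to the job; the remainder of the node stays available to other jobs.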
Examples
Prior to January 22, 2026
| Cluster | Example of exclusive partition | Example of shared partition |
|---|---|---|
| granite | N/A | granite |
| notchpeak | notchpeak | notchpeak-shared |
| kingspeak | kingspeak | kingspeak-shared |
| lonepeak | lonepeak | lonepeak-shared |
| redwood | redwood | redwood-shared |
After January 22, 2026
| Cluster | Example of exclusive partition | Example of shared partition |
|---|---|---|
| granite | N/A | granite |
| notchpeak | N/A | notchpeak |
| kingspeak | N/A | kingspeak |
| lonepeak | N/A | lonepeak |
| redwood | N/A | redwood |
It is still possible to request all of the resources on a node when node sharing is enabled (with "#SBATCH --mem=0" and "#SBATCH --exclusive").