Release Notes
The following are the contents of the RELEASE_NOTES file as distributed with the Slurm source code for this release. Please also refer to the NEWS file included alongside the source for more detailed descriptions of the associated changes, and for bugs fixed within each maintenance release.
RELEASE NOTES FOR SLURM VERSION 24.05

IMPORTANT NOTES:
If using the slurmdbd (Slurm DataBase Daemon) you must update this first.

NOTE: If using a backup DBD you must start the primary first to do any
database conversion; the backup will not start until this has happened.

The 24.05 slurmdbd will work with Slurm daemons of version 23.02 and above.
You will not need to update all clusters at the same time, but it is very
important to update slurmdbd first and have it running before updating
any other clusters making use of it.

Slurm can be upgraded from version 23.02 or 23.11 to version 24.05 without loss
of jobs or other state information. Upgrading directly from an earlier version
of Slurm will result in loss of state information.

All SPANK plugins must be recompiled when upgrading from any Slurm version
prior to 24.05.

HIGHLIGHTS
==========
 -- Remove support for Cray XC ("cray_aries") systems.
 -- Federation - allow client command operation when slurmdbd is unavailable.
 -- burst_buffer/lua - Added two new hooks: slurm_bb_test_data_in and
    slurm_bb_test_data_out. The syntax and use of the new hooks are documented
    in etc/burst_buffer.lua.example. These are required to exist. slurmctld now
    checks on startup if the burst_buffer.lua script loads and contains all
    required hooks; slurmctld will exit with a fatal error if this is not
    successful. Added PollInterval to burst_buffer.conf. Removed the arbitrary
    limit of 512 copies of the script running simultaneously.
 -- Add QOS limit MaxTRESRunMinsPerAccount.
 -- Add QOS limit MaxTRESRunMinsPerUser.
 -- Add ELIGIBLE environment variable to jobcomp/script plugin.
 -- Always use the QOS name for SLURM_JOB_QOS environment variables.
    Previously the batch environment would use the description field,
    which was usually equivalent to the name.
 -- cgroup/v2 - Require dbus-1 version >= 1.11.16.
 -- Allow NodeSet names to be used in SuspendExcNodes.
 -- SuspendExcNodes=<nodes>:N now counts allocated nodes in N. The first N
    powered up nodes in <nodes> are protected from being suspended.
 -- Store job output, input and error paths in SlurmDBD.
 -- Add USER_DELETE reservation flag to allow users with access to a reservation
    to delete it.
 -- Add SlurmctldParameters=enable_stepmgr to enable step management through
    the slurmstepd instead of the controller.
 -- Added PrologFlags=RunInJob to make prolog and epilog run inside the job
    extern step to include it in the job's cgroup.
 -- Add ability to reserve MPI ports at the job level for stepmgr jobs and
    subdivide them at the step level.
 -- slurmrestd - Add --generate-openapi-spec argument.

CONFIGURATION FILE CHANGES (see appropriate man page for details)
=====================================================================
 -- CoreSpecPlugin has been removed.
 -- Removed TopologyPlugin tree and dragonfly support from select/linear.
    If those topology plugins are desired please switch to select/cons_tres.
 -- Changed the default value for UnkillableStepTimeout to 60 seconds or five
    times the value of MessageTimeout, whichever is greater.
 -- An error log has been added if JobAcctGatherParams 'UsePss' or 'NoShare' are
    configured with a plugin other than jobacct_gather/linux. In such a case
    these parameters are ignored.
 -- helpers.conf - Added Flags=rebootless parameter allowing feature changes
    without rebooting compute nodes.
 -- topology/block - Replaced the BlockLevels with BlockSizes in topology.conf.
 -- Add contain_spank option to SlurmdParameters. When set, spank_user_init(),
    spank_task_post_fork(), and spank_task_exit() will execute within the
    job_container/tmpfs plugin namespace.
 -- Add SlurmctldParameters=max_powered_nodes=N, which prevents powering up
    nodes after the max is reached.
 -- Add ExclusiveTopo to a partition definition in slurm.conf.
 -- Add AccountingStorageParameters=max_step_records to limit how many steps
    are recorded in the database for each job -- excluding batch, extern, and
    interactive steps.
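
The new UnkillableStepTimeout default noted in the configuration changes is
simple arithmetic, and a worked example may help when reviewing an existing
slurm.conf. The sketch below is illustrative only: the function name is
hypothetical and not part of Slurm, and the MessageTimeout values passed in are
assumed sample settings.

```python
def default_unkillable_step_timeout(message_timeout: int) -> int:
    """Return the 24.05 default UnkillableStepTimeout, in seconds.

    Hypothetical helper (not part of Slurm): the default is 60 seconds or
    five times MessageTimeout, whichever is greater.
    """
    return max(60, 5 * message_timeout)


if __name__ == "__main__":
    # Assumed sample MessageTimeout settings, in seconds.
    for message_timeout in (10, 12, 30):
        timeout = default_unkillable_step_timeout(message_timeout)
        print(f"MessageTimeout={message_timeout} -> UnkillableStepTimeout={timeout}")
```

With a MessageTimeout of 10 or 12 seconds the 60-second floor applies; with
30 seconds the default becomes 150 seconds. An explicitly configured
UnkillableStepTimeout is not covered by this sketch.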

COMMAND CHANGES (see man pages for details)
===========================================
 -- Add support for "elevenses" as an additional time specification.
 -- Add support for sbcast --preserve when job_container/tmpfs is configured
    (previously documented as unsupported).
 -- scontrol - Add new subcommand 'power' for node power control.
 -- squeue - Adjust StdErr, StdOut, and StdIn output formats. These will now
    consistently print "(null)" if a value is unavailable. StdErr will no
    longer display StdOut if it is not distinctly set. StdOut will now
    correctly display the default filename pattern for job arrays, and no
    longer show it for non-batch jobs. However, the expansion patterns will
    no longer be substituted by default.
 -- Add --segment to job allocation to be used in topology/block.
 -- Add --exclusive=topo for use with topology/block.
 -- squeue - Add --expand-patterns option to expand StdErr, StdOut, StdIn
    filename patterns as best as possible.
 -- sacct - Add --expand-patterns option to expand StdErr, StdOut, StdIn
    filename patterns as best as possible.
 -- sreport - Requesting format=Planned will now return the expected Planned
    time as documented, instead of PlannedDown. To request Planned Down, one
    must now use format=PLNDDown or format=PlannedDown explicitly. The
    abbreviations "Pl" or "Pla" will now make reference to Planned instead of
    PlannedDown.

API CHANGES
===========
 -- Removed ListIterator type from .
 -- Removed slurm_xlate_job_id() from

SLURMRESTD CHANGES
==================
 -- openapi/dbv0.0.38 and openapi/v0.0.38 plugins have been removed.
 -- openapi/dbv0.0.39 and openapi/v0.0.39 plugins have been tagged as
    deprecated to warn of their removal in the next release.
 -- Changed slurmrestd.service to only listen on TCP socket by default.
    Environments with existing drop-in units for the service may need
    further adjustments to work after upgrading.
 -- slurmrestd - Tagged `script` field as deprecated in
    'POST /slurm/v0.0.41/job/submit' in anticipation of removal in future
    OpenAPI plugin versions. Job submissions should set the `job.script` (or
    `jobs[0].script` for HetJobs) fields instead.
 -- slurmrestd - Attempt to automatically convert enumerated string arrays with
    incoming non-string values into strings. Add warning when incoming value
    for enumerated string arrays cannot be converted to string and silently
    ignore instead of rejecting entire request. This change affects any
    endpoint that uses an enumerated string as given in the OpenAPI
    specification. An example of this conversion would be to
    'POST /slurm/v0.0.41/job/submit' with '.job.exclusive = true'. While the
    JSON (boolean) true value matches a possible enumeration, it is not the
    expected "true" string. This change automatically converts the (boolean)
    true to (string) "true", avoiding a parsing failure.
 -- slurmrestd - Add 'POST /slurm/v0.0.41/job/allocate' endpoint. This endpoint
    will create a new job allocation without any steps. The allocation will
    need to be ended via signaling the job or it will run to the timelimit.
 -- slurmrestd - Allow startup when slurmdbd is not configured and avoid
    loading slurmdbd specific plugins.

MPI/PMI2 CHANGES
================
 -- Jobs submitted with the SLURM_HOSTFILE environment variable set imply
    using an arbitrary distribution. Nevertheless, the logic used in PMI2 when
    generating their associated PMI_process_mapping values has been changed and
    will now be the same as that used for the plane distribution, as if
    "-m plane" were used. This has been changed because the original arbitrary
    distribution implementation did not account for multiple instances of the
    same host being present in SLURM_HOSTFILE, providing an incorrect process
    mapping in such a case. This change also enables distributing tasks in
    blocks when using arbitrary distribution, which was not the case before.
    This only affects the mpi/pmi2 plugin.
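
For clients affected by the `script` deprecation listed under SLURMRESTD
CHANGES, the following is a minimal sketch of moving a top-level `script` value
into `job.script`, or into `jobs[0].script` for HetJobs. The function name and
the sample payloads are hypothetical; only the three field names come from the
notes above, and a real submission needs further job fields that are not shown
here.

```python
import copy
import json


def migrate_submit_payload(payload: dict) -> dict:
    """Return a copy of a job/submit request body without top-level `script`.

    Hypothetical client-side helper (not part of Slurm). The script text is
    moved to `jobs[0].script` when a non-empty `jobs` list is present
    (HetJobs), otherwise to `job.script`. A script already set at the new
    location is left untouched.
    """
    migrated = copy.deepcopy(payload)
    if "script" not in migrated:
        return migrated  # nothing to migrate

    script = migrated.pop("script")
    if migrated.get("jobs"):
        target = migrated["jobs"][0]
    else:
        target = migrated.setdefault("job", {})
    target.setdefault("script", script)
    return migrated


if __name__ == "__main__":
    # Assumed sample bodies; the empty job objects are placeholders only.
    legacy = {"script": "#!/bin/bash\nsrun hostname\n", "job": {}}
    legacy_het = {"script": "#!/bin/bash\nsrun hostname\n", "jobs": [{}, {}]}
    print(json.dumps(migrate_submit_payload(legacy), indent=2))
    print(json.dumps(migrate_submit_payload(legacy_het), indent=2))
```

The helper works on a deep copy so the caller's original request body is left
unchanged, which makes it easy to compare the two while testing a client
against a newer OpenAPI plugin version.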