Release Notes

The following are the contents of the RELEASE_NOTES file as distributed with the Slurm source code for this release. Please refer to the NEWS include alongside the source as well for more detailed descriptions of the associated changes, and for bugs fixed within each maintenance release.

RELEASE NOTES FOR SLURM VERSION 24.05

IMPORTANT NOTES:
If using the slurmdbd (Slurm DataBase Daemon) you must update this first.

NOTE: If using a backup DBD you must start the primary first to do any
database conversion, the backup will not start until this has happened.

The 24.05 slurmdbd will work with Slurm daemons of version 23.02 and above.
You will not need to update all clusters at the same time, but it is very
important to update slurmdbd first and having it running before updating
any other clusters making use of it.

Slurm can be upgraded from version 23.02 or 23.11 to version 24.05 without loss
of jobs or other state information. Upgrading directly from an earlier version
of Slurm will result in loss of state information.

All SPANK plugins must be recompiled when upgrading from any Slurm version
prior to 24.05.

HIGHLIGHTS
==========
 -- Remove support for Cray XC ("cray_aries") systems.
 -- Federation - allow client command operation when slurmdbd is unavailable.
 -- burst_buffer/lua - Added two new hooks: slurm_bb_test_data_in and
    slurm_bb_test_data_out. The syntax and use of the new hooks are documented
    in etc/burst_buffer.lua.example. These are required to exist. slurmctld now
    checks on startup if the burst_buffer.lua script loads and contains all
    required hooks; slurmctld will exit with a fatal error if this is not
    successful. Added PollInterval to burst_buffer.conf. Removed the arbitrary
    limit of 512 copies of the script running simultaneously.
 -- Add QOS limit MaxTRESRunMinsPerAccount.
 -- Add QOS limit MaxTRESRunMinsPerUser.
 -- Add ELIGIBLE environment variable to jobcomp/script plugin.
 -- Always use the QOS name for SLURM_JOB_QOS environment variables.
    Previously the batch environment would use the description field,
    which was usually equivalent to the name.
 -- cgroup/v2 - Require dbus-1 version >= 1.11.16.
 -- Allow NodeSet names to be used in SuspendExcNodes.
 -- SuspendExcNodes=:N now counts allocated nodes in N. The first N
    powered up nodes in  are protected from being suspended.
 -- Store job output, input and error paths in SlurmDBD.
 -- Add USER_DELETE reservation flag to allow users with access to a reservation
    to delete it.
 -- Add SlurmctldParameters=enable_stepmgr to enable step management through
    the slurmstepd instead of the controller.
 -- Added PrologFlags=RunInJob to make prolog and epilog run inside the job
    extern step to include it in the job's cgroup.
 -- Add ability to reserve MPI ports at the job level for stepmgr jobs and
    subdivide them at the step level.
 -- slurmrestd - Add --generate-openapi-spec argument.

CONFIGURATION FILE CHANGES (see appropriate man page for details)
=====================================================================
 -- CoreSpecPlugin has been removed.
 -- Removed TopologyPlugin tree and dragonfly support from select/linear.
    If those topology plugins are desired please switch to select/cons_tres.
 -- Changed the default value for UnkillableStepTimeout to 60 seconds or five
    times the value of MessageTimeout, whichever is greater.
 -- An error log has been added if JobAcctGatherParams 'UsePss' or 'NoShare' are
    configured with a plugin other than jobacct_gather/linux. In such case these
    parameters are ignored.
 -- helpers.conf - Added Flags=rebootless parameter allowing feature changes
    without rebooting compute nodes.
 -- topology/block - Replaced the BlockLevels with BlockSizes in topology.conf.
 -- Add contain_spank option to SlurmdParameters. When set, spank_user_init(),
    spank_task_post_fork(), and spank_task_exit() will execute within the
    job_container/tmpfs plugin namespace.
 -- Add SlurmctldParameters=max_powered_nodes=N, which prevents powering up
    nodes after the max is reached.
 -- Add ExclusiveTopo to a partition definition in slurm.conf.
 -- Add AccountingStorageParameters=max_step_records to limit how many steps
    are recorded in the database for each job -- excluding batch, extern, and
    interactive steps.

COMMAND CHANGES (see man pages for details)
===========================================
 -- Add support for "elevenses" as an additional time specification.
 -- Add support for sbcast --preserve when job_container/tmpfs configured
    (previously documented as unsupported).
 -- scontrol - Add new subcommand 'power' for node power control.
 -- squeue - Adjust StdErr, StdOut, and StdIn output formats. These will now
    consistently print "(null)" if a value is unavailable. StdErr will no
    longer display StdOut if it is not distinctly set. StdOut will now
    correctly display the default filename pattern for job arrays, and no
    longer show it for non-batch jobs. However, the expansion patterns will
    no longer be substituted by default.
 -- Add --segment to job allocation to be used in topology/block.
 -- Add --exclusive=topo for use with topology/block.
 -- squeue - Add --expand-patterns option to expand StdErr, StdOut, StdIn
    filename patterns as best as possible.
 -- sacct - Add --expand-patterns option to expand StdErr, StdOut, StdIn
    filename patterns as best as possible.
 -- sreport - Requesting format=Planned will now return the expected Planned
    time as documented, instead of PlannedDown. To request Planned Down,
    one must use now format=PLNDDown or format=PlannedDown explicitly. The
    abbreviations "Pl" or "Pla" will now make reference to Planned instead of
    PlannedDown.

API CHANGES
===========
 -- Removed ListIterator type from .
 -- Removed slurm_xlate_job_id() from 

SLURMRESTD CHANGES
==================
 -- openapi/dbv0.0.38 and openapi/v0.0.38 plugins have been removed.
 -- openapi/dbv0.0.39 and openapi/v0.0.39 plugins have been tagged as
    deprecated to warn of their removal in the next release.
 -- Changed slurmrestd.service to only listen on TCP socket by default.
    Environments with existing drop-in units for the service may need
    further adjustments to work after upgrading.
 -- slurmrestd - Tagged `script` field as deprecated in
    'POST /slurm/v0.0.41/job/submit' in anticipation of removal in future
    OpenAPI plugin versions. Job submissions should set the `job.script` (or
    `jobs[0].script` for HetJobs) fields instead.
 -- slurmrestd - Attempt to automatically convert enumerated string arrays with
    incoming non-string values into strings. Add warning when incoming value for
    enumerated string arrays can not be converted to string and silently ignore
    instead of rejecting entire request. This change affects any endpoint that
    uses an enunmerated string as given in the OpenAPI specification. An
    example of this conversion would be to 'POST /slurm/v0.0.41/job/submit' with
    '.job.exclusive = true'. While the JSON (boolean) true value matches a
    possible enumeration, it is not the expected "true" string. This change
    automatically converts the (boolean) true to (string) "true" avoiding a
    parsing failure.
 -- slurmrestd - Add 'POST /slurm/v0.0.41/job/allocate' endpoint. This endpoint
    will create a new job allocation without any steps. The allocation will need
    to be ended via signaling the job or it will run to the timelimit.
 -- slurmrestd - Allow startup when slurmdbd is not configured and avoid loading
    slurmdbd specific plugins.

MPI/PMI2 CHANGES
================
 -- Jobs submitted with the SLURM_HOSTFILE environment variable set implies
    using an arbitrary distribution. Nevertheless, the logic used in PMI2 when
    generating their associated PMI_process_mapping values has been changed and
    will now be the same used for the plane distribution, as if "-m plane" were
    used. This has been changed because the original arbitrary distribution
    implementation did not account for multiple instances of the same host being
    present in SLURM_HOSTFILE, providing an incorrect process mapping in such
    case. This change also enables distributing tasks in blocks when using
    arbitrary distribution, which was not the case before. This only affects
    mpi/pmi2 plugin.