Kubernetes Guide
Overview
Kubernetes is being adopted in HPC clusters to orchestrate deployments (e.g. software, infrastructure) and to run certain workloads (e.g. AI/ML inference). There is ongoing interest in integrating Kubernetes and Slurm to achieve a unified cluster, better resource utilization, and workflows that draw on the strengths of each system.
Both the way Slurm and Kubernetes handle particular types of workloads and the way they interact with each other may change over time, opening up new possibilities. This is still an evolving area.
Presentations
Note that older presentations may contain outdated information.
Presentations from 2023
- Slurm and/or/vs Kubernetes, Tim Wickberg, SchedMD (SC23, November 2023)
- Never use Slurm HA again: Solve all your problems with Kubernetes, Chris Samuel and Doug Jacobsen, NERSC (SLUG23, September 2023)
Last modified 14 February 2024