Containers Guide
Containers are a new and generally exciting development in HPC workloads. Containers rely on existing kernel features to give users greater control over what applications can see and interact with at any given time. For HPC workloads, these features are usually restricted to the mount namespace. Slurm allows container developers to create SPANK plugins that can be called at various points of job execution to support containers. Slurm is generally agnostic to containers and can be made to start most, if not all, types.
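As an illustration of the kernel feature involved, the following minimal sketch reads the namespace identifiers that Linux exposes under /proc/<pid>/ns. It assumes a Linux host with /proc mounted; the helper names namespace_id and same_namespace are hypothetical and are not part of Slurm or any container runtime.

```python
import os


def namespace_id(kind="mnt", pid="self"):
    """Return the namespace link target for a process, or None if unavailable.

    On Linux the target is a string naming the namespace type and its inode.
    The function returns None on systems without /proc or for unknown kinds.
    """
    path = f"/proc/{pid}/ns/{kind}"
    try:
        return os.readlink(path)
    except OSError:
        return None


def same_namespace(pid_a, pid_b, kind="mnt"):
    """Report whether two processes share the given namespace."""
    a = namespace_id(kind, pid_a)
    b = namespace_id(kind, pid_b)
    return a is not None and a == b


if __name__ == "__main__":
    # Print the namespaces of the current process.
    for kind in ("mnt", "user", "pid", "net"):
        print(kind, namespace_id(kind))
```

Running this once on the host and once inside a container would typically show a different identifier for the mount namespace, which is the isolation most HPC container systems rely on.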
Links to several container varieties are provided below:
- Charliecloud
- Docker
- UDOCKER
- Rootless Docker
- Kubernetes Pods (k8s)
- Shifter
- Singularity
- ENROOT
- Podman
- Sarus
Container Types
Charliecloud
Charliecloud is a user namespace container system sponsored by LANL to provide HPC containers. Charliecloud supports the following:
- Directly called by users via user namespace support.
- Direct Slurm support currently in development.
- OCI Image support (via wrapper)
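Because Charliecloud is called directly by users through user namespaces, a site may want to confirm that the kernel permits them on its compute nodes. The sketch below reads the Linux sysctl file /proc/sys/user/max_user_namespaces; the function names are hypothetical, and a positive limit should be treated as necessary rather than sufficient, since distributions may apply further restrictions.

```python
from pathlib import Path

# Linux sysctl file holding the per-user limit on user namespaces.
DEFAULT_LIMIT_FILE = "/proc/sys/user/max_user_namespaces"


def max_user_namespaces(path=DEFAULT_LIMIT_FILE):
    """Return the configured limit as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def user_namespaces_available(path=DEFAULT_LIMIT_FILE):
    """Return True when the limit is readable and greater than zero."""
    limit = max_user_namespaces(path)
    return limit is not None and limit > 0


if __name__ == "__main__":
    print("limit:", max_user_namespaces())
    print("available:", user_namespaces_available())
```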
Docker (running as root)
Docker currently has multiple design points that make it unfriendly to HPC systems. The issue that usually stops most sites from using Docker is the requirement that "only trusted users should be allowed to control your Docker daemon" [Docker Security], which is not acceptable to most HPC systems.
Sites with trusted users can add them to the docker Unix group and allow them to control Docker directly from inside of jobs. There is currently no direct support for starting or stopping Docker containers in Slurm.
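For sites that take the docker group route, a job could check group membership before trying to reach the daemon. The following is a minimal sketch only: the helper names are hypothetical, the image name in the usage example is a placeholder, and the command is built but not executed.

```python
import grp
import os
import shlex


def in_group(group_name="docker"):
    """Return True if the current process belongs to the named Unix group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this node.
        return False
    return gid in os.getgroups() or gid == os.getegid()


def docker_run_argv(image, command):
    """Build a 'docker run' command line that removes the container on exit."""
    return ["docker", "run", "--rm", image, *command]


if __name__ == "__main__":
    if in_group("docker"):
        # Placeholder image and command, shown for illustration only.
        print(shlex.join(docker_run_argv("alpine:latest", ["echo", "hello"])))
    else:
        print("current user is not in the docker group")
```

Since Slurm does not start or stop the containers itself, a job script using this approach remains responsible for cleaning up any containers it launches.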
UDOCKER
UDOCKER is a clone of a subset of Docker features, designed to allow execution of Docker commands without increased user privileges.
Rootless Docker
Rootless Docker (>=v20.10) requires no extra permissions for users and currently (as of January 2021) has no known security issues with users gaining privileges. Each user will need to run an instance of the dockerd server on each node of the job in order to use Docker. There are currently no helper scripts or plugins for Slurm to automate the setup or teardown of the Docker daemons.
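Since no helper exists, the per-node daemon startup has to be scripted by the user. The sketch below only composes the commands such a helper might run and does not execute them. It assumes that the dockerd-rootless.sh launcher shipped with rootless Docker is on the PATH of every node and that the daemon socket lives at $XDG_RUNTIME_DIR/docker.sock; the function names are hypothetical, and the approach is untested against a real cluster.

```python
import os
import shlex


def rootless_daemon_step(num_nodes):
    """Build an srun command that starts one rootless dockerd per node.

    Assumption: running the daemon as a job step is acceptable, so that
    cancelling the step (or ending the job) tears the daemons down.
    """
    return [
        "srun",
        f"--nodes={num_nodes}",
        "--ntasks-per-node=1",
        "dockerd-rootless.sh",
    ]


def rootless_docker_host(runtime_dir):
    """Return the DOCKER_HOST value pointing at the per-user daemon socket."""
    return f"unix://{runtime_dir}/docker.sock"


if __name__ == "__main__":
    # SLURM_JOB_NUM_NODES is set by Slurm inside a job; default to 1 elsewhere.
    nodes = int(os.environ.get("SLURM_JOB_NUM_NODES", "1"))
    runtime_dir = os.environ.get("XDG_RUNTIME_DIR", f"/run/user/{os.getuid()}")
    print(shlex.join(rootless_daemon_step(nodes)))
    print("DOCKER_HOST=" + rootless_docker_host(runtime_dir))
```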
Kubernetes Pods (k8s)
Kubernetes is a container orchestration system that uses Pods, which are generally a logical grouping of containers for a single purpose.
There is currently no support for Kubernetes Pods in Slurm. Users wishing to run OCI images contained in Pods via Slurm might consider one of the following instead:
Kubernetes requires root privileges but users could consider using rootless Kubernetes inside of jobs:
Shifter
Shifter is a container project out of NERSC to provide HPC containers with full scheduler integration.
- Shifter provides full instructions to integrate with Slurm.
- Presentations about Shifter and Slurm:
Singularity
Singularity is a hybrid container system that supports:
- Slurm integration (for singularity v2.x) via Plugin. A full description of the plugin was provided in the SLUG17 Singularity Presentation.
- User namespace containers via sandbox mode that require no additional permissions.
- Users directly calling singularity via setuid executable outside of Slurm.
ENROOT
Enroot is a user namespace container system sponsored by NVIDIA that supports:
- Slurm integration via pyxis
- Native support for NVIDIA GPUs
- Faster Docker image imports
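With pyxis installed, containers are requested through additional srun options. The sketch below composes such a command using the --container-image and --container-mounts options that pyxis adds to srun; the helper name, image, and paths are placeholders for illustration, and the command is built rather than run.

```python
import shlex


def pyxis_srun_argv(image, command, mounts=None):
    """Build an srun command line that asks pyxis to run inside an image.

    mounts is an optional list of (source, destination) path pairs.
    """
    argv = ["srun", f"--container-image={image}"]
    if mounts:
        pairs = ",".join(f"{src}:{dst}" for src, dst in mounts)
        argv.append(f"--container-mounts={pairs}")
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # Placeholder image, mount, and command.
    argv = pyxis_srun_argv(
        "ubuntu:22.04",
        ["cat", "/etc/os-release"],
        mounts=[("/scratch", "/scratch")],
    )
    print(shlex.join(argv))
```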
Podman
Podman is a user namespace container system sponsored by Red Hat/IBM that supports:
- Drop-in replacement for Docker.
- Called directly by users (currently lacks direct Slurm support).
- Rootless image building via buildah
- Native OCI Image support
Sarus
Sarus is a privileged container system sponsored by ETH Zurich CSCS that supports:
- Slurm image synchronization via OCI hook
- Native OCI Image support
- NVIDIA GPU Support
- Similar design to Shifter
Last modified 21 January 2021