| Summary: | Follow up to GPU cgroup discussion at SC | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Steve Ford <fordste5> |
| Component: | slurmd | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 18.08.1 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=6253 | ||
| Site: | MSU | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
Yes, this is expected. At one point in time, the way constraining devices in cgroups works changed. You'll find some helpful background here:

https://bugs.schedmd.com/show_bug.cgi?id=5361
https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt

The second paragraph:

"The root device cgroup starts with rwm to 'all'. A child device cgroup gets a copy of the parent. Administrators can then remove devices from the whitelist or add new entries. A child cgroup can never receive a device access which is denied by its parent."

The last paragraph:

"device cgroups is implemented internally using a behavior (ALLOW, DENY) and a list of exceptions. The internal state is controlled using the same user interface to preserve compatibility with the previous whitelist-only implementation. Removal or addition of exceptions that will reduce the access to devices will be propagated down the hierarchy. For every propagated exception, the effective rules will be re-evaluated based on current parent's access rules."

It sounds like there was a previous behavior ("previous whitelist-only implementation"), but with the new behavior the cgroup_allowed_devices file doesn't do anything anymore.

We haven't yet included the contribution in bug 5361. There is additional work that needs to be done to clean up the task/cgroup plugin. Currently the plugin whitelists everything in the cgroup_allowed_devices file (which isn't needed), then whitelists any GRES the job has in its allocation (also not needed), and finally blacklists every GRES the job does not have in its allocation (this is how devices are actually constrained).

Does that answer your question? Closing as info given.
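The behavior-plus-exceptions model from devices.txt also explains the devices.list output in this ticket: when the default behavior is ALLOW, the kernel prints only the blanket "a *:* rwm" entry when devices.list is read, while the deny exceptions that actually constrain access are kept as internal state. A minimal sketch of that model and of the three writes described above (class and method names are my own, not kernel or Slurm identifiers; the device numbers, major 195 for NVIDIA GPUs, are illustrative):

```python
# Minimal model of the cgroup-v1 device controller's internal state
# (a default behavior plus an exception list), per
# Documentation/cgroup-v1/devices.txt. Illustrative only.

class DeviceCgroup:
    def __init__(self):
        self.behavior = "allow"   # root starts with rwm to 'all'
        self.exceptions = set()   # entries like "c 195:1 rwm"

    def write_allow(self, entry):
        # Under default-allow, an allow write only removes a matching
        # deny exception -- allowing "a *:* rwm" is effectively a no-op.
        self.exceptions.discard(entry)

    def write_deny(self, entry):
        # Under default-allow, a deny write adds an exception.
        self.exceptions.add(entry)

    def may_access(self, entry):
        # Access is granted unless a deny exception matches.
        return entry not in self.exceptions

    def read_devices_list(self):
        # Under default-allow, only the blanket entry is shown; the deny
        # exceptions are internal state and are not printed. This is why
        # the file can read "a *:* rwm" while access is still constrained.
        return "a *:* rwm"

# The sequence described above, for a step allocated GPU 0 (c 195:0)
# on a node that also has GPU 1 (c 195:1):
step = DeviceCgroup()
step.write_allow("a *:* rwm")     # 1) whitelist everything (not needed)
step.write_allow("c 195:0 rwm")   # 2) whitelist the allocated GRES (not needed)
step.write_deny("c 195:1 rwm")    # 3) blacklist the unallocated GRES

print(step.read_devices_list())         # a *:* rwm  <- looks unconstrained
print(step.may_access("c 195:0 rwm"))   # True
print(step.may_access("c 195:1 rwm"))   # False
```

Only step 3 does any real work: the deny exception is what blocks the unallocated GPU, and nothing about it ever appears in devices.list.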
Hello,

This is a follow-up to a discussion I had at SC with Tim about constraining GPU devices using cgroups. I was mistaken about this behavior not being supported; I was able to constrain GPUs after setting "ConstrainDevices=yes" in my cgroup.conf file.

There is, however, some odd behavior surrounding the task cgroups that led me to think GPUs were not being constrained. When I request a GPU and examine the cgroup for the running task, the devices.list file shows the cgroup has access to all devices:

$ cat /sys/fs/cgroup/devices/slurm/uid_885046/job_2065332/step_5/devices.list
a *:* rwm

The devices.list file is the same for all cgroups in the devices hierarchy:

$ find /sys/fs/cgroup -name devices.list -exec cat {} \; | uniq -c
     93 a *:* rwm

I know this isn't a SLURM issue, but I'm curious whether this is expected given the way SLURM constrains GPU devices. Any insight you have is greatly appreciated.

Thanks,
Steve