| Summary: | Enhance support for AMD GPUs and APIs | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Tim Wickberg <tim> |
| Component: | GPU | Assignee: | Tim Wickberg <tim> |
| Status: | RESOLVED DUPLICATE | QA Contact: | |
| Severity: | 5 - Enhancement | ||
| Priority: | --- | CC: | bertsch2, day36, ezellma |
| Version: | 20.02.x | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=7714 | ||
| Site: | CRAY | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | Nazare |
| Coreweave sites: | --- | Cray Sites: | Cray Internal |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | 20.11 | |
| DevPrio: | 1 - Paid | Emory-Cloud Sites: | --- |
Description
Tim Wickberg
2019-08-12 22:17:45 MDT
Just tidying up. I'm marking this as complete - the gpu/rsmi plugin has been available since the 20.02 release last year and is working as intended.

*** This ticket has been marked as a duplicate of ticket 7714 ***

Opening this ticket up publicly, and adding a couple of documentation links:

AMD's ROCm SMI library is what the Slurm gpu/rsmi plugin depends on for device info: https://github.com/RadeonOpenCompute/rocm_smi_lib

The rsmi.h header itself is the best description of the API they've defined: https://github.com/RadeonOpenCompute/rocm_smi_lib/blob/master/include/rocm_smi/rocm_smi.h