| Summary: | trouble configuring gpus | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Todd Merritt <tmerritt> |
| Component: | GPU | Assignee: | Director of Support <support> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 19.05.6 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | U of AZ | Slinky Site: | --- |
Description
Todd Merritt
2020-07-22 08:14:51 MDT
I started slurmd with -Dvvvv and it does seem to see the GPUs, so perhaps I'm just using srun incorrectly:

slurmd: debug3: _merge_gres2: From gres.conf, using gpu:volta:1:/dev/nvidia0
slurmd: debug3: _merge_gres2: From gres.conf, using gpu:volta:1:/dev/nvidia1
slurmd: debug3: _merge_gres2: From gres.conf, using gpu:volta:1:/dev/nvidia2
slurmd: debug3: _merge_gres2: From gres.conf, using gpu:volta:1:/dev/nvidia3
slurmd: debug3: Trying to load plugin /usr/lib64/slurm/gpu_generic.so
slurmd: debug: init: GPU Generic plugin loaded
slurmd: debug3: Success.
slurmd: debug3: gres_device_major : /dev/nvidia0 major 195, minor 0
slurmd: debug3: gres_device_major : /dev/nvidia1 major 195, minor 1
slurmd: debug3: gres_device_major : /dev/nvidia2 major 195, minor 2
slurmd: debug3: gres_device_major : /dev/nvidia3 major 195, minor 3
slurmd: Gres Name=gpu Type=volta Count=1
slurmd: Gres Name=gpu Type=volta Count=1
slurmd: Gres Name=gpu Type=volta Count=1
slurmd: Gres Name=gpu Type=volta Count=1

I saw the error of my ways. I was accidentally asking for two nodes and I only have one GPU node.
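
For anyone hitting the same symptom, the four `gpu:volta:1:/dev/nvidiaN` entries in the slurmd output are consistent with a GRES setup roughly like the one below. This is only a sketch under assumptions, not the reporter's actual files: the node name `gpu01` is hypothetical, the bracketed range in `File=` is one possible way to list the four devices, and the srun line is illustrative because the original command is not shown in this report.

```
# gres.conf on the GPU node (assumed layout; the log shows four volta devices)
Name=gpu Type=volta File=/dev/nvidia[0-3]

# slurm.conf (gpu01 is a hypothetical node name; other node parameters omitted)
GresTypes=gpu
NodeName=gpu01 Gres=gpu:volta:4

# Request one volta GPU on a single node. With only one GPU node in the
# cluster, a request that needs GPUs on two nodes (e.g. -N2) cannot be met.
srun -N1 --gres=gpu:volta:1 nvidia-smi -L
```

If a GPU job fails to start even though slurmd reports the devices, checking the requested node count against the number of nodes that actually define the GRES is a quick first step.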