Created attachment 14774 [details]
sacct -P -j 11021559

With the recent enhancements to GPU requests for sbatch / srun, it's no longer possible to determine from the accounting records how many GPUs were allocated on each node. For example, the command:

$ srun -G 100 -t 1 --pty bash -l

will result in 100 GPUs being allocated, each with one CPU core. In one of our tests the GPUs were allocated across 19 nodes; however, the accounting information about the GPU allocations is incorrect:

$ sacct -P -j 11021559 --format=AllocTRES
AllocTRES
billing=19,cpu=19,gres/gpu=100,mem=38G,node=19
billing=19,cpu=19,gres/gpu=100,mem=38G,node=19
cpu=19,gres/gpu:gtx1080ti=100,gres/gpu=100,mem=0,node=19

(I've attached the full sacct output of the job as a CSV file.)

One of the records indicates gres/gpu:gtx1080ti=100, but we only have 64 x 1080ti in our cluster. It would be desirable to have the individual allocation for each node listed.

-greg
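To make the problem concrete: AllocTRES is a single comma-separated list of key=value pairs per record, so it only carries job-wide totals. A minimal sketch of parsing one of the records above (the helper name parse_tres is mine, not a Slurm API) shows there is simply no per-node field to recover:

```python
def parse_tres(tres):
    """Parse an AllocTRES string like
    'billing=19,cpu=19,gres/gpu=100,mem=38G,node=19' into a dict."""
    result = {}
    for item in tres.split(","):
        key, _, value = item.partition("=")
        result[key] = value
    return result

alloc = parse_tres("billing=19,cpu=19,gres/gpu=100,mem=38G,node=19")
# Only aggregate counts are present: 100 GPUs spread over 19 nodes,
# with no record of how many GPUs landed on each node.
print(alloc["gres/gpu"], alloc["node"])
```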
Greg,

This is again bug 8024; see comment 13. We are aware of the issue and plan on fixing it eventually. It has proven tricky, so we cannot guarantee when it will be fixed. We will let you know when a fix is available.

While this bug is still open, you could look at the NodeList field to determine which GPUs were actually allocated.

Good luck,
Scott

*** This ticket has been marked as a duplicate of ticket 8024 ***
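For the NodeList workaround: sacct reports the node list in Slurm's compressed hostlist form (e.g. gpu[01-03,07]), which you'd then cross-check against each node's GRES configuration. A rough sketch of expanding a simple single-bracket hostlist, assuming that format (on a real cluster, `scontrol show hostnames` does this authoritatively and handles all cases):

```python
import re

def expand_hostlist(hostlist):
    """Expand a simple Slurm-style hostlist such as 'gpu[01-03,07]' into
    individual node names. Handles only one bracket group; multi-prefix
    lists and nested forms are left to scontrol."""
    m = re.match(r"^([^\[]+)\[([^\]]+)\]$", hostlist)
    if not m:
        # No bracket group: already a single node name.
        return [hostlist]
    prefix, ranges = m.groups()
    names = []
    for part in ranges.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            width = len(lo)  # preserve zero-padding, e.g. '01' -> gpu01
            names.extend(f"{prefix}{i:0{width}d}"
                         for i in range(int(lo), int(hi) + 1))
        else:
            names.append(prefix + part)
    return names

print(expand_hostlist("gpu[01-03,07]"))
```

From the expanded list you can at least see which nodes the job touched, even though the per-node GPU count itself is still missing from the accounting records.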