Hello,

For a running job we can see exactly which CPUs/GPUs/etc. are allocated. E.g.

[root@DCALPH000 ~]# scontrol show -dd jobid=1360317 | grep IDs
     Nodes=dcalph134 CPU_IDs=24 Mem=0 GRES_IDX=gpu:p100(IDX:0)
[root@DCALPH000 ~]#

We believe that access to the same information would be useful for already completed jobs.

1) It would allow us to reconstruct what really happened on the cluster at any given time (or time interval).

1.1) For example, suppose we'd like to research the preemptor/preemptee relation. If we see that the (start, end) intervals of some jobs with different PriorityTier overlap, and that these jobs shared some nodes, that is still not sufficient evidence to conclude that preemption happened, because it is not clear whether there was any overlap in terms of, e.g., the CPU cores allocated on each of the nodes in question.

1.2) If we are trying to calculate core-second usage for each user per interval (say, every 10 minutes), we need to know exactly whether each job was running or suspended (preempted) inside that interval and, if it was suspended, for how long it was preempted within that particular interval.

2) For GPUs we have already set up a server-level DCGM/Prometheus/Grafana dashboard, and we now see historical metrics for each GPU of any server. But a job-centric presentation is still impossible, as Slurm does not save the allocated GPU IDs.

Of course we could still look into our own custom completion plugin(s), but we'd like to avoid any potential duplication of effort. So even if SchedMD is not planning to look into this in the foreseeable future, it would at least be interesting to understand where (Slurm accounting DB? Elasticsearch completion plugin? somewhere else?) it seems logical to SchedMD to store information about exact CPU/GPU assignments.
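To make the calculation in 1.2 concrete, here is a minimal sketch of the per-interval accounting we have in mind. It assumes we already had, for each job, the list of periods during which it was actually running (suspended time excluded); that input format, the function name core_seconds_per_bucket and the sample data are hypothetical and not something Slurm provides today.

```python
from collections import defaultdict


def core_seconds_per_bucket(segments, bucket=600):
    """Sum core-seconds per (user, bucket_start).

    segments: iterable of (user, cores, start, end) tuples in epoch seconds,
    one per period a job was really running. This input is an assumption:
    it is exactly the data we cannot currently get for completed jobs.
    """
    usage = defaultdict(int)
    for user, cores, start, end in segments:
        t = start
        while t < end:
            bucket_start = (t // bucket) * bucket
            # Clip the running period to the end of the current bucket.
            chunk_end = min(end, bucket_start + bucket)
            usage[(user, bucket_start)] += cores * (chunk_end - t)
            t = chunk_end
    return dict(usage)


# Hypothetical job: 4 cores, ran 0-300 s, was suspended, resumed 900-1500 s.
segments = [
    ("alice", 4, 0, 300),
    ("alice", 4, 900, 1500),
]
usage = core_seconds_per_bucket(segments)
```

With 10-minute buckets the suspended period (300-900 s) contributes nothing, which is the distinction that start/end times alone cannot give us.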
Hi Sergey Meirovich,

This is an interesting idea, but at this time there are no plans to tackle these changes.
Hello Jason,

Thanks for your answer. Could you please look into the second part of my question?

"... So even if SchedMD is not planning to look into this in a foreseeable future it would be at least interesting to understand where: slurm accounting db?/elastic search completion plugin?/somewhere else it seems logical to SchedMD to store information about exact CPU/GPU assignments."
Hi Sergey Meirovich,

> "... So even if SchedMD is not planning to look into this in a foreseeable future it would be at least interesting to understand where: slurm accounting db?/elastic search completion plugin?/somewhere else it seems logical to SchedMD to store information about exact CPU/GPU assignments."

Slurm does not currently record task placement / GPU placement in the accounting database. It does give an overview of what was used, e.g.

jason@nh-grey:~/slurm/master$ sacct -j 280 -o JobID,AllocGRES,AllocCPUS
       JobID    AllocGRES  AllocCPUS
------------ ------------ ----------
280                 gpu:0          2
280.batch           gpu:0          2
280.extern          gpu:0          2

Note that you can run "scontrol show job -d <job_id>" and find some more information in that output:

...
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
Nodes=m1 CPU_IDs=0-1 Mem=0 GRES=gpu(IDX:0)
...

An EpilogSlurmctld script may be able to capture this job's comment after the job is done.
https://slurm.schedmd.com/prolog_epilog.html
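As a rough illustration of what such a capture script could do with that output, the sketch below extracts the per-node CPU and GRES indices from "scontrol show job -d" text. The regular expression is based only on the two sample lines shown in this ticket (both the GRES= and GRES_IDX= spellings); other Slurm versions may format this line differently, and the helper names are hypothetical.

```python
import re

# Matches lines such as:
#   Nodes=m1 CPU_IDs=0-1 Mem=0 GRES=gpu(IDX:0)
#   Nodes=dcalph134 CPU_IDs=24 Mem=0 GRES_IDX=gpu:p100(IDX:0)
# Assumption: field order and names are as in the samples in this ticket.
ALLOC_RE = re.compile(
    r"\bNodes=(?P<nodes>\S+)\s+CPU_IDs=(?P<cpus>\S+)\s+Mem=(?P<mem>\S+)"
    r"(?:\s+GRES(?:_IDX)?=(?P<gres>\S*))?"
)


def expand_ids(spec):
    """Expand an ID list such as '0-1,4' into [0, 1, 4]."""
    ids = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        ids.extend(range(int(lo), int(hi or lo) + 1))
    return ids


def parse_allocations(text):
    """Return one dict per allocation line found in scontrol output."""
    return [
        {
            "nodes": m["nodes"],
            "cpu_ids": expand_ids(m["cpus"]),
            "gres": m["gres"] or "",
        }
        for m in ALLOC_RE.finditer(text)
    ]


sample = """\
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
     Nodes=m1 CPU_IDs=0-1 Mem=0 GRES=gpu(IDX:0)
     Nodes=dcalph134 CPU_IDs=24 Mem=0 GRES_IDX=gpu:p100(IDX:0)
"""
allocations = parse_allocations(sample)
```

In a real script the text would come from running scontrol for the finished job (for example via the SLURM_JOB_ID variable in the epilog environment); whether the detailed allocation is still reported at that point should be verified on the site's Slurm version.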
Rolling this into one ticket. *** This ticket has been marked as a duplicate of ticket 2047 ***