Ticket 15722 - Does sreport output show CPU utilization of GPU nodes?
Summary: Does sreport output show CPU utilization of GPU nodes?
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: - Unsupported Older Versions
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Felip Moll
Reported: 2023-01-03 03:29 MST by Tana Vinod
Modified: 2023-01-03 05:27 MST
Site: Cerence AI


Description Tana Vinod 2023-01-03 03:29:58 MST
Hi Team,

We are using Slurm version 20.02.

Here is the output of my sreport. Does the TRES name "cpu" include CPU utilization of GPU nodes as well?
Is there any way we can get the CPU utilization of GPU nodes separately?

#sreport -t percent -T ALL cluster utilization

Cluster Utilization 2023-01-02T00:00:00 - 2023-01-02T23:59:59
Usage reported in Percentage of Total
--------------------------------------------------------------------------------
  Cluster      TRES Name    Allocated     Down PLND Dow          Idle Reserved      Reported
--------- -------------- ------------ -------- -------- ------------- -------- -------------
     crg2            cpu       34.20%    0.00%    0.00%        60.47%    5.33%       100.00%
     crg2            mem       28.41%    0.00%    0.00%        71.59%    0.00%       100.00%
     crg2         energy        0.00%    0.00%    0.00%         0.00%    0.00%         0.00%
     crg2        billing       34.20%    0.00%    0.00%        65.80%    0.00%       100.00%
     crg2        fs/disk        0.00%    0.00%    0.00%         0.00%    0.00%         0.00%
     crg2           vmem        0.00%    0.00%    0.00%         0.00%    0.00%         0.00%
     crg2          pages        0.00%    0.00%    0.00%         0.00%    0.00%         0.00%
     crg2       gres/gpu       64.36%    0.00%    0.00%        35.64%    0.00%       100.00%
     crg2    gres/gpu:t4       51.89%    0.00%    0.00%        48.11%    0.00%       100.00%
     crg2 gres/gpu:volta       79.33%    0.00%    0.00%        20.67%    0.00%       100.00%
     crg2 gres/gpu:ampe+       73.20%    0.00%    0.00%        26.80%    0.00%       100.00%
Comment 1 Felip Moll 2023-01-03 05:01:10 MST
(In reply to Tana Vinod from comment #0)
> Hi Team,
> 
> We are using Slurm version 20.02.
> 
> Here is the output of my sreport. Does the TRES name "cpu" include CPU
> utilization of GPU nodes as well?
> Is there any way we can get the CPU utilization of GPU nodes separately?
> 

Hi,

The "cpu" TRES includes the CPU utilization of every node that has been allocated, whether it is a GPU node or not.
GPU cores and GPU memory (as opposed to the physical CPU cores associated with them in gres.conf) are not accounted for in Slurm.
Slurm 23.05 adds a new plugin that accounts for GPU usage on AMD GPUs; see commit 65e5c787ab.
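
As a rough illustration of how to read the report in comment #0, the sketch below parses a few of its rows and pulls out the single cluster-wide "cpu" line next to the per-type "gres/gpu" lines. It is only a sketch: the function name parse_utilization and the column keys are made up for this example, and it assumes the whitespace-separated layout pasted in this ticket (other sreport options may produce a different layout).

```python
# Rows copied from the report in comment #0 (separator lines omitted).
REPORT = """\
  Cluster      TRES Name    Allocated     Down PLND Dow          Idle Reserved      Reported
     crg2            cpu       34.20%    0.00%    0.00%        60.47%    5.33%       100.00%
     crg2            mem       28.41%    0.00%    0.00%        71.59%    0.00%       100.00%
     crg2       gres/gpu       64.36%    0.00%    0.00%        35.64%    0.00%       100.00%
     crg2    gres/gpu:t4       51.89%    0.00%    0.00%        48.11%    0.00%       100.00%
"""

# Hypothetical key names for the six percentage columns, in report order.
COLUMNS = ("allocated", "down", "planned_down", "idle", "reserved", "reported")


def parse_utilization(text):
    """Map each TRES name to a dict of its percentage columns."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 8 fields: cluster, TRES name, six percentages.
        # The header splits into more fields and is skipped.
        if len(fields) != 8 or not fields[-1].endswith("%"):
            continue
        tres = fields[1]
        rows[tres] = {
            name: float(value.rstrip("%"))
            for name, value in zip(COLUMNS, fields[2:])
        }
    return rows


usage = parse_utilization(REPORT)

# One "cpu" row covers the CPUs of all nodes, GPU nodes included.
print("cpu allocated:", usage["cpu"]["allocated"])

# GPU allocation is reported per GRES type, separately from "cpu".
for tres, columns in usage.items():
    if tres.startswith("gres/gpu"):
        print(tres, "allocated:", columns["allocated"])
```

The report has exactly one "cpu" row per cluster, so there is no per-node-type CPU figure to extract from this output; only the GPU GRES rows are broken down by type.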

Does this answer your question?
Comment 2 Tana Vinod 2023-01-03 05:05:34 MST
Thanks Felip, for your prompt response.
Comment 3 Felip Moll 2023-01-03 05:27:31 MST
You're welcome. Don't hesitate to open new bugs if you have more questions :)