| Summary: | sreport cluster UserUtilizationByAccount versus sreport job SizesByAccount : inconsistencies | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | |
| Component: | User Commands | Assignee: | Jacob Jenson <jacob> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | 6 - No support contract | ||
| Priority: | --- | ||
| Version: | 16.05.2 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | -Other- | Alineos Sites: | --- |

Hello,

I am currently trying to monitor the CPU usage of users on a cluster running Slurm. I found three ways to do this, but the results are inconsistent.

If I use 'sreport cluster UserUtilizationByAccount' (see actual command below), the 'Used' column gives me some numbers, even for an account that is no longer associated with my user ('grpdel').

If I use 'sreport job SizesByAccount' (see actual command below):
- the account 'grpdel' does not appear this time
- I obtain the same value for grp001
- I obtain a smaller value for grp002

So the same binary does not seem to report the same values depending on how we ask. Am I missing something here?

I then checked those values with sacct (see actual commands below) and obtained the same numbers as with 'sreport job SizesByAccount'.

My question is: which values are correct, meaning which values are checked against the limits imposed by QOS (or partitions or accounts)?

Thanks for your time,
Cyril

    $ sreport -t Seconds cluster UserUtilizationByAccount Users=username start=2001-01-01 end=2100-01-01
    --------------------------------------------------------------------------------
    Cluster/User/Account Utilization 2001-01-01T00:00:00 - 2017-09-14T22:59:59 (527119200 secs)
    Use reported in TRES Seconds
    --------------------------------------------------------------------------------
      Cluster     Login     Proper Name         Account       Used   Energy
    --------- --------- --------------- --------------- ---------- --------
      cluster  username   XXXXX XXXXXXX          grpdel 4798042480        0
      cluster  username   XXXXX XXXXXXX          grp001  183102536        0
      cluster  username   XXXXX XXXXXXX          grp002    6134353        0

    $ sreport -t Seconds job SizesByAccount Users=username start=2001-01-01 end=2100-01-01 grouping=1000
    --------------------------------------------------------------------------------
    Job Sizes 2001-01-01T00:00:00 - 2017-09-14T22:59:59 (527119200 secs)
    Time reported in Seconds
    --------------------------------------------------------------------------------
      Cluster   Account    0-999 CPUs  >= 1000 CPUs % of cluster
    --------- --------- ------------- ------------- ------------
      cluster    grp002       5406909             0        2.87%
      cluster    grp001     183102536             0       97.13%

    $ sacct -X -S 2001-01-01 -E 2100-01-01 -A grp001 --noheader -o CPUTimeRaw | awk '{sum+=$1} END {print sum}'
    183102536

    $ sacct -X -S 2001-01-01 -E 2100-01-01 -A grp002 --noheader -o CPUTimeRaw | awk '{sum+=$1} END {print sum}'
    5406909
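
To make the size of the mismatch explicit, the sacct summation above can be wrapped in a small helper and its total subtracted from the sreport 'Used' value. This is only a sketch: the function names `sum_cputime` and `usage_gap` are hypothetical, and the two numbers passed at the end are copied by hand from the grp002 output in this report.

```shell
#!/bin/sh
# Hypothetical helper: sum the first column of stdin.
# In real use the input would come from the sacct command shown in this report:
#   sacct -X -S 2001-01-01 -E 2100-01-01 -A <account> --noheader -o CPUTimeRaw
sum_cputime() {
    # %.0f prints the total as a plain integer, and prints 0 for empty input.
    awk '{sum += $1} END {printf "%.0f\n", sum}'
}

# Hypothetical helper: difference between a sreport 'Used' value ($1)
# and a sacct total ($2), both given as integers in CPU seconds.
usage_gap() {
    echo $(( $1 - $2 ))
}

# Values copied from the report for account grp002:
# sreport 'Used' = 6134353, sacct total = 5406909.
usage_gap 6134353 5406909   # prints 727444
```

For grp002 the gap is 727444 CPU seconds, while for grp001 both tools report 183102536, so the gap is 0.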