Created attachment 11131
slurm.conf

I'm sure this is a configuration problem somewhere, but I'm not seeing it. I have a small cluster set up on CentOS 7 with fair-share accounting configured. However, after a large number of jobs have run, I'm not seeing any usage accruing in sshare:

bash-4.2$ sshare
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
root                                          1.000000           0      1.000000   0.500000
 webservice                           1024    0.500000           0      0.000000   1.000000
  rast                                 512    0.250000           0      0.000000   1.000000
  seed                                 512    0.250000           0      0.000000   1.000000

The jobs are all run with an account specified with -A. I have JobAcctGatherType=jobacct_gather/linux (slurm.conf is attached).

I'm not sure what would be useful to add to help track this down; let me know.

Thank you!
Bob
Digging further. We have lots of data in the accounting database:

mysql> select * from maas_assoc_usage_day_table limit 10;
+---------------+------------+---------+----+---------+------------+------------+
| creation_time | mod_time   | deleted | id | id_tres | time_start | alloc_secs |
+---------------+------------+---------+----+---------+------------+------------+
|    1561093200 | 1561093200 |       0 |  0 |       1 | 1561006800 |       1515 |
|    1561179600 | 1561194000 |       0 |  0 |       1 | 1561093200 |    3249006 |
|    1561266000 | 1561276800 |       0 |  0 |       1 | 1561179600 |    6220434 |
|    1561352400 | 1561352400 |       0 |  0 |       1 | 1561266000 |    5295102 |
|    1561438800 | 1561446000 |       0 |  0 |       1 | 1561352400 |    6078030 |
|    1561525200 | 1561532400 |       0 |  0 |       1 | 1561438800 |    6220494 |
|    1561611600 | 1561626000 |       0 |  0 |       1 | 1561525200 |    6220620 |
|    1561698000 | 1561701600 |       0 |  0 |       1 | 1561611600 |    8992801 |
|    1561784400 | 1561784400 |       0 |  0 |       1 | 1561698000 |   14694290 |
|    1561870800 | 1561870800 |       0 |  0 |       1 | 1561784400 |    6543524 |
+---------------+------------+---------+----+---------+------------+------------+
10 rows in set (0.00 sec)

mysql> select count(*) from maas_assoc_usage_day_table ;
+----------+
| count(*) |
+----------+
|      192 |
+----------+
1 row in set (0.04 sec)

However, all entries are id=0. From a read of the source, that id should correspond to ids in the cluster assoc_table. There we don't have an id=0:

mysql> select id_assoc, user, acct, parent_acct from maas_assoc_table;
+----------+------+------------+-------------+
| id_assoc | user | acct       | parent_acct |
+----------+------+------------+-------------+
|        1 |      | root       |             |
|        2 | root | root       |             |
|        3 |      | webservice | root        |
|        4 |      | seed       | webservice  |
|        5 |      | rast       | webservice  |
+----------+------+------------+-------------+
5 rows in set (0.00 sec)

So this feels like it is indeed a configuration problem on the accounting somewhere.
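The mismatch described above (usage rows whose id has no counterpart in the assoc table) can be checked mechanically. The following is only a sketch: the function name orphan_assoc_ids is made up for illustration, and it assumes the two id columns have already been exported to plain files with one id per line (for example with the mysql client; the export step is not shown because the database name is not given in this thread). The sample data is copied from the query output above.

```shell
#!/bin/sh
# Print ids that occur in the usage rollup but not in the association table.
# Both arguments are files holding one id per line (hypothetical exports of
# maas_assoc_usage_day_table.id and maas_assoc_table.id_assoc).
orphan_assoc_ids() {
    usage_ids_file=$1
    assoc_ids_file=$2
    # -x: whole-line match, -F: fixed strings, -v: keep only non-matching ids.
    # grep exits non-zero when nothing is printed, hence the "|| true".
    sort -u "$usage_ids_file" | grep -vxFf "$assoc_ids_file" || true
}

# Data from the thread: every usage row had id=0, assoc ids are 1..5.
tmpdir=$(mktemp -d)
printf '0\n' > "$tmpdir/usage_ids"
printf '1\n2\n3\n4\n5\n' > "$tmpdir/assoc_ids"
orphan_assoc_ids "$tmpdir/usage_ids" "$tmpdir/assoc_ids"   # prints: 0
rm -rf "$tmpdir"
```

Any id printed here is usage that cannot be attributed to an association, which is consistent with sshare showing zero RawUsage for every account.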
Usage is associated with the account properly:

bash-4.2$ sacct -A seed | head
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
512641_3           wrap        all       seed          4  COMPLETED      0:0
512641_3.ba+      batch                  seed          4  COMPLETED      0:0
512641_13          wrap        all       seed          4  COMPLETED      0:0
512641_13.b+      batch                  seed          4  COMPLETED      0:0
512641_14          wrap        all       seed          4  COMPLETED      0:0
512641_14.b+      batch                  seed          4  COMPLETED      0:0
512641_23          wrap        all       seed          4  COMPLETED      0:0
512641_23.b+      batch                  seed          4  COMPLETED      0:0

But not with the association:

bash-4.2$ sacct -x 4
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------

bash-4.2$ sacct -x 0 | head
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
512641_3           wrap        all       seed          4  COMPLETED      0:0
512641_3.ba+      batch                  seed          4  COMPLETED      0:0
512641_13          wrap        all       seed          4  COMPLETED      0:0
512641_13.b+      batch                  seed          4  COMPLETED      0:0
512641_14          wrap        all       seed          4  COMPLETED      0:0
512641_14.b+      batch                  seed          4  COMPLETED      0:0
512641_23          wrap        all       seed          4  COMPLETED      0:0
512641_23.b+      batch                  seed          4  COMPLETED      0:0
One final comment for anyone who might wander here: apparently the reason I was able to submit all these jobs was that AccountingStorageEnforce is not set by default, so jobs are accepted even when no matching association exists.
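For anyone wanting to check their own setup, here is a rough sketch of a helper that reports whether a slurm.conf-style file sets AccountingStorageEnforce. The function name enforce_setting is invented for this example, and the matching is deliberately simple: it looks for the exact spelling only and does not strip trailing comments. The sample file below is a stand-in, not the attached slurm.conf.

```shell
#!/bin/sh
# Print the AccountingStorageEnforce value found in a slurm.conf-style file,
# or "unset" when no uncommented line sets it. If the key appears more than
# once, the last occurrence is reported.
enforce_setting() {
    conf=$1
    value=$(sed -n 's/^[[:space:]]*AccountingStorageEnforce[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1)
    echo "${value:-unset}"
}

# Stand-in config file for illustration only.
tmpconf=$(mktemp)
printf 'JobAcctGatherType=jobacct_gather/linux\n' > "$tmpconf"
enforce_setting "$tmpconf"    # prints: unset
printf 'AccountingStorageEnforce=associations\n' >> "$tmpconf"
enforce_setting "$tmpconf"    # prints: associations
rm -f "$tmpconf"
```

With AccountingStorageEnforce=associations in effect, a job submitted by a user who has no association with the requested account is rejected at submit time instead of running with its usage left unattributed.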
I think the comment with my solution was lost. The problem was that I hadn't associated users with the accounts. Even though the jobs were submitted with an account specified, no matching association was found, so the usage was not logged properly.
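As a sketch of what that fix can look like: associations between users and accounts are created with sacctmgr. The wrapper below is hypothetical (add_user_assoc and the RUN_SACCTMGR switch are made up for this example), and "alice" is a placeholder user name; seed and rast are the accounts from this thread. By default it only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Build the sacctmgr command that associates a user with an account.
# Dry run by default: the command is printed, not executed.
# Set RUN_SACCTMGR=1 to actually run it (requires Slurm admin rights).
add_user_assoc() {
    user=$1
    account=$2
    # -i (--immediate) commits the change without the interactive prompt.
    cmd="sacctmgr -i add user name=$user account=$account"
    if [ "${RUN_SACCTMGR:-0}" = "1" ]; then
        $cmd
    else
        echo "$cmd"
    fi
}

# "alice" is a placeholder; substitute the real submitting users.
add_user_assoc alice seed   # prints: sacctmgr -i add user name=alice account=seed
add_user_assoc alice rast   # prints: sacctmgr -i add user name=alice account=rast
```

Once the associations exist, sacctmgr show assoc should list a row per user and account, and newly run jobs should start accruing usage under those rows in sshare -a.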