This looks like a bug to me, but it only happens for a handful of users, and it is repeatable.

# sacctmgr -i create User Name='ly07336' Cluster='hipergator' Account='mckenna' DefaultAccount='mckenna' QOS='mckenna,mckenna-b' DefaultQOS='mckenna' Fairshare=parent
 Adding User(s)
  ly07336
 Settings =
  Default Account = mckenna
 Associations =
  U = ly07336 A = mckenna C = hipergator
 Non Default Settings
  Fairshare = 2147483647
  QOS = mckenna,mckenna-b
--------------------------------------------------------------------------------

That looks good to me, but when I display the association I get the following, which is wrong.

[root@slurm1 ufrc]# sacctmgr show user ly07336 withassoc
User Def Acct Admin Cluster Account Partition Share MaxJobs MaxNodes MaxCPUs MaxSubmit MaxWall MaxCPUMins QOS Def QOS
---------- ---------- --------- ---------- ---------- ---------- --------- ------- -------- -------- --------- ----------- ----------- -------------------- ---------
ly07336 None 0 0 0 0 0 00:00:00 0

Now, if I do the same for a random user name, say "xxxxxx":

[root@slurm1 ufrc]# sacctmgr -i create User Name='xxxxxx' Cluster='hipergator' Account='mckenna' DefaultAccount='mckenna' QOS='mckenna,mckenna-b' DefaultQOS='mckenna' Fairshare=parent
 Adding User(s)
  xxxxxx
 Settings =
  Default Account = mckenna
 Associations =
  U = xxxxxx A = mckenna C = hipergator
 Non Default Settings
  Fairshare = 2147483647
  QOS = mckenna,mckenna-b

And when I then show the association, it is fine:

[root@slurm1 ufrc]# sacctmgr show user xxxxxx withassoc
User Def Acct Admin Cluster Account Partition Share MaxJobs MaxNodes MaxCPUs MaxSubmit MaxWall MaxCPUMins QOS Def QOS
---------- ---------- --------- ---------- ---------- ---------- --------- ------- -------- -------- --------- ----------- ----------- -------------------- ---------
xxxxxx mckenna None hipergator mckenna parent mckenna,mckenna-b mckenna

I feel like the database may be corrupted or may otherwise hold values that sacctmgr is not showing. Any ideas?
Anyone else reporting this?
Looking at the Slurm MySQL database, I found that the user names causing problems were still present in the database (deleted=0) even though they had been deleted with sacctmgr. This was not the case for other users or for the random "xxxxxx" user in the example. Not sure why that would be.
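For anyone wanting to run the same check, here is a rough sketch of a read-only lookup. It only prints SELECT statements for review. The table and column names (user_table.name, hipergator_assoc_table.user, deleted) are taken from this thread and are assumptions about your schema; the function name stale_user_queries is made up for illustration.

```shell
# Sketch: print read-only SQL that lists any rows still stored for the
# given user names. Nothing is executed against the database here.
# Assumption: user_table and hipergator_assoc_table each carry a
# "deleted" column, as the deleted=0 observation in this thread suggests.
# Names are interpolated without escaping, so pass trusted names only.
stale_user_queries() {
    for u in "$@"; do
        printf "SELECT name, deleted FROM user_table WHERE name = '%s';\n" "$u"
        printf "SELECT user, deleted FROM hipergator_assoc_table WHERE user = '%s';\n" "$u"
    done
}

# Compare an affected name with the throwaway "xxxxxx" name.
stale_user_queries ly07336 xxxxxx
```

If the statements look right, the output could be piped into the client, for example `stale_user_queries ly07336 | mysql slurm`. A row with deleted=0 for a name that sacctmgr had already removed would match what I saw.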
The three user associations that had this problem were:

ly07336
wperry
rmckenna

The problem seems to have been fixed by going into mysql and deleting the associated entries from the relevant tables.

mysql> use slurm;
mysql> delete from user_table where name = 'rmckenna';
Query OK, 1 row affected (0.00 sec)
mysql> delete from hipergator_assoc_table where user = 'rmckenna';
Query OK, 2 rows affected (0.00 sec)
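The same cleanup for all three names can be sketched as a small generator that prints the DELETE statements instead of running them, so they can be reviewed first. This is only an illustration of the manual fix above, not a supported procedure. The function name cleanup_statements is hypothetical, and the table names are specific to this cluster ("hipergator").

```shell
# Sketch: print the two DELETE statements used above for each user name.
# Nothing is executed; review the output before feeding it to mysql.
# Taking a dump of both tables first (for example with mysqldump) is a
# sensible precaution, since these deletes bypass sacctmgr entirely.
# Names are interpolated without escaping, so pass trusted names only.
cleanup_statements() {
    printf 'USE slurm;\n'
    for u in "$@"; do
        printf "DELETE FROM user_table WHERE name = '%s';\n" "$u"
        printf "DELETE FROM hipergator_assoc_table WHERE user = '%s';\n" "$u"
    done
}

cleanup_statements ly07336 wperry rmckenna
```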
This version of Slurm is no longer supported.