Ticket 5813 - Users with capital letters in their usernames cannot submit jobs
Summary: Users with capital letters in their usernames cannot submit jobs
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 17.11.7
Hardware: Linux Linux
Severity: 2 - High Impact
Assignee: Tim Wickberg
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-10-04 11:49 MDT by Steve Ford
Modified: 2018-10-17 14:51 MDT

Site: MSU
Version Fixed: 18.08.2


Attachments
slurm.conf (3.05 KB, text/plain)
2018-10-04 11:49 MDT, Steve Ford
sacctmgr patch (556 bytes, patch)
2018-10-17 12:11 MDT, Brian Christiansen
dbd upper patch (1001 bytes, patch)
2018-10-17 13:38 MDT, Brian Christiansen

Description Steve Ford 2018-10-04 11:49:46 MDT
Created attachment 7957 [details]
slurm.conf

We originally suspected this was a problem with our job submit script, however, this issue persisted after disabling the lua job submit plugin.

These users exist in the accounting database and have access to the account and partition they are submitting to. Their jobs are rejected with "srun: error: Unable to allocate resources: Invalid account or account/partition combination specified"


Here are steps to reproduce this issue:

# useradd UPPER
# sacctmgr add user UPPER account=general
There is no uid for user 'upper'
Are you sure you want to continue? (You have 30 seconds to decide)
(N/y): y
 Adding User(s)
  upper
 Associations =
  U = upper     A = general    C = linux     
 Non Default Settings
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
# su - UPPER
$ srun -p OneNode -A general hostname
srun: error: Unable to allocate resources: Invalid account or account/partition combination specified
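In the transcript above, sacctmgr echoes the name back as 'upper' and reports that it has no uid, while the login name is UPPER. A minimal sketch for spotting login names that would be affected in the same way, assuming the only transformation involved is plain lower-casing (the function names here are hypothetical helpers, not part of Slurm):

```shell
# Lower-case a name, mirroring what the sacctmgr output above suggests
# happened to 'UPPER'. Assumption: simple ASCII lower-casing.
lowercase_name() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Succeed when lower-casing changes the name, i.e. it contains capitals.
is_case_sensitive_name() {
    [ "$(lowercase_name "$1")" != "$1" ]
}

# Read login names on stdin (one per line) and print those with capitals.
list_affected_names() {
    while IFS= read -r name; do
        if is_case_sensitive_name "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

On a Linux host, something like `getent passwd | cut -d: -f1 | list_affected_names` could feed it the local login names.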
Comment 1 Tim Wickberg 2018-10-04 12:17:20 MDT
By default, slurmdbd lower-cases everything. That's why sacctmgr is printing back the lower-case name and asking you to confirm that this is who you wish to add. This has been a very long-standing part of our accounting system, and is documented as an intentional limitation.

In 18.08, we have added Parameters=PreserveCaseUser as a config option to slurmdbd.conf which disables this case normalization.

You should be able to take advantage of that just with an upgraded slurmdbd process, although we would obviously recommend upgrading the rest of the cluster to 18.08 in the near term as well.
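The setting described above is a single line in slurmdbd.conf, Parameters=PreserveCaseUser. As a rough sketch, the helper below checks whether a given config file enables it. The function name is hypothetical, the file path is supplied by the caller (no default location is assumed), and matching the option name case-insensitively is an assumption:

```shell
# Succeed if the given slurmdbd.conf-style file has an uncommented
# "Parameters=" line whose comma-separated list includes PreserveCaseUser.
# Assumption: the option name may be matched case-insensitively.
has_preserve_case_user() {
    conf_file="$1"
    grep -Eiq \
        '^[[:space:]]*Parameters[[:space:]]*=([^#]*,)?[[:space:]]*PreserveCaseUser[[:space:]]*(,|#|$)' \
        "$conf_file"
}
```

For example, `has_preserve_case_user /path/to/slurmdbd.conf && echo enabled`, with the path replaced by wherever the site keeps its slurmdbd.conf.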

Re-tagging this as Sev3 - this is a known limitation, and is only affecting a subset of your users, and as such should not have been submitted at Sev1. Please see https://www.schedmd.com/support.php for further details outlining this.

- Tim
Comment 2 Steve Ford 2018-10-04 12:28:58 MDT
Thanks. We'll work on updating to 18.08.
Comment 3 Tim Wickberg 2018-10-04 18:17:30 MDT
Marking this closed as a duplicate of bug 5432 which added in this new option.

*** This ticket has been marked as a duplicate of ticket 5432 ***
Comment 4 Steve Ford 2018-10-16 14:43:51 MDT
Tim,

We updated to 18.08.1 and set Parameters=PreserveCaseUser in slurmdbd.conf. After deleting the all-lowercase entries from the database, the users were added and could submit jobs. Unfortunately, we found that restarting the slurmctld service causes these users to be unable to submit jobs until they are deleted from and re-added to the database. This does not affect users with capital letters in their names who were added to the database after the update.

Another odd behavior is that users who were in the database before the update have their usernames displayed as all lowercase and no default account in the sacctmgr output, even after being deleted and re-added with the proper case.

For example:

# sacctmgr show user xxxxxxxXMICH
      User   Def Acct     Admin 
---------- ---------- --------- 
xxxxxxxxm+                 None 
# sacctmgr del user xxxxxxxXMICH
 Deleting users...
  xxxxxxxxmich
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
# sacctmgr add user xxxxxxxXMICH account=xmich
 Adding User(s)
  xxxxxxxXMICH
 Associations =
  U = xxxxxxxXM A = xmich      C = msuhpcc   
 Non Default Settings
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
# sacctmgr show user xxxxxxxXMICH
      User   Def Acct     Admin 
---------- ---------- --------- 
xxxxxxxxm+                 None


Any ideas on what might be causing this?


Thanks,
Steve
Comment 5 Steve Ford 2018-10-17 06:53:33 MDT
I'm updating this to Sev 2 since this is preventing a large set of our users from submitting jobs.
Comment 6 Brian Christiansen 2018-10-17 09:48:05 MDT
Hey Steve,

I can reproduce this and know what's going on. Let me think through it a little more and I'll get back to you with instructions.

Thanks,
Brian
Comment 7 Brian Christiansen 2018-10-17 12:11:13 MDT
Created attachment 8045 [details]
sacctmgr patch

OK. I'm working on patches that will solve this problem. The problem is that the previous lower-case names were still in the database (just marked as deleted) when the users were re-added, so the stored name wasn't updated to the new case-sensitive name.

There are a couple of ways to fix the issue: (1) directly in the database, or (2) with a small patch and sacctmgr. Option 2 may be the simplest, since it handles the user table and the account coordinator table and pushes the updates to the controller (so you don't need to restart slurmctld).

The attached patch will allow you to do:
sacctmgr mod user <user_name> set newname=<case sensitive name>

After you do this, "sacctmgr show user" should show the default account again and the user should be able to submit jobs again.

Can you apply the patch and try using newname? Let me know how it goes.
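For a site with many affected users, the rename step could be scripted. The sketch below only prints the sacctmgr commands for review and does not run them. It assumes the existing database entry is the all-lowercase form of each name, and that the newname= option is available, which per this comment requires the attached patch. The function name is a hypothetical helper:

```shell
# Print one "sacctmgr mod user ... set newname=..." line per case-sensitive
# name given as an argument. Names that are already all lowercase are
# skipped. Nothing is executed; the output is meant to be reviewed first.
print_rename_commands() {
    for name in "$@"; do
        lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$lower" != "$name" ]; then
            printf 'sacctmgr mod user %s set newname=%s\n' "$lower" "$name"
        fi
    done
}
```

Calling `print_rename_commands UPPER2 bob` emits a single command for UPPER2 and nothing for bob.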
Comment 8 Steve Ford 2018-10-17 13:06:52 MDT
Brian,

I applied the patch and updated a user like you suggested and their default account shows in the output of sacctmgr. Their username, however, is still all lower case and, like before, they cannot submit jobs after slurmctld is restarted unless they are deleted from and re-added to the database. Also, after re-adding them, their default account is missing again.

While we do want sacctmgr fixed, we are mostly interested in resolving the issue that's requiring us to delete and re-add users each time slurmctld is restarted.

Thanks,
Steve
Comment 9 Brian Christiansen 2018-10-17 13:38:08 MDT
Created attachment 8049 [details]
dbd upper patch

Hmm. Not sure why that didn't work. I've attached a patch that updates the user name when the user is added. Can you try this patch? Only slurmdbd needs to be updated.

Do you use account coordinators as well?

e.g.
brian@lappy:~/slurm/18.08/lappy$ sacctmgr show user upper2
      User   Def Acct  Def WCKey     Admin 
---------- ---------- ---------- --------- 

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show user upper2 withdeleted
      User   Def Acct  Def WCKey     Admin 
---------- ---------- ---------- --------- 
    upper2      stuff                 None 

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show assoc user=upper2 format=cluster,account,user
   Cluster    Account       User
---------- ---------- ----------

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show assoc user=upper2 withdeleted user=upper2 format=cluster,account,user
   Cluster    Account       User
---------- ---------- ----------
    lappy2      stuff     upper2
     lappy      stuff     upper2

brian@lappy:~/slurm/18.08/lappy$ sacctmgr add user UPPER2 account=stuff
 Adding User(s)
  UPPER2
 Associations =
  U = UPPER2    A = stuff      C = lappy     
  U = UPPER2    A = stuff      C = lappy2    
 Non Default Settings
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show assoc user=upper2 format=cluster,account,user
   Cluster    Account       User
---------- ---------- ----------
    lappy2      stuff     UPPER2
     lappy      stuff     UPPER2

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show user upper2
      User   Def Acct  Def WCKey     Admin 
---------- ---------- ---------- --------- 
    UPPER2      stuff                 None 

brian@lappy:~/slurm/18.08/lappy$ scontrol show assoc flags=users user=UPPER2
Current Association Manager state

User Records

UserName=UPPER2(1002) DefAccount=stuff DefWckey=(null) AdminLevel=Not Set





And an example of newname=

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show user upper2
      User   Def Acct  Def WCKey     Admin 
---------- ---------- ---------- --------- 
    upper2      stuff                 None 

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show assoc user=upper2 format=cluster,account,user
   Cluster    Account       User 
---------- ---------- ---------- 
    lappy2      stuff     upper2 
     lappy      stuff     upper2 

brian@lappy:~/slurm/18.08/lappy$ scontrol show assoc flags=users user=UPPER2
Current Association Manager state

User Records

UserName=upper2(4294967294) DefAccount=stuff DefWckey=(null) AdminLevel=Not Set

brian@lappy:~/slurm/18.08/lappy$ sacctmgr mod user upper2 set newname=UPPER2
 Modified users...
  upper2
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show user upper2
      User   Def Acct  Def WCKey     Admin 
---------- ---------- ---------- --------- 
    UPPER2      stuff                 None 

brian@lappy:~/slurm/18.08/lappy$ sacctmgr show assoc user=upper2 format=cluster,account,user
   Cluster    Account       User 
---------- ---------- ---------- 
    lappy2      stuff     UPPER2 
     lappy      stuff     UPPER2 

brian@lappy:~/slurm/18.08/lappy$ scontrol show assoc flags=users user=UPPER2
Current Association Manager state

User Records

UserName=UPPER2(1002) DefAccount=stuff DefWckey=(null) AdminLevel=Not Set
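In the scontrol output above, the broken record shows UserName=upper2(4294967294) while the fixed one shows UserName=UPPER2(1002). A small sketch for listing records in the first state, assuming 4294967294 is the placeholder printed when no uid could be resolved for the stored name (the function name is hypothetical):

```shell
# Read "scontrol show assoc flags=users" style output on stdin and print
# the user names whose uid field is 4294967294.
# Assumption: that value marks a name with no resolved uid.
unresolved_users() {
    sed -n 's/^UserName=\([^(]*\)(4294967294).*/\1/p'
}
```

It could be fed with `scontrol show assoc flags=users | unresolved_users` on the controller host.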
Comment 11 Steve Ford 2018-10-17 14:12:22 MDT
Brian,

I applied the dbd patch and everything looks good! After deleting and re-adding these users, they display properly in sacctmgr and they can continue to submit jobs after slurmctld is restarted.

Thank you!
Comment 12 Brian Christiansen 2018-10-17 14:13:20 MDT
Awesome! Glad to hear. I'll let you know when we get the patches committed.
Comment 14 Brian Christiansen 2018-10-17 14:51:43 MDT
Hey Steve,

The patches have been committed and will be available in 18.08.2:

https://github.com/SchedMD/slurm/commit/1af72a1781437df5dccb264211a23962e39314dc
https://github.com/SchedMD/slurm/commit/ceca378cfaacbc9b9da9294fc5d3184b292ac2f7

Additionally, if these users were coordinators they will need to get re-added as well to correct their user names.

Please reopen if you have any other issues.

Thanks,
Brian