Ticket 16830 - sacct sometimes reports the wrong uid
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 23.02.2
Hardware: Linux Linux
Severity: 6 - No support contract
Assignee: Jacob Jenson
Reported: 2023-05-25 18:15 MDT by Joseph Guzman
Modified: 2023-05-25 18:16 MDT
Site: -Other-
Linux Distro: RHEL


Description Joseph Guzman 2023-05-25 18:15:37 MDT
Since updating from Slurm v22.05.7 to v23.02.2, we have noticed that sacct reports jobs from two users when a single user is specified with the -u flag. For the jobs from the unexpected second user, sacct also reports the same uid as for the requested user, which differs from the uid that standard system utilities report for that user on all the hosts.

For example:

$ sacct -u user1 -X -P -o jobid,timelimit,user,uid -S 2023-05-23 --noheader
1|01:00:00|user2|16074
4|01:00:00|user2|16074
7|01:00:00|user2|16074
9|01:00:00|user2|16074
79|01:00:00|user2|16074
86|01:00:00|user2|16074
89|01:00:00|user2|16074
90|01:00:00|user2|16074
180|01:00:00|user2|16074
181|01:00:00|user2|16074
185|01:00:00|user2|16074
186|01:00:00|user2|16074
194|01:00:00|user2|16074
47879|04:00:00|user1|16074
47912|02:00:00|user1|16074
47917|02:00:00|user1|16074
47918|04:00:00|user1|16074
47938|02:00:00|user1|16074
47941|02:00:00|user1|16074
47942|02:00:00|user1|16074
$

After looping through the sacct output, we found 248 instances where sacct reported the wrong uid. Judging by the SubmitLine values that we checked, the username was correct for those job IDs. Whenever sacct showed a user-to-uid mismatch, it was the same wrong uid each time: a valid uid, but one that belongs to another user. The mismatches occur for a small minority of jobs submitted by several other users.
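
For anyone who wants to run a similar check, here is a minimal sketch of the kind of loop described above. It is not the exact script we used. The function names (find_uid_mismatches, system_uid, run_sacct, main) are made up for illustration, and it assumes the local passwd/NSS database on the host is the authoritative source for uids. It only uses the sacct options and fields already shown in the example above, plus -a to include all users.

```python
import pwd
import subprocess


def find_uid_mismatches(sacct_lines, lookup_uid):
    """Return (jobid, user, reported_uid, expected_uid) for each row whose
    sacct uid disagrees with lookup_uid(user).

    sacct_lines: iterable of "jobid|user|uid" strings (sacct -P --noheader).
    lookup_uid:  callable mapping a username to an int uid, or None if unknown.
    """
    mismatches = []
    for line in sacct_lines:
        fields = line.strip().split("|")
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        jobid, user, uid_text = fields
        try:
            reported = int(uid_text)
        except ValueError:
            continue  # uid column was empty or not numeric
        expected = lookup_uid(user)
        if expected is not None and expected != reported:
            mismatches.append((jobid, user, reported, expected))
    return mismatches


def system_uid(user):
    """Look up a uid the way standard system utilities would (assumption:
    the local passwd/NSS database is authoritative)."""
    try:
        return pwd.getpwnam(user).pw_uid
    except KeyError:
        return None


def run_sacct(start):
    """Run sacct for all users since `start` and return its output lines."""
    cmd = ["sacct", "-a", "-X", "-P", "--noheader",
           "-o", "jobid,user,uid", "-S", start]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def main():
    # Prints one "jobid|user|reported_uid|expected_uid" line per mismatch.
    for row in find_uid_mismatches(run_sacct("2023-05-23"), system_uid):
        print("|".join(str(value) for value in row))
```

Calling main() on a host where sacct is available prints only the jobs whose reported uid differs from the system's uid for that username. Users the host cannot resolve are skipped rather than flagged.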

Are there any known problems with importing the Slurm database from a v22.05.7 instance into a v23.02.2 one? If not, what else could be causing this? We're using slurmdbd for accounting.

Thanks,

Joseph