[2015-07-13T13:32:17.017] _slurm_rpc_update_job complete JobId=4275962 uid=1260 usec=754
[2015-07-13T13:32:17.062] _part_access_check: uid 4294967294 access to partition teamoxford denied, bad group
[2015-07-13T13:32:17.063] _part_access_check: uid 4294967294 access to partition idle denied, bad group
[2015-07-13T13:32:17.063] _part_access_check: uid 4294967294 access to partition desktopBigMem denied, bad group
[2015-07-13T13:32:17.063] update_job: setting partition to lud54 for job_id 4275963

Any idea how Slurm got that UID? One of our users tried to update his jobs using his own bash function:

qu() { ### Queue Update
    if [[ -n "$1" ]]; then
        grep -v JOBID | awk -v pp=$1 '{
            print( "scontrol update job="substr($9,1,7)" priority="pp" partition=teamoxford,idle,desktopBigMem,lud54 " ) ;
            system ( "scontrol update job="substr($9,1,7)" priority="pp" partition=teamoxford,idle,desktopBigMem,lud54 " ) }'
    else
        grep -v JOBID | awk '{
            print( "scontrol update job="substr($9,1,7)" priority=500 partition=teamoxford,idle,desktopBigMem,lud54 " ) ;
            system( "scontrol update job="substr($9,1,7)" priority=500 partition=teamoxford,idle,desktopBigMem,lud54 " ) }'
    fi
else
    echo "qu Error"
fi
}

Some of the scontrol calls produced the access-denied errors shown above. I'm unable to reproduce his issue. It may have gone away after I restarted slurmctld.
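When trying to reproduce this, it may help to separate generating the scontrol commands from running them. The following is only a sketch: qu_dry_run is a hypothetical name, not part of the user's script, and it assumes the same input layout as qu (a header line containing JOBID, and the job ID in the first 7 characters of column 9). It prints the commands without executing anything.

```shell
# Hypothetical dry-run variant of the user's qu function (an assumption,
# not the original script): print the scontrol commands, do not run them.
qu_dry_run() {
    # Skip the JOBID header line, then build one command per job.
    # The priority defaults to 500 when no argument is given, as in qu.
    grep -v JOBID | awk -v pp="${1:-500}" '{
        printf "scontrol update job=%s priority=%s partition=teamoxford,idle,desktopBigMem,lud54\n", substr($9, 1, 7), pp
    }'
}
```

Feeding it the same listing the user pipes into qu shows exactly which job IDs and partitions each call would request, so a single printed command can then be run by hand under the user's account and matched against the slurmctld log.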
The UID value of 4294967294 reported by Slurm (-2 when read as a signed 32-bit integer) comes from the request's credential, which is generated by Munge. Munge initializes the UID and GID in its credentials with a value of -1, and that value is replaced once the credential is decoded; otherwise Munge should log an error. If you look at the Munge logs on both the client and the server, you should find some indication of why there was a failure. There may also be a Munge error in slurmctld's log file before the lines you included in the first message.

My best guess is that the user does not have an account on the node where slurmctld runs. They don't need login access, but the account should exist. The default Munge log file location is "/var/log/munge/munged.log".
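The checks suggested above can be sketched as a few shell helpers. This is a minimal sketch, not an official procedure: the function names are hypothetical, and it assumes getent, munge and unmunge are installed on the host. Note that 4294967294 reads as -2 in signed 32-bit form, while -1 corresponds to 4294967295.

```shell
# Reinterpret an unsigned 32-bit value as a signed 32-bit integer.
to_signed32() {
    if [ "$1" -ge 2147483648 ]; then
        echo $(( $1 - 4294967296 ))
    else
        echo "$1"
    fi
}

# Succeed if the given UID resolves to an account on this host.
# Meant to be run on the node where slurmctld runs.
uid_has_account() {
    getent passwd "$1" >/dev/null
}

# Encode a credential and decode it again on the same host.
# unmunge prints the decoded UID and GID, or an error on failure.
munge_roundtrip() {
    munge -n | unmunge
}
```

For example, `to_signed32 4294967294` prints -2. On the controller, `uid_has_account 1260 || echo "uid 1260 has no account here"` tests the missing-account guess, and running `munge_roundtrip` as the affected user shows whether a credential decodes with the expected UID. Any failure should also appear in /var/log/munge/munged.log.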
Were you able to determine the source of the bad Munge credential?
I was unable to trace the cause, and it seems this has not happened again.

Thanks Moe,
Akmal
What was the real UID of that user?

David
His real UID is 1260.

Akmal