If you call squeue with --jobs and pass it exactly one job ID that is invalid, it reports an error and exits with status 1.

[Christopher.W.Harrop@Hera:hfe03 ~]$ /apps/slurm/default/bin/squeue --jobs=1
slurm_load_jobs error: Invalid job id specified
[Christopher.W.Harrop@Hera:hfe03 ~]$ echo $?
1

However, if you call squeue with --jobs and pass it more than one job ID, it does not report an error, and it exits with status 0 even if ALL the job IDs are invalid.

[Christopher.W.Harrop@Hera:hfe03 ~]$ /apps/slurm/default/bin/squeue --jobs=1,2
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
[Christopher.W.Harrop@Hera:hfe03 ~]$ echo $?
0

This inconsistent behavior is confusing for people, such as our developers, who write code to interface with these commands.

Thanks,
Tony.
Hey Tony,

Just to get some clarification on a few things before I dig into the code.

> If you call squeue with --jobs and you pass it exactly one jobid that is
> invalid, it will return an error and exit with status=1.
>
> [Christopher.W.Harrop@Hera:hfe03 ~]$ /apps/slurm/default/bin/squeue --jobs=1
> slurm_load_jobs error: Invalid job id specified
> [Christopher.W.Harrop@Hera:hfe03 ~]$ echo $?
> 1

This seems correct to me: it prints the error message and exits with an error code of 1.

> However, if you call squeue with --jobs and you pass it more than one jobid,
> it will not report an error, and will return status=0 even if ALL the
> jobids are invalid.
>
> [Christopher.W.Harrop@Hera:hfe03 ~]$ /apps/slurm/default/bin/squeue
> --jobs=1,2
> JOBID PARTITION NAME USER ST TIME NODES
> NODELIST(REASON)
> [Christopher.W.Harrop@Hera:hfe03 ~]$ echo $?
> 0

This does seem odd, though. I assume your devs are looking for a return code of 1 here, correct? Are they also looking for an error message similar to the one in the previous example?

~Colby
(In reply to Colby Ashley from comment #2)
> > 0
> This does seem odd though, I assume your devs are looking for a return code
> of 1 here correct? Are they also looking for an error message similar to the
> previous example?
>
> ~Colby

Colby - right, we're in agreement that the first example is what we would expect. But the second leads to confusion, especially since both job IDs are invalid in this case. I can see an edge case where some of the job IDs are valid and some are invalid - then what do you do? Perhaps the answer there is to produce both output for the valid IDs and an error message for the invalid IDs.

Thanks for the quick response -
Tony.
Update: still looking into this. The error code is being returned in a special way, so it will take some time to figure out.
Hey Tony,

We can change a few things so that squeue returns an error code of 1 when all of the job IDs are invalid. Without a major rewrite of squeue, we cannot return an error code if some of the jobs are valid and some are not. Is this something you would still like done? You would still have to parse the output of squeue when running with multiple job IDs, though this could save a bit of time when all of the job IDs are invalid.

~Colby
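For illustration only, here is a rough sketch of what parsing the output for multiple job IDs might look like on the caller's side. It assumes squeue is run with --noheader and --format=%i so that each output line holds a single job ID; the helper names missing_job_ids and check_jobs are hypothetical and not part of Slurm. It also assumes array tasks are printed as "<jobid>_<index>", and it does not try to handle other ID forms such as heterogeneous job components.

```python
import subprocess


def missing_job_ids(requested, squeue_stdout):
    """Return the requested job IDs that do not appear in squeue's output.

    Assumes squeue was run with --noheader --format=%i, so every
    non-empty output line is one job ID.
    """
    seen = set()
    for line in squeue_stdout.splitlines():
        job_id = line.strip()
        if job_id:
            seen.add(job_id)
            # Assumption: an array task such as "123_4" counts as job "123".
            seen.add(job_id.split("_", 1)[0])
    return [str(j) for j in requested if str(j) not in seen]


def check_jobs(requested, squeue="squeue"):
    """Run squeue for the given job IDs; return (exit_status, missing_ids).

    The exit status alone is not enough with several IDs, so the
    caller should look at both values.
    """
    cmd = [
        squeue,
        "--noheader",
        "--format=%i",
        "--jobs=" + ",".join(str(j) for j in requested),
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # If squeue printed no jobs, every requested ID ends up in the missing list.
    return result.returncode, missing_job_ids(requested, result.stdout)
```

A caller could then treat a non-empty missing list the same way as a non-zero exit status, for example `status, missing = check_jobs([1, 2])`. Note that a job squeue no longer lists (for instance, one that finished some time ago) would also show up as missing, so "missing" here does not necessarily mean the ID was never valid.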
Closing. Please reopen if you want the error code changed when all of the job IDs are invalid.