Hello,

I have an observant user pointing out that "squeue -o%all" can produce invalid output, which is causing some level of distress.

squeue -l -j <job> -o %all
Tue Sep 13 16:00:38 2016
ACCOUNT|GRES|MIN_CPUS|MIN_TMP_DISK|END_TIME|FEATURES|GROUP|OVER_SUBSCRIBE|JOBID|NAME|COMMENT|TIME_LIMIT|MIN_MEMORY|REQ_NODES|COMMAND|PRIORITY|QOS|REASON|PsÈ/ü|ST|USER|RESERVATION|WCKEY|EXC_NODES|NICE|S:C:T|JOBID|EXEC_HOST|CPUS|NODES|DEPENDENCY|ARRAY_JOB_ID|GROUP|SOCKETS_PER_NODE|CORES_PER_SOCKET|THREADS_PER_CORE|ARRAY_TASK_ID|TIME_LEFT|TIME|NODELIST|CONTIGUOUS|PARTITION|PRIORITY|NODELIST(REASON)|START_TIME|STATE|USER|SUBMIT_TIME|LICENSES|CORE_SPEC|SCHEDNODES|WORK_DIR
...

Note that between "REASON" and "ST" things get silly. Based on the rather clever way %all works:
https://github.com/SchedMD/slurm/blob/slurm-17.11/src/squeue/opts.c#L567
we can infer this is caused by "%s" (since it sits between "%r" and "%t").

The user complained about this some time ago (16.05 or 17.02) and I'm just finding the ticket now. In any case, I still see random invalid data in 17.11.5:

dmj@edison01:~> squeue --format="%s" | sort | tail
Md�h�*
dmj@edison01:~>

It appears that the invalid output I'm seeing in cori's current queue is in the header. It seems that this is because _print_job_select_jobinfo is trying to run:

select_g_select_jobinfo_sprint(NULL, select_buf, sizeof(select_buf), SELECT_PRINT_HEAD);

I'm guessing select/cray is not implementing this correctly. Please keep in mind that we need select/cray to do the right thing here whether or not the slurm build is for cray (our elogin nodes, where squeue runs, are not built for native cray). However, the invalid output is present on both native cray and linux builds when select/cray is the select plugin.

Looks like this is called out as a FIXME:
https://github.com/SchedMD/slurm/blob/de4c76ebfe53628be255d2dbf30a2c45631776cb/src/plugins/select/cray/select_cray.c#L2597

The time has arrived. =)

Thanks,
Doug
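The inference that the garbled column belongs to "%s" can be checked with a small sketch. It assumes that %all expands to one field per letter, a through z and then A through Z, separated by "|"; that assumption matches the header order in the output above but is not copied from opts.c. The helper name build_all_format is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not Slurm code): build "%a|%b|...|%z|%A|...|%Z"
 * into out. Returns 0 on success, -1 if out is too small.
 */
static int build_all_format(char *out, size_t out_size)
{
    static const char ranges[2][2] = { { 'a', 'z' }, { 'A', 'Z' } };
    size_t pos = 0;

    assert(out != NULL);

    for (int r = 0; r < 2; r++) {
        for (char c = ranges[r][0]; c <= ranges[r][1]; c++) {
            /* Prefix every field except the first with the separator. */
            int n = snprintf(out + pos, out_size - pos, "%s%%%c",
                             pos ? "|" : "", c);
            if (n < 0 || (size_t)n >= out_size - pos)
                return -1; /* output would have been truncated */
            pos += (size_t)n;
        }
    }
    return 0;
}
```

Under that assumption the 19th field is "%s", which lines up with the garbled header between REASON ("%r") and ST ("%t").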
select/alps prints a header
select/cons_res puts in an empty string (without checking buffer length)
select/linear same
select/serial same
select/cray explicitly causes trouble (subject of this ticket)
select/bluegene is complicated

It seems like all of them should at least put in a header, if only so there isn't a mysterious empty column in %all output.
Created attachment 6656
give %s format option a header in squeue
Hi Doug,

The solution for this issue is in commit d3398004245fcc29c5c8f93311957fb3960dc6b2.

We decided that the cray plugin will treat %s the same way the serial, linear, and cons_res plugins do: by returning an empty string. This is the solution for 17.11. We intend to either remove the column or give it a header independent of the plugin in 18.08.
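For illustration only, the behavior described here (an empty string for the %s header) might look like the sketch below. The names jobinfo_sprint, print_select_column, and enum print_mode are hypothetical stand-ins, not the code from the commit or Slurm's plugin API. The sketch also checks the buffer length before writing, and has the caller start from an empty buffer so that a formatter that writes nothing cannot leak stack contents.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the plugin's print modes; not Slurm's enum. */
enum print_mode { PRINT_HEAD, PRINT_DATA };

/*
 * Hypothetical plugin-side formatter. In header mode it writes an empty
 * string, but only after confirming the buffer can hold the NUL byte.
 * Returns buf on success, NULL if buf is unusable.
 */
static char *jobinfo_sprint(const char *data, char *buf, size_t size,
                            enum print_mode mode)
{
    if (buf == NULL || size == 0)
        return NULL;

    switch (mode) {
    case PRINT_DATA:
        /* snprintf truncates if needed and always NUL-terminates here. */
        snprintf(buf, size, "%s", data ? data : "");
        break;
    case PRINT_HEAD:
    default:
        buf[0] = '\0'; /* empty header, never uninitialized bytes */
        break;
    }
    return buf;
}

/* Hypothetical caller: the buffer starts out empty before the plugin runs. */
static int print_select_column(FILE *out, const char *data,
                               enum print_mode mode)
{
    char select_buf[64] = "";

    if (jobinfo_sprint(data, select_buf, sizeof(select_buf), mode) == NULL)
        return -1;
    /* Invariant: the buffer is NUL-terminated before it is printed. */
    assert(memchr(select_buf, '\0', sizeof(select_buf)) != NULL);
    return fprintf(out, "%s|", select_buf) < 0 ? -1 : 0;
}
```

Initializing the buffer in the caller is a belt-and-braces choice: even a plugin that ignores header mode entirely would then yield an empty column rather than random bytes.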
I'm going to close this ticket. Please reopen it should you have any further issues with the squeue %s option.