Ticket 14374

Summary: scontrol show job chops pathnames into multiple lines
Product: Slurm
Reporter: Ole.H.Nielsen <Ole.H.Nielsen>
Component: User Commands
Assignee: Director of Support <support>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
Version: 21.08.8
Hardware: Linux
OS: Linux
Site: DTU Physics

Description Ole.H.Nielsen@fysik.dtu.dk 2022-06-22 03:51:14 MDT
We parse the output of "scontrol show job <jobid>" when generating certain reports to users.  Unfortunately, long file pathnames in the output get chopped into multiple lines, and I can't see any way of restoring the original pathnames.

Can this pathname chopping be removed, or is there a way to filter the output of scontrol to generate the unchopped output?

Here is an example job with long pathnames:

$ scontrol show job 5127762 
JobId=5127762 JobName=defect_Si_N_220620_CZe8SrKI
   UserId=kenghua(283834) GroupId=qwise(17000) MCS_label=N/A
   Priority=337923 Nice=0 Account=qwise QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=1-00:22:01 TimeLimit=2-02:00:00 TimeMin=N/A
   SubmitTime=2022-06-20T04:55:53 EligibleTime=2022-06-20T04:55:53
   AccrueTime=2022-06-20T04:55:53
   StartTime=2022-06-21T11:20:53 EndTime=2022-06-23T13:20:53 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-06-21T11:20:53 Scheduler=Main
   Partition=xeon40 AllocNode:Sid=sylg:26070
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=a127
   BatchHost=a127
   NumNodes=1 NumCPUs=40 NumTasks=40 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=40,mem=368000M,node=1,billing=66
   Socks/Node=* NtasksPerN:B:S:C=40:0:*:* CoreSpec=*
   MinCPUsNode=40 MinMemoryCPU=9200M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI
   WorkDir=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI
   StdErr=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/defect_Si_N_220620_CZe8SrKI.log
   StdIn=/dev/null
   StdOut=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/defect_Si_N_220620_CZe8SrKI.log
   Power=

Please notice the chopped lines for the Command, WorkDir, StdErr, StdOut pathnames.
Comment 2 Caden Ellis 2022-06-22 16:51:28 MDT
Hello, 

Have you tried the -o (--oneliner) option?

scontrol show job 5127762 -o

This should print everything on one line.

https://slurm.schedmd.com/scontrol.html#OPT_oneliner

Best regards,

Caden Ellis
Comment 3 Ole.H.Nielsen@fysik.dtu.dk 2022-06-22 23:56:47 MDT
(In reply to Caden Ellis from comment #2)
> Have you tried the -o (--oneliner) option?
> 
> scontrol show job 5127762 -o
> 
> This should print everything on one line.

Yes, I had already tried the -o and it also chops pathnames:

$ scontrol -o show job 5127762
JobId=5127762 JobName=defect_Si_N_220620_CZe8SrKI UserId=kenghua(283834) GroupId=qwise(17000) MCS_label=N/A Priority=337923 Nice=0 Account=qwise QOS=normal JobState=RUNNING Reason=None Dependency=(null) Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0 RunTime=1-00:14:55 TimeLimit=2-02:00:00 TimeMin=N/A SubmitTime=2022-06-20T04:55:53 EligibleTime=2022-06-20T04:55:53 AccrueTime=2022-06-20T04:55:53 StartTime=2022-06-21T11:20:53 EndTime=2022-06-23T13:20:53 Deadline=N/A SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-06-21T11:20:53 Scheduler=Main Partition=xeon40 AllocNode:Sid=sylg:26070 ReqNodeList=(null) ExcNodeList=(null) NodeList=a127 BatchHost=a127 NumNodes=1 NumCPUs=40 NumTasks=40 CPUs/Task=1 ReqB:S:C:T=0:0:*:* TRES=cpu=40,mem=368000M,node=1,billing=66 Socks/Node=* NtasksPerN:B:S:C=40:0:*:* CoreSpec=* MinCPUsNode=40 MinMemoryCPU=9200M MinTmpDiskNode=0 Features=(null) DelayBoot=00:00:00 OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null) Command=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI WorkDir=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI StdErr=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/defect_Si_N_220620_CZe8SrKI.log StdIn=/dev/null StdOut=/home/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/defect_Si_N_220620_CZe8SrKI.log Power= 


Probably the pathname chopping in scontrol needs to be addressed.

Thanks,
Ole
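
For report scripts that consume this output, a parsing-side workaround can be sketched independently of where the line breaks come from: split the record only at whitespace that is followed by something that looks like a `Key=`, so a break inside a value stays part of that value. This is a minimal Python sketch, not anything scontrol provides. The function name `parse_job_record` and the key pattern are assumptions made for illustration, and the sample text is a shortened copy of the output above.

```python
import re

# Assumed key shape: a letter, then letters, digits, '_', '/' or ':' up to '='.
# Intended to cover keys such as CPUs/Task, AllocNode:Sid and ReqB:S:C:T.
KEY_START = re.compile(r"\s+(?=[A-Za-z][A-Za-z0-9_/:]*=)")


def parse_job_record(text):
    """Split one job record into a dict, keeping line breaks inside values."""
    fields = {}
    for token in KEY_START.split(text.strip()):
        # Only the first '=' separates key and value (TRES=cpu=40,... stays whole).
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


# Shortened copy of the record shown above, including the break in WorkDir.
sample = (
    "JobId=5127762 JobName=defect_Si_N_220620_CZe8SrKI "
    "TRES=cpu=40,mem=368000M,node=1,billing=66 "
    "WorkDir=/home/qwise/kenghua/SiN_1.28/sample1\n"
    "/defect_Si_N_220620_CZe8SrKI StdIn=/dev/null Power="
)
job = parse_job_record(sample)
print(repr(job["WorkDir"]))
```

The heuristic has a known limit: a value that itself contains whitespace followed by text of the form `word=` would be split in the wrong place, so it is best treated as a stopgap for report generation.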
Comment 4 Caden Ellis 2022-06-23 14:49:58 MDT
Hello Ole,

That is interesting behavior. I was not able to reproduce this issue with normal long name file paths. 

... WorkDir=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI
   StdErr=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-109.out
   StdIn=/dev/null
   StdOut=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-109.out
...

(In case it is not clear, the path is all on one line)

I did replicate your issue once I put a newline after "sample1" in the file path. 

...
   WorkDir=/tmp/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI
   StdErr=/tmp/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-110.out
   StdIn=/dev/null
   StdOut=/tmp/qwise/kenghua/SiN_1.28/sample1
/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-110.out
...

When I do "ls -b" in the directory above "sample1", it shows like this in my Ubuntu terminal:

sample1\n
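
The same check can be scripted when many directories need to be inspected. The following is a small Python sketch that lists directory entries whose names contain control characters such as a newline. The helper name `odd_names` is made up for this example, and the demonstration assumes a POSIX filesystem that permits such names.

```python
import os
import tempfile


def odd_names(directory):
    """Return entry names in `directory` that contain control characters."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.name
            for entry in entries
            if any(ord(ch) < 32 or ord(ch) == 127 for ch in entry.name)
        )


# Demonstration in a throwaway directory: one normal name, one ending in "\n".
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sample1"))
    os.mkdir(os.path.join(top, "sample1\n"))
    found = odd_names(top)
    for name in found:
        print(repr(name))  # repr() makes the hidden newline visible
```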

Also, sacct --json would show you if there are newline characters in the file paths:

"...ise\/kenghua\/SiN_1.28\/sample1\n\/defect_Si_N_220620_CZe8SrKI\/run.scri..."

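If the JSON is saved to a file or piped into a script, the search for embedded newlines can be automated without knowing the layout of the document. The sketch below walks any parsed JSON value and reports strings that contain control characters. The key names in the sample (`jobs`, `cwd`, `name`) are placeholders for illustration and are not claimed to match the sacct schema.

```python
import json


def strings_with_control_chars(node, path="$"):
    """Yield (location, string) for each JSON string holding a control character."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from strings_with_control_chars(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from strings_with_control_chars(value, f"{path}[{index}]")
    elif isinstance(node, str):
        if any(ord(ch) < 32 or ord(ch) == 127 for ch in node):
            yield path, node


# Made-up document; the key names are placeholders, not the sacct schema.
doc = json.loads(r'{"jobs": [{"cwd": "\/tmp\/sample1\n\/defect", "name": "ok"}]}')
hits = list(strings_with_control_chars(doc))
for location, value in hits:
    print(location, repr(value))
```
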
Can you verify if there is a newline after your sample1 directory name?
Comment 5 Ole.H.Nielsen@fysik.dtu.dk 2022-06-23 23:55:06 MDT
Hi Caden,

(In reply to Caden Ellis from comment #4)
> That is interesting behavior. I was not able to reproduce this issue with
> normal long name file paths. 
> 
> ...
> WorkDir=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI
>    StdErr=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-109.out
>    StdIn=/dev/null
>    StdOut=/tmp/qwise/kenghua/SiN_1.28/sample1/defect_Si_N_220620_CZe8SrKI/run.script.defect_Si_N_220620_CZe8SrKI/slurm-109.out
> ...
> 
> (In case it is not clear, the path is all on one line)
> 
> I did replicate your issue once I put a newline after "sample1" in the file
> path. 

Ah, that sounds really weird, I've never encountered such file names with newlines before!  I did a check of the user's folder now:

[root@niflfs1 SiN_1.28]# pwd
/home/qwise/kenghua/SiN_1.28
[root@niflfs1 SiN_1.28]# ls -l
total 112
-rw-r--r--. 1 kenghua qwise 110592 Jun 20 04:56 project.nl
drwxr-xr-x. 2 kenghua qwise    125 Jun 20 04:47 sample1
drwxr-xr-x. 5 kenghua qwise    109 Jun 20 04:55 sample1?
-rw-r--r--. 1 kenghua qwise   1734 Jun 16 07:17 submission_setup
[root@niflfs1 SiN_1.28]# ls -1d sample* | od -c
0000000   s   a   m   p   l   e   1  \n   s   a   m   p   l   e   1  \n
0000020  \n
0000021

Oddly, there is a folder named "sample1\n" so you were right!

Thanks for the feedback, and we can close this case now.

Best regards,
Ole