Ticket 4490

Summary: With sacct, using --units corrupts the output of nnodes
Product: Slurm
Reporter: Loris Bennett <loris.bennett>
Component: User Commands
Assignee: Felip Moll <felip.moll>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
Priority: ---
CC: chris, felip.moll
Version: 17.02.7
Hardware: Linux
OS: Linux
Site: Swinburne
Version Fixed: 17.11.1

Description Loris Bennett 2017-12-08 00:26:15 MST
With sacct, if the option '--units' is used when the column 'nnodes' is displayed, the output of 'nnodes' is corrupted, e.g.

$ sacct -S 2017-12-08T06:00:00 -s CD -o jobid,maxrss,nnodes --units=G | head
       JobID     MaxRSS   NNodes 
------------ ---------- -------- 
1875298                    0.00G 
1875298.bat+      3.62G    0.00G 
1875303                    0.00G 
1875303.bat+      3.57G    0.00G 
1875318                    0.00G 
1875318.bat+      3.63G    0.00G 
1875330                    0.00G 
1875330.bat+      4.84G    0.00G 

Without '--units' the output is correct:

$ sacct -S 2017-12-08T06:00:00 -s CD -o jobid,maxrss,nnodes | head
       JobID     MaxRSS   NNodes 
------------ ---------- -------- 
1875263                        1 
1875263.bat+   4510920K        1 
1875298                        1 
1875298.bat+   3792772K        1 
1875303                        1 
1875303.bat+   3739416K        1 
1875318                        1 
1875318.bat+   3806832K        1 

According to another user on another site, this error also occurs in version 17.11.0.
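
On affected versions, one possible workaround is to run sacct without '--units' and convert the MaxRSS column afterwards, leaving NNodes untouched. The following is a minimal sketch in Python, not part of Slurm. The helper names to_gigabytes and convert_row are hypothetical, and it assumes the output was produced with 'sacct -n -P -o jobid,maxrss,nnodes' (no header, '|'-delimited) and that memory values carry a K, M, G or T suffix with 1024-based units, as the figures above suggest (3792772K corresponds to 3.62G).

```python
import re

# Hypothetical helpers, not part of Slurm. Assumption: memory values look like
# "3792772K" and use 1024-based units.
_EXPONENT = {"K": 1, "M": 2, "G": 3, "T": 4}


def to_gigabytes(value):
    """Convert a value such as '3792772K' to '3.62G'.

    Empty fields and fields without a recognised suffix are returned unchanged.
    """
    text = value.strip()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT])", text)
    if match is None:
        return text
    number, suffix = match.groups()
    size_in_bytes = float(number) * 1024 ** _EXPONENT[suffix]
    return f"{size_in_bytes / 1024 ** 3:.2f}G"


def convert_row(row):
    """Convert only the MaxRSS field (second column) of a 'jobid|maxrss|nnodes' row."""
    jobid, maxrss, nnodes = row.rstrip("\n").split("|")
    return "|".join([jobid, to_gigabytes(maxrss), nnodes])


if __name__ == "__main__":
    # Sample rows in the assumed 'sacct -n -P' layout, using values from this report.
    for sample in ["1875298||1", "1875298.batch|3792772K|1"]:
        print(convert_row(sample))
```

Because only the second column is passed through the conversion, the node count is printed exactly as sacct reported it.
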
Comment 1 Christopher Samuel 2017-12-09 20:59:37 MST
I can confirm this bug at Swinburne (who do have a support contract).

$ sacct -u csamuel -o jobid,nnodes,ncpus,reqmem,maxrss,elapsed -S 2017-07-01 --units=G
       JobID   NNodes      NCPUS     ReqMem     MaxRSS    Elapsed
------------ -------- ---------- ---------- ---------- ----------
2               0.00G          1     0.10Gc              00:00:00
2.batch         0.00G          1     0.10Gc              00:00:00
3               0.00G          1     0.10Gc              00:00:00
3.batch         0.00G          1     0.10Gc              00:00:00

$ sacct --version
slurm 17.11.0
Comment 4 Felip Moll 2017-12-19 02:07:00 MST
Hi Loris/Christopher,

This is just a quick update to inform you that we have already identified the problem and we have a patch pending for review and commit. Will be fixed officially asap.

Thanks
Felip M
Comment 5 Christopher Samuel 2017-12-19 04:59:37 MST
On 19/12/17 8:07 pm, bugs@schedmd.com wrote:

> This is just a quick update to inform you that we have already
> identified the problem and we have a patch pending for review and
> commit. Will be fixed officially asap.

Great news, thanks Felip!
Comment 7 Felip Moll 2017-12-20 03:01:12 MST
(In reply to Christopher Samuel from comment #5)
> On 19/12/17 8:07 pm, bugs@schedmd.com wrote:
> 
> > This is just a quick update to inform you that we have already
> > identified the problem and we have a patch pending for review and
> > commit. Will be fixed officially asap.
> 
> Great news, thanks Felip!

Hi,

The fix for this issue was committed in 763e396e575d69176838f47d3c194df708e621d4 and is available in 17.11.1 and later.

Thanks for reporting,
Felip M