Ticket 150

Summary: sacct -l output
Product: Slurm Reporter: Don Lipari <lipari1>
Component: Bluegene select plugin    Assignee: Danny Auble <da>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue    
Priority: ---    
Version: 2.4.x   
Hardware: IBM BlueGene   
OS: Linux   
Site: LLNL
Attachments: Patch to make sacct not print errors on systems like BGQ on sub node jobs

Description Don Lipari 2012-10-23 08:57:30 MDT
For those looking to invoke the sacct -l option on BG/Q machines, all of the following fields display an error on certain job steps:  MaxVMSizeNode, MaxRSSNode, MaxPagesNode, and MinCPUNode.

Here's what the error looks like:
sacct: error: hostlist.c:1774 Invalid range: `10000x13331': Invalid argument
                            0          0          0
sacct: error: hostlist.c:1774 Invalid range: `10000x13331': Invalid argument

I suspected this was due to the dimension of the sub-block nodelist range being greater than what slurmdb_setup_cluster_dims() returns.  However, the following job step zero output displays fine:

sacct -a -o JobID,JobName,Partition,nodelist%30
       JobID    JobName  Partition                       NodeList 
------------ ---------- ---------- ------------------------------ 
22229             runit     pdebug                     vulcan0020 
22229.batch       batch                                vulcan0020 
22229.0      /nfs/tmp2+                   vulcan0020[10000x13331] 
22230             runit     pdebug                     vulcan0020 
22230.batch       batch                                vulcan0020 
22230.0      /nfs/tmp2+                   vulcan0020[10000x13331]

So I don't know what the optimal fix is:  fix the display or eliminate the problem fields from the sacct -l output on BG/Q systems.
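The suspected dims mismatch above can be illustrated with a minimal sketch. This is not Slurm's actual hostlist.c (which is far more general); it assumes a BG/Q-style range is a fixed-width coordinate pair like `10000x13331` (start, `x`, end, one digit per dimension), which a parser configured for a 1-dimension cluster would reject as an invalid argument:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical sketch, not Slurm's real parser: a 1-D range looks
 * like "lo-hi"; a multi-dimension (BG/Q) range looks like
 * "<dims digits>x<dims digits>".  If sacct believes the cluster has
 * one dimension, the "x" form fails -- the "Invalid range" error. */
int range_is_valid(const char *s, int dims)
{
    const char *x = strchr(s, 'x');
    if (dims > 1) {
        /* expect exactly dims digits, 'x', dims digits */
        if (!x || (x - s) != dims || (int)strlen(x + 1) != dims)
            return 0;
        for (const char *p = s; *p; p++)
            if (*p != 'x' && !isdigit((unsigned char)*p))
                return 0;
        return 1;
    }
    /* 1-D: digits, optionally "lo-hi"; an 'x' is an invalid argument */
    for (const char *p = s; *p; p++)
        if (*p != '-' && !isdigit((unsigned char)*p))
            return 0;
    return 1;
}
```

With dims = 5 (as a BG/Q 5-character coordinate would imply) the range parses; with dims = 1 it is rejected, matching the error above.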
Comment 1 Don Lipari 2012-10-23 09:12:36 MDT
For the record, I just noticed that the max_pages_node, max_rss_node, max_vsize_node, and min_cpu_node fields in the database for a job step are all '0' (not surprising really).  There is absolutely no node range to display!
Comment 2 Danny Auble 2012-10-23 09:29:20 MDT
None of this makes any sense unless you are actually using jobacct_gather.  I would suggest altering find_hostname() in sacct/process.c to just return if using a front-end system.  What do you think?
Comment 3 Danny Auble 2012-10-23 09:33:07 MDT
But this doesn't work on a multi-cluster system, since sacct doesn't get information about the cluster when it goes to get information about the job.
Comment 4 Don Lipari 2012-10-23 09:37:50 MDT
(In reply to comment #3)
> But this doesn't work on a multi cluster system since sacct doesn't get
> information about the cluster when it goes to get information about the job.

I suppose you could change find_hostname() to return NULL when the host is whatever it is when the field from the db is '0'.
Comment 5 Danny Auble 2012-10-23 09:42:26 MDT
I don't understand what you are referring to as '0'.  I would expect 0 to be a valid number for at least some jobs for any field.  What field are you referring to?
Comment 6 Don Lipari 2012-10-23 09:46:50 MDT
(In reply to comment #5)
> I don't understand what you are referring to as 0?  I would expect 0 to be a
> valid number for at least some jobs for any field.  What field are you
> referring to?

max_pages_node, max_rss_node, max_vsize_node, and min_cpu_node

I'm assuming that on a Linux cluster with jobacct_gather enabled, these fields are populated with a node name.
Comment 7 Danny Auble 2012-10-23 09:48:02 MDT
Your assumption would be wrong ;).  They contain a node index, so 0 is quite valid :).
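This is the crux of the problem: because the *_node columns store an index into the step's nodelist rather than a hostname, 0 means "the first node" and cannot also mean "nothing recorded". Distinguishing the two needs an out-of-band sentinel, which is what the later patch does with 4294967294 (Slurm's NO_VAL sentinel). A minimal sketch with illustrative names:

```c
#include <stdint.h>
#include <stddef.h>

#define NO_VAL 4294967294u   /* Slurm's "no value" sentinel */

/* Sketch: resolve a stored node index to a hostname.  Index 0 is a
 * perfectly valid answer (the first node); only the sentinel, or an
 * out-of-range index, means there is nothing to display. */
const char *nodeid_to_name(uint32_t nodeid,
                           const char **nodes, size_t nnodes)
{
    if (nodeid == NO_VAL || nodeid >= nnodes)
        return NULL;          /* nothing to display */
    return nodes[nodeid];     /* 0 => first node in the step */
}
```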
Comment 8 Danny Auble 2012-10-23 09:54:25 MDT
The only thing I can think of is to alter sacct to look up the cluster beforehand so it knows what to do in this case.
Comment 9 Don Lipari 2012-10-23 09:59:39 MDT
(In reply to comment #8)
> The only thing I can think of is alter sacct to look up the cluster before
> hand so it knows what to do in this case.

I suppose if there are no identifying characteristics in the contents of the four *_node fields when the node index is '0', then looking up the cluster sounds like a reasonable solution.
Comment 10 Danny Auble 2012-10-24 06:02:17 MDT
Created attachment 138 [details]
Patch to make sacct not print errors on systems like BGQ on sub node jobs

Here is a patch for 2.4 that fixes this. You will have to update the slurmdbd, slurmctld and sacct to make it work correctly.

Since this changes behaviour I am going to put it in 2.5 instead of 2.4.

You can update the database for older jobs as well by setting the respective nodeids to 4294967294 instead of 0 in the step_table of each cluster.
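A hedged sketch of that database update -- the column and table names follow the fields discussed above but should be verified against your slurmdbd schema before running anything, and `<cluster>` is a placeholder for each cluster's name. It assumes the cluster in question ran no real jobacct_gather plugin, so every stored 0 is a bogus index rather than a genuine first-node reading:

```sql
-- 4294967294 is Slurm's NO_VAL sentinel.  Run against the slurmdbd
-- accounting database, once per cluster.
UPDATE <cluster>_step_table
   SET max_pages_node = 4294967294,
       max_rss_node   = 4294967294,
       max_vsize_node = 4294967294,
       min_cpu_node   = 4294967294
 WHERE max_pages_node = 0
   AND max_rss_node   = 0
   AND max_vsize_node = 0
   AND min_cpu_node   = 0;
```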
Comment 11 Danny Auble 2012-10-26 08:12:14 MDT
FYI, this has been fixed completely in 2.5.  Now, if a cluster isn't running a real jobacct_gather plugin, all the statistics that plugin would have gathered will be blank in the output of sacct.