Ticket 3132

Summary: scontrol/sacct output inconsistent in epilog
Product: Slurm
Reporter: Davide Vanzo <davide.vanzo>
Component: Scheduling
Assignee: Tim Wickberg <tim>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
Priority: ---
CC: brian.gilmer
Version: 16.05.4   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=3207
Site: Vanderbilt

Description Davide Vanzo 2016-09-29 09:28:28 MDT
We have a user who needs to record the memory used and the job duration at the end of each job. Under Torque/Moab he did this with an epilog script, and he has tried to port the same setup to Slurm. The problem is that when the epilog script is triggered by an srun inside a batch script submitted with sbatch, scontrol reports the job as still running or completing (instead of completed), and the wall time and memory usage reported by sacct differ from the values that can be obtained after the job has finished. Interestingly, if the job is submitted directly with srun (i.e. without using sbatch), scontrol and sacct called in the epilog script return the correct values.
Any idea what might be going on here?

Davide
Comment 1 Moe Jette 2016-09-29 10:13:51 MDT
There is a mail program, "smail", that comes with Slurm in the "contribs/seff" directory and in the "slurm-seff" RPM, and it will probably do what you want. It waits for the job to complete and then includes accounting information in the email.
Comment 2 Davide Vanzo 2016-09-29 10:24:16 MDT
Moe,
thanks for the suggestion.
However, the user wants to keep control from within the cluster, since he has a fairly complicated set of scripts to manage submission.

DV
Comment 6 Davide Vanzo 2016-10-25 08:58:04 MDT
Tim,
this issue is still open for us.
As I said in my previous reply, the email route is not feasible. Is there a specific reason for this inconsistency? Is it a database issue?

Davide
Comment 7 Tim Wickberg 2016-10-25 09:30:35 MDT
Slurm does not have a mechanism for printing post-job statistics as described. I'm aware that other resource managers do have such a feature, and I've seen several similar requests on slurm-dev before. If desired I can open an enhancement request for it, but cannot say when/if this would be addressed.

There is a delay sending job statistics to the database, and the final collection time happens as the job completes. Running sstat during the epilog delays this final update to the database, so you're seeing slightly-old data.

The data for the individual steps themselves (launched through srun) will have been finalized, and I'd expect that to be usually consistent. But there is still a race condition between delivering the final result set to slurmdbd and the sstat query retrieving these values. There are a number of scalability and reliability issues that lead to this asynchronous behavior, and this asynchronous behavior will always mean that retrieving the accounting data while the job has not completed (it is not complete until the epilog finishes) may lead to inaccuracies such as what you're seeing here.

One potential alternative is to run the profiled portion as a separate step within the job, and then use sstat within the batch script after that step has completed to retrieve the accounting information for that single step. You'd want to check that the step has been marked as complete, and retry until it has, to avoid similar issues. (Or you could just add a 'sleep 30' between the srun and sstat calls and assume that's close enough.)
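The check-and-retry idea above could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not something posted in this ticket: it polls sacct (rather than sstat, in line with the correction given later in this thread) for a single step until that step reports a finished state. The helper names (parse_sacct_line, query_step, wait_for_step), the retry count, the delay, and the set of states treated as finished are all hypothetical choices; the sacct options used are -j, --noheader, --parsable2 and --format.

```python
import subprocess
import time

# Step states treated as "finished". These are standard Slurm state names,
# but which ones are worth waiting for is an assumption of this sketch.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL"}

# Accounting fields requested from sacct, in output order.
FIELDS = ["JobID", "State", "Elapsed", "MaxRSS"]


def parse_sacct_line(line):
    """Turn one 'a|b|c|d' line of sacct --parsable2 output into a dict."""
    return dict(zip(FIELDS, line.strip().split("|")))


def query_step(step_id, run=subprocess.run):
    """Ask sacct about one step (e.g. '12345.0'); return its record or None."""
    cmd = ["sacct", "-j", step_id, "--noheader", "--parsable2",
           "--format=" + ",".join(FIELDS)]
    result = run(cmd, capture_output=True, text=True, check=True)
    for line in result.stdout.splitlines():
        record = parse_sacct_line(line)
        if record.get("JobID") == step_id:
            return record
    return None  # the step has not reached the accounting database yet


def wait_for_step(step_id, query=query_step, retries=10, delay=5.0,
                  sleep=time.sleep):
    """Poll until the step is finished; return its record, or None on give-up."""
    for _ in range(retries):
        record = query(step_id)
        # A state may carry a suffix such as "CANCELLED by 1000";
        # only the first word is compared.
        state = (record or {}).get("State", "").split(" ")[0]
        if state in TERMINAL_STATES:
            return record
        sleep(delay)
    return None
```

Inside the batch script, this would be called after the srun for the profiled step has returned, passing the step ID built from the SLURM_JOB_ID environment variable and the step number (the first srun step in a job is step 0, so "<jobid>.0"). The returned Elapsed and MaxRSS values are whatever sacct reports at that moment; given the asynchronous delivery described above, a caller may still want a generous retry budget.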
Comment 9 Davide Vanzo 2016-10-25 10:04:02 MDT
Tim,
I apologize for the noise. The message got sent twice.

Anyway, I will certainly join the club of those requesting that a post-completion statistics option be added to Slurm. It would be helpful to many of our users, and it would also be simpler to use.

Thanks.

Davide
Comment 10 Tim Wickberg 2016-10-25 10:21:37 MDT
One correction to my prior message - you'd want to use sacct, not sstat. sstat only works while the step is still executing.

I've opened 3207 to track that enhancement request, although there are some serious architectural hurdles to overcome to provide such a feature, and I can't say when/if we'd address that.
Comment 11 Davide Vanzo 2016-10-25 11:01:32 MDT
Perfect. Thanks Tim.

Davide
Comment 12 Tim Wickberg 2016-10-25 13:10:42 MDT
Was there anything else I can answer on this, or can I mark this one back to resolved? 

Bug 3207 will cover any potential enhancement to provide this within Slurm; we've been discussing it internally and have a rough idea of how to implement it, but won't necessarily get to it.

If Vanderbilt would be interested in sponsoring that work, that's a quick way to get it moved to the top of the list; we can take that discussion directly to email if so.
Comment 13 Tim Wickberg 2016-11-09 15:16:35 MST
Marking closed - bug 3207 will cover the potential enhancement.