| Summary: | Correct way in Epilogue to determine if this is the last job... | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Brad Viviano <viviano.brad> |
| Component: | Configuration | Assignee: | Tim McMullan <mcmullan> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | | |
| Version: | 19.05.5 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | EPA | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | | Version Fixed: | |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | | |
| Attachments: | tar file containing our slurmd epilogue scripts. | | |
Created attachment 14568 [details]
tar file containing our slurmd epilogue scripts.
I've attached a tar of all our epilogue scripts. The one I am asking about is my "90-kill_on_exit" epilogue script. Basically, I would like the cleanup part of that script to run ONLY when it's being run by the last active job on the node. Thanks.
Sorry, one clarification: I meant to say "CF" for Configuring in my check below. See the updated code below:
HOSTNAME=`/bin/hostname`
JOB_COUNT=`/usr/local/bin/squeue --noheader --states=R,CF -w ${HOSTNAME} | /usr/bin/wc -l`
if [ ${JOB_COUNT} -eq 0 ]; then
...
fi
Again, the above seems to work correctly, but if multiple jobs all enter "COMPLETING" at the same time on the same node, the cleanup process runs multiple times.
Hi!
I did some checking into this and unfortunately I'm not sure there is currently a "good" way to do this. The epilog is only really aware of itself, and querying the slurmctld on every job isn't ideal since it's potentially a lot of load (depending on your job throughput). It might be possible to use the slurmrestd in 20.02 to make this a little better, but it will need to fetch that data from the slurmctld as well.
That said, my first thought on improving the script was to fetch "R,CF,CG" jobs in one go, then with judicious use of awk figure out if there are running jobs, and if not pick the last job in the "CG" state to do the cleanup.
I was playing with something like this, though I wouldn't use it without a lot more testing:
HOSTNAME=`/bin/hostname`
IFS=""
JOBS=`/usr/local/bin/squeue --noheader --states=R,CF,CG -w ${HOSTNAME}`
RUNNING_JOBS=`echo ${JOBS} | awk '{if ($5 == "R" || $5 == "CF") { i++ }}; END {print i}'`
LAST_JOB=`echo ${JOBS} | awk 'END {print $1}'`
unset IFS
if [[ ${RUNNING_JOBS} -eq 0 ]] && [[ "${LAST_JOB}" == "${SLURM_JOB_ID}" ]]; then
...
fi
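To sanity-check the counting logic in that sketch, the awk expressions can be exercised against canned squeue output. The job lines below are fabricated for illustration (column 5 is the state field in squeue's default output format), so this runs anywhere without a cluster:

```shell
#!/bin/sh
# Fabricated squeue output for one node: three jobs, all completing (CG).
# Column 5 is the job state in squeue's default output format.
JOBS='101 batch job1 alice CG 0:42 1 node01
102 batch job2 alice CG 0:40 1 node01
103 batch job3 alice CG 0:39 1 node01'

# Count jobs still running (R) or configuring (CF); n+0 prints 0 when none match.
RUNNING_JOBS=$(printf '%s\n' "$JOBS" | awk '$5 == "R" || $5 == "CF" {n++} END {print n+0}')

# The last job listed elects itself to run the cleanup.
LAST_JOB=$(printf '%s\n' "$JOBS" | awk 'END {print $1}')

echo "running=$RUNNING_JOBS last=$LAST_JOB"
```

With no R or CF jobs left, the run prints `running=0 last=103`, so only job 103's epilog would perform the cleanup.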
There might be some issues with array jobs with that concept though. It might be possible to make it work the way you have it using flock as well, but I could imagine that still having races, with the cleanup getting run more than once.
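As a quick illustration of why flock helps here, the following is a minimal, self-contained sketch (not the epilog itself; the files are throwaway temp files) showing `flock -x` serializing a read-modify-write that would otherwise race, the same way an epilog could serialize its "am I the last job?" check:

```shell
#!/bin/sh
# Minimal demonstration of flock(1) serializing concurrent critical
# sections. 20 backgrounded "epilogs" each read-modify-write a counter
# while holding an exclusive lock on fd 9, so no increment is lost.
LOCKFILE=$(mktemp)
COUNTER=$(mktemp)
echo 0 > "$COUNTER"

for i in $(seq 1 20); do
    (
        flock -x 9                 # block until we hold the exclusive lock
        n=$(cat "$COUNTER")
        echo $((n + 1)) > "$COUNTER"
    ) 9>"$LOCKFILE" &
done
wait
cat "$COUNTER"
```

With the lock held, the counter always reaches 20; without the `flock -x 9` line, concurrent writers typically lose updates. Applied to the epilog (one lock file per node, the squeue check inside the locked section), this closes the window where two epilogs both see zero remaining jobs, though as noted above it would still need some marker to keep the cleanup from running twice.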
I hope this helps!
Thanks,
--Tim
Hi! I just wanted to check and make sure this answered your question! Thanks! --Tim
Yes, thanks. You can close the case.
Thanks Brad! Closing now.
Hello,

Here is my scenario. Each of the nodes of our cluster has 32 cores. We don't allow users to share nodes, but we do allow a user to run multiple jobs on the same node (i.e. multiple single-core jobs). We have an epilogue script that, when the last job on a node completes, "cleans" the node (purges /tmp, /var/tmp, etc.) to make it ready for a new user. I will attach the slurmd epilogue script, but basically, what I was doing in my script was:

HOSTNAME=`/bin/hostname`
JOB_COUNT=`/usr/local/bin/squeue --noheader -w ${HOSTNAME} | /usr/bin/wc -l`
if [ ${JOB_COUNT} -eq 1 ]; then
... #Do whatever cleanup is needed
fi

The issue I ran into was that when multiple single-core jobs completed at or around the same time, the output of the above squeue command would show jobs in "CG" and "R" state. Then as each finished, I could/would have X jobs in "CG" state, then 0 jobs at all, so the count never passed through 1 and the epilogue script would fail to run the cleanup. I switched the logic to be:

HOSTNAME=`/bin/hostname`
JOB_COUNT=`/usr/local/bin/squeue --noheader --states=R,CG -w ${HOSTNAME} | /usr/bin/wc -l`
if [ ${JOB_COUNT} -eq 0 ]; then
...
fi

The above seems to have fixed the problem, except it can cause the epilogue script to run the cleanup multiple times, and I am wondering if there is a better way.

My question is: how do I determine in the epilogue script if ${SLURM_JOB_ID} is the last active job being run on that node by ${SLURM_JOB_USER}, so I can run the cleanup process reliably? Thanks.
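Since the question scopes the check to ${SLURM_JOB_USER}, squeue's --user filter can narrow the count to that user's jobs. Below is a hedged sketch: count_active is a hypothetical helper name, and the sample job lines are canned stand-ins for live squeue output so the counting logic can be exercised off-cluster:

```shell
#!/bin/sh
# count_active: hypothetical helper that counts jobs in R, CF, or CG state
# from squeue output on stdin (column 5 is the state in the default format).
count_active() {
    awk '$5 == "R" || $5 == "CF" || $5 == "CG" {n++} END {print n+0}'
}

# In a real epilog this would be fed live data, e.g.:
#   /usr/local/bin/squeue --noheader --user="${SLURM_JOB_USER}" \
#       -w "$(hostname)" | count_active

# Canned sample: two of alice's jobs still completing on this node.
ACTIVE=$(printf '%s\n' \
    '201 batch a.sh alice CG 0:10 1 node01' \
    '202 batch b.sh alice CG 0:09 1 node01' | count_active)
echo "$ACTIVE"
```

Here the sample prints 2, so this epilog would not be the last active job for that user; the same caveat about simultaneous "CG" transitions applies, so this still wants pairing with a lock or marker for the cleanup itself.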