I'm trying to print how much memory a job used into the user's output with an epilog script. As a test, I'm just trying to print "hello world" into the job's output.

In my slurm.conf file I have the line:

Epilog=/opt/slurmScripts/epilog.sh

which defines the epilog script. Each of my nodes has epilog.sh located at /opt/slurmScripts. I know the script runs, because when epilog.sh is just:

echo "test" > /tmp/slurm

the /tmp/slurm file gets written. But if the contents of epilog.sh are:

echo "Hello World"

or

print "Hello World"

I don't see "Hello World" in the job's output (or anywhere). The documentation at https://slurm.schedmd.com/prolog_epilog.html says that print ... writes to the task's standard output. Is there any way I can print information to the job's output?
Hi Ian,

The print functionality is specific to task prologs. For reference, there is a FAQ entry about using it in the TaskProlog here: https://slurm.schedmd.com/faq.html#task_prolog

I know that doesn't help with what you are trying to do in the epilog. Writing to the job output file is limited to the TaskProlog because the file gets closed before the epilog scripts run. There have been some requests to add the ability to write to the job output file in more cases, but it's not currently available.

I don't know whether your users vary the names and locations they use for their output files. If there is some sort of standard location they use, then you should be able to have the epilog script write to the job's output file. Here's an example of how that might look:

#!/bin/bash
OUTPUT=/home/$SLURM_JOB_USER/slurm/slurm-$SLURM_JOBID.out
echo "------------------" >> "$OUTPUT"
echo "In epilog" >> "$OUTPUT"
date >> "$OUTPUT"
env | grep SLURM >> "$OUTPUT"
exit 0

Does that look like something that might work, or do your output locations vary too much?

Thanks,
Ben Roberts
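For reference, the TaskProlog approach mentioned above works by having slurmd forward any line the prolog writes to stdout that begins with "print " into the task's standard output. A minimal sketch of such a task prolog (the message text and function name are illustrative, not from the thread):

```shell
#!/bin/bash
# Minimal TaskProlog sketch: slurmd forwards stdout lines that begin
# with "print " to the task's standard output (per the Slurm FAQ).
# The message here is just an example.

task_prolog_msg() {
    echo "print Hello from job ${SLURM_JOB_ID:-unknown}"
}

task_prolog_msg
```

Such a script would be pointed to by TaskProlog in slurm.conf; because it runs before each task starts, it can greet the job but cannot report final memory usage.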
I don't think writing to the user's shared file system would work for us in this case. The purpose of printing the memory usage is to make users more aware of how much memory they're using.

The documentation seemed a little vague about print not being available in epilog scripts; maybe it could be changed to state this more explicitly? Since I can't use an epilog script, do you know which plugin interface would be best for this situation?
Hi Ian,

One possible solution, if there isn't a standard location you can expect the output files to be written to, is to run 'scontrol show job <jobid>' in the epilog to get the StdOut/StdErr path. This adds some overhead but should let you get the file location you need.

If that's not the way you want to go, then you may want to look at the SPANK plugin architecture. It gives you several contexts in which to run code during the life of a job. You can find more information about this plugin here: https://slurm.schedmd.com/spank.html

I agree that the wording in the documentation could be clearer that print is limited to the task prolog. I've put together a patch to clarify that.

Please let me know if you have any further questions about this.

Thanks,
Ben
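The scontrol suggestion above could be sketched roughly as follows. The parsing of the StdOut= field is an assumption about the 'scontrol show job' output format, so it should be verified against your Slurm version before use:

```shell
#!/bin/bash
# Epilog sketch (an assumption, not a tested solution): query scontrol
# for the job's StdOut path and append a message there. A real epilog
# would also need to handle jobs with no regular output file.

# Pull the StdOut= path out of 'scontrol show job' text on stdin.
get_stdout_path() {
    grep -oE 'StdOut=[^[:space:]]+' | head -n 1 | cut -d= -f2
}

if [ -n "$SLURM_JOB_ID" ]; then
    OUTPUT=$(scontrol show job "$SLURM_JOB_ID" | get_stdout_path)
    if [ -n "$OUTPUT" ] && [ -w "$OUTPUT" ]; then
        {
            echo "------------------"
            echo "Epilog ran: $(date)"
        } >> "$OUTPUT"
    fi
fi
```

The writability check guards against jobs whose StdOut points somewhere the epilog (running as SlurmdUser) cannot write.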
Hi Ian,

The documentation clarification patch I put together was checked in today. You can find details of the commit here: https://github.com/SchedMD/slurm/commit/569167fc51c64ffe54a10a3454eb2bb977133448

Since that has been clarified and I haven't heard any additional questions, I'll go ahead and close this ticket. Feel free to let me know if you do have questions about this.

Thanks,
Ben