Ticket 1530

Summary: Jobs Data Reporting
Product: Slurm Reporter: rl303f
Component: Configuration Assignee: Brian Christiansen <brian>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 3 - Medium Impact    
Priority: --- CC: brian, da, rod, sfellini
Version: 14.11.3   
Hardware: Linux   
OS: Linux   
Site: NIH Alineos Sites: ---

Description rl303f 2015-03-12 05:34:43 MDT
Our transition to Slurm is from PBSPro which has some features
that we find useful and would like to emulate within Slurm.

1)  An invaluable tool for us under PBSPro is their 'tracejob'
command which succinctly summarizes the entire history of a
job.  What is the Slurm equivalent of 'tracejob'?  If it does
not yet exist, are there any plans for development?

2)  At the end of a job, PBS includes information in the email
to the user as shown:

    PBS Job Id: 8735160.cluster
    Job Name:   test_job
    Execution terminated
    Exit_status=271
    resources_used.cpupercent=398
    resources_used.cput=00:00:50
    resources_used.mem=109212kb
    resources_used.ncpus=2
    resources_used.vmem=660060kb
    resources_used.walltime=00:01:31

How do we configure Slurm to include this job information
in the email to user?

3)  Similarly, what are the best practices for configuring Slurm
to include resources-used information in a job's slurm.out file?

Thank you!
Comment 1 Brian Christiansen 2015-03-16 08:36:10 MDT
1. There currently isn't a tool like tracejob. There is an open feature request for similar behavior: Bug 980. To get usage information like that shown at the end of tracejob output, you can use sacct.
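
As an illustration of the sacct suggestion, here is a minimal sketch that queries accounting data for one job and parses it into records. The field list is an assumption chosen to roughly mirror the PBS summary in the description, and the helper names (build_sacct_command, parse_sacct_output, query_job) are hypothetical, not part of Slurm.

```python
import subprocess

# Assumed field selection, picked to resemble the PBS resources_used lines.
FIELDS = ["JobID", "JobName", "State", "ExitCode", "Elapsed",
          "TotalCPU", "MaxRSS", "MaxVMSize", "AllocCPUS"]


def build_sacct_command(job_id):
    """Return the sacct argument list for one job (hypothetical helper)."""
    return ["sacct", "-j", str(job_id),
            "--format=" + ",".join(FIELDS),
            "--parsable2",   # '|'-delimited output without a trailing delimiter
            "--noheader"]    # omit the column header line


def parse_sacct_output(text):
    """Turn '|'-delimited sacct output into one dict per job step."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        rows.append(dict(zip(FIELDS, line.split("|"))))
    return rows


def query_job(job_id):
    """Run sacct and return the parsed rows; needs sacct on PATH."""
    result = subprocess.run(build_sacct_command(job_id),
                            capture_output=True, text=True, check=True)
    return parse_sacct_output(result.stdout)
```

Calling query_job with a job ID on a host with accounting enabled should return one record for the job allocation plus one per step. Memory fields such as MaxRSS are typically reported on the step records rather than on the allocation line.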

2, 3. Currently there isn't a good way to do this. A possible workaround is to create a Prolog that gets the information from sacct and sends its own email. However, it's possible that the information might not be in the database yet when the script runs. I have created a feature request, Bug 1536, for this, as we've discussed solutions to this issue in the past.

Please reopen if you have any more questions.

Thanks,
Brian
Comment 2 Brian Christiansen 2015-03-16 09:07:43 MDT
Err, I meant EpilogSlurmctld, not Prolog.
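
The EpilogSlurmctld workaround might look roughly like the sketch below. It assumes several things not stated in this ticket: that SLURM_JOB_ID and SLURM_JOB_USER are set in the script's environment, that a mail relay accepts messages on localhost, and that a short retry loop is an acceptable way to handle accounting records that have not reached the database yet. The function names, sender address, field list, and retry values are all hypothetical.

```python
import os
import smtplib
import subprocess
import time
from email.message import EmailMessage

# Assumed field selection; adjust to taste.
FIELDS = ["JobID", "JobName", "State", "ExitCode",
          "Elapsed", "TotalCPU", "MaxRSS", "MaxVMSize"]


def fetch_rows(job_id):
    """Query sacct for one job and return one dict per job step."""
    cmd = ["sacct", "-j", str(job_id), "--format=" + ",".join(FIELDS),
           "--parsable2", "--noheader"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [dict(zip(FIELDS, line.split("|")))
            for line in out.splitlines() if line.strip()]


def wait_for_rows(job_id, fetch=fetch_rows, attempts=5, delay=10.0):
    """Retry because the record may not be in the database yet."""
    for attempt in range(attempts):
        rows = fetch(job_id)
        if rows:
            return rows
        if attempt < attempts - 1:
            time.sleep(delay)
    return []


def format_summary(rows):
    """Render a PBS-like summary, skipping fields sacct left empty."""
    lines = []
    for row in rows:
        lines.append("Slurm Job Id: " + row["JobID"])
        for field in FIELDS[1:]:
            if row.get(field):
                lines.append("    {}={}".format(field, row[field]))
    return "\n".join(lines)


def build_message(user, job_id, body):
    """Build the email; local delivery to the user name is an assumption."""
    msg = EmailMessage()
    msg["From"] = "slurm"  # hypothetical sender
    msg["To"] = user
    msg["Subject"] = "Slurm job {} resource usage".format(job_id)
    msg.set_content(body or "No accounting data found for this job.")
    return msg


def main():
    job_id = os.environ["SLURM_JOB_ID"]
    user = os.environ.get("SLURM_JOB_USER", "root")
    body = format_summary(wait_for_rows(job_id))
    with smtplib.SMTP("localhost") as smtp:  # assumed local mail relay
        smtp.send_message(build_message(user, job_id, body))


# Only act when the expected job environment is present.
if __name__ == "__main__" and "SLURM_JOB_ID" in os.environ:
    main()
```

The script would be referenced from the EpilogSlurmctld setting in slurm.conf. The retry count and delay are placeholders and should be kept small.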