| Summary: | Cannot report on pending time in queue | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Chris Holder <christopher.holder> |
| Component: | Accounting | Assignee: | Benjamin Witham <benjamin.witham> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | benjamin.witham |
| Version: | - Unsupported Older Versions | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Baylor College of Medicine Molecular and Human Genetics | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
|
Description
Chris Holder
2023-08-02 10:03:22 MDT
Hello Chris,
The squeue command allows users to build their own squeue output with the -O or --Format option. There are a few different options for times.
> https://slurm.schedmd.com/squeue.html#OPT_Format
I think that this is the one you're looking for, but I have a few other options as well.
PendingTime - Shows the time in seconds that a job has been waiting in the queue. If the job has started, it prints the difference between the start time and submission time of the job.
> https://slurm.schedmd.com/squeue.html#OPT_PendingTime
StartTime - This is the projected time that the job will start if the job is pending, or the time that the job started if the job is running.
> https://slurm.schedmd.com/squeue.html#OPT_StartTime
EndTime - This is the projected time that the job will end. It is derived from the time limits of the submitted jobs.
> https://slurm.schedmd.com/squeue.html#OPT_EndTime
The --start option of squeue may also be a good starting point to look at. This will only print pending jobs and their expected start time.
> https://slurm.schedmd.com/squeue.html#OPT_start
Does this answer your question?

Forgive my ignorance. I have been working on the assumption that squeue would only show "live" jobs and once a job has completed it would be up to the accounting database to give out historical information. I am trying to build an ongoing report for various job statistics (by account) so that I can provide those to my users as well as assess the ongoing trends in the cluster's performance.

Hello Chris,
I believe I misunderstood your original question; I did not realize you were looking to get data on completed jobs. You are correct, squeue is only for jobs that are pending and running, and once completed the information about that job can be retrieved with sacct. Allow me to look there for a moment.

Hello Chris,
I have not found anything that will display the amount of time that a completed job was pending in the queue. The best option is to get the submit and start times from sacct (the Submit and Start format fields) and take the difference between the two. I would suggest using the -p or -P options to help with parsing the data.
Have you looked into sreport as well? There are tools there that could help you create your report.
> https://slurm.schedmd.com/sreport.html
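The Submit/Start subtraction suggested above could look roughly like the following. This is a minimal sketch, not a Slurm tool: it assumes sacct was run with something like `sacct -a -X -n -P --format=JobID,Account,Submit,Start`, that timestamps use the `YYYY-MM-DDTHH:MM:SS` form, and that any Start value that is not a timestamp means the job never started. The helper names `pending_seconds` and `mean_pending_by_account` and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Assumed timestamp layout of the Submit and Start columns.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def pending_seconds(submit, start):
    """Return seconds between submit and start, or None if either is not a timestamp."""
    try:
        submitted = datetime.strptime(submit, TIME_FORMAT)
        started = datetime.strptime(start, TIME_FORMAT)
    except ValueError:
        # Non-timestamp placeholder, e.g. for a job that never started.
        return None
    return int((started - submitted).total_seconds())


def mean_pending_by_account(sacct_output):
    """Average pending time per account from pipe-delimited JobID|Account|Submit|Start rows."""
    waits = defaultdict(list)
    for line in sacct_output.splitlines():
        fields = line.split("|")
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        _job_id, account, submit, start = fields
        wait = pending_seconds(submit, start)
        if wait is not None:
            waits[account].append(wait)
    return {account: sum(values) / len(values) for account, values in waits.items()}


# Made-up rows in the assumed format, not real cluster data.
sample = """\
101|genetics|2023-08-01T10:00:00|2023-08-01T10:05:00
102|genetics|2023-08-01T11:00:00|2023-08-01T11:15:00
103|imaging|2023-08-01T12:00:00|Unknown
"""

print(mean_pending_by_account(sample))  # {'genetics': 600.0}
```

In a real report the sample string would be replaced by the captured output of the sacct command, restricted to the reporting period with -S and -E.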
sreport is the BANE OF MY EXISTENCE!! I can't get that stupid thing to be even remotely useful or consistent. I have no doubt in my mind that I am just doing it wrong, but I have yet to be able to find a tutorial that doesn't read like a doctorate-level engineering thesis. I mean seriously... It's impossible to find documentation that is consumable by normal humans of only slightly above-average intelligence. I think the last time I jumped into sreport it was spitting out reporting based on POSIX groups instead of the actual Slurm account association. Also, when I ran it from the CLI it spit out data but when I added it to a cron job it's nothing but headers with empty data tables. If you have some human-readable documentation or recommendations, I would really appreciate it.

Hello Chris,
I apologize for the late response. I agree that the sreport documentation is confusing. Which parts of it are you finding most difficult? What command were you running in your cron job and how often was your cron job running?

Sorry for the delay. Here’s my crontab entry:
0 0 1 * * /var/opt/Slurm_tools/slurmreportmonth/slurmreportmonth -m
When run from crontab the data tables are empty. When run from a CLI it populates data.
Thanks,
Chris
Hello Chris,
Are you able to send me the exact sreport command that is run in your slurmreportmonth file?

sreport cluster utilization Start=$START End=$END -t percent > $REPORT
sreport -t hourper --tres=cpu,gpu cluster AccountUtilizationByUser Start=$START End=$END format=Accounts,Cluster,Login,Proper%30,TresName,Used tree >> $REPORT
Thanks,
Chris
Hello Chris,
Where are you getting your $START and $END times from? The only way I'm able to (somewhat) reproduce the tables with no bodies is if the times that I set are bad (as in, they have not happened yet). Is crontab feeding your sreport times for the next month and not the previous one?

Yes, it is. Let me take a look at that.
Thanks,
Chris
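The empty tables are consistent with a report window that lies in the future. For a job that runs on the first of the month, the previous calendar month could be computed as in the sketch below. This is only an illustration under assumptions: `previous_month_range` is a hypothetical helper, not part of slurmreportmonth, and it assumes the report should cover the window from the first day of the previous month up to the first day of the current month. The returned `YYYY-MM-DD` strings are one of the date forms sreport accepts for Start= and End=.

```python
from datetime import date, timedelta


def previous_month_range(today):
    """Return (start, end) ISO dates spanning the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st always lands in the previous month.
    last_of_previous = first_of_this_month - timedelta(days=1)
    first_of_previous = last_of_previous.replace(day=1)
    return first_of_previous.isoformat(), first_of_this_month.isoformat()


# Example: the cron job fires at midnight on 1 September 2023.
start, end = previous_month_range(date(2023, 9, 1))
print(f"Start={start} End={end}")  # Start=2023-08-01 End=2023-09-01
```

Passing `date.today()` instead of a fixed date gives the window for whatever day the job actually runs, including across a year boundary.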
Hey Chris,
Are you still having trouble with your crontab job?

Hello Chris,
I haven't heard from you, so I'll close this ticket now. If you're still having trouble with sreport, feel free to reopen this ticket. |