| Summary: | sacct gives wrong times using accounting_storage/filetxt | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Ulf Markwardt <Ulf.markwardt> |
| Component: | Accounting | Assignee: | Danny Auble <da> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 2.6.x | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Universitat Dresden (Germany) | Slinky Site: | --- |
| Attachments: | slurm.conf; git_5.sacct (sacct output); /tmp/slurmgit_JobCompLoc.log | ||
Created attachment 343
/home/mark/slurmgit/bin/sacct -X --format "JobName,JobID,Submit,Start, End,State" -S 2013-07-17T08:00:00 --name git_5 > git_5.sacct

Created attachment 344
/tmp/slurmgit_JobCompLoc.log
I'll check this out later. As mentioned before, I am unaware of anyone using this plugin in production, but will fix the bug :). You will probably find many more issues, as the code hasn't really been touched in many years. Your site has been in the drop down for a while, I just changed it now.

Of course, you are free to say you don't support filetxt any more. But until then I feel it provides a good testing environment.

Thank you
Ulf

Thanks for the option, I agree it is slightly simpler to set up than the database. But perhaps we should consider taking it away, as it typically doesn't represent a real production system. We are deprecating the postgres plugin in the next version as well for similar reasons.

Hm... my confidence in the correctness of sacct dropped a little when it gave these outputs with unknown origin. For this reason: please call the plugin deprecated (officially) or support it. To relax the urgency: I will not need the fix for the next 3 weeks.

Thank you
Ulf

I can understand. I am proposing we just deprecate the plugin. I would change your statement to "my confidence in the correctness of sacct with the filetxt plugin dropped a little..." :). I would be surprised if these issues were happening with a regular slurmdbd/mysql setup.

Ulf, could you attach the filetxt file for this (/tmp/slurmgit_AccountingStorageLoc)? I should have asked for it before, but having that will give me the ability to reproduce. I really only need the lines for one of the jobs in question, like jobid 500 for instance.

According to your slurm.conf the file is AccountingStorageLoc=/tmp/slurmgit_AccountingStorageLoc but the file you sent did display the situation.

I was able to reproduce and fix the problem. The fix is in commit 9eba4384fe0fad228fd570207f63e18d043880fc and will be in 2.6.1. Let me know if you have any more issues.

Thank you!
Created attachment 342
slurm.conf

Dear Slurm support,

with this morning's checkout I have run the next large scenario. With this I detected mysterious discrepancies between the JobCompLoc file and the output of sacct. For example:

    taurusi1035 /tmp grep "JobId=4623" /tmp/slurmgit_JobCompLoc.log
    JobId=4623 UserId=mark(19423) GroupId=swtest(50147) Name=git_5 JobState=COMPLETED Partition=all TimeLimit=10 StartTime=2013-07-17T17:40:41 EndTime=2013-07-17T17:43:04 NodeList=taurusi1151 NodeCnt=1 ProcCnt=1 WorkDir=/scratch/mark/44

and

    /home/mark/slurmgit/bin/sacct -X --format "JobName,JobID,Submit,Start, End,State" -S 2013-07-17T08:00:00 |grep " 4623"
    git_5 4632 2013-07-17T17:39:00 2013-07-17T17:38:19 2013-07-17T17:40:41 COMPLETED

(The clock difference, tested with "clush -bw taurusi[3001-3180],taurusi[1001-1270] date", is at most one second.)

---

My major problem with the sacct output is that the submit time is AFTER the start time. Please fix this bug.

Thank you
Ulf

---

After how many bug reports will our site be listed in the drop down field :-)
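
The ordering complaint in the report (Submit later than Start) can be checked mechanically. The following is a minimal Python sketch, not part of Slurm: the helper names `parse_sacct_line`, `parse_jobcomp_line` and `ordering_problems` are hypothetical, and the two sample strings are copied verbatim from the report. It assumes the sacct row is whitespace-separated in the column order JobName, JobID, Submit, Start, End, State, and that no JobComp value contains spaces.

```python
from datetime import datetime

# Sample lines copied verbatim from the report.
SACCT_LINE = ("git_5 4632 2013-07-17T17:39:00 2013-07-17T17:38:19 "
              "2013-07-17T17:40:41 COMPLETED")
JOBCOMP_LINE = ("JobId=4623 UserId=mark(19423) GroupId=swtest(50147) Name=git_5 "
                "JobState=COMPLETED Partition=all TimeLimit=10 "
                "StartTime=2013-07-17T17:40:41 EndTime=2013-07-17T17:43:04 "
                "NodeList=taurusi1151 NodeCnt=1 ProcCnt=1 WorkDir=/scratch/mark/44")

FMT = "%Y-%m-%dT%H:%M:%S"  # timestamp layout used in both outputs


def parse_sacct_line(line):
    """Parse one sacct row (hypothetical helper).

    Assumes the column order JobName,JobID,Submit,Start,End,State
    and that no field contains whitespace.
    """
    name, jobid, submit, start, end, state = line.split()
    return {
        "JobName": name,
        "JobID": jobid,
        "Submit": datetime.strptime(submit, FMT),
        "Start": datetime.strptime(start, FMT),
        "End": datetime.strptime(end, FMT),
        "State": state,
    }


def parse_jobcomp_line(line):
    """Turn the Key=Value tokens of a JobComp record into a dict.

    Assumes values contain no spaces (true for the sample record).
    """
    return dict(token.split("=", 1) for token in line.split())


def ordering_problems(rec):
    """Return a list of violations of Submit <= Start <= End."""
    problems = []
    if rec["Submit"] > rec["Start"]:
        problems.append("Submit is after Start")
    if rec["Start"] > rec["End"]:
        problems.append("Start is after End")
    return problems


if __name__ == "__main__":
    sacct_rec = parse_sacct_line(SACCT_LINE)
    jobcomp_rec = parse_jobcomp_line(JOBCOMP_LINE)
    print(ordering_problems(sacct_rec))
    # Compare what sacct reports as End with the JobComp StartTime.
    print(sacct_rec["End"].strftime(FMT), jobcomp_rec["StartTime"])
```

On the sample data the check flags the Submit/Start inversion. It also shows that the End value printed by sacct is identical to the StartTime in the JobComp record, which may help narrow down where the times go wrong; the sketch itself does not establish the cause.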