| Summary: | hanging comma in the sacct joblist causes slurmdbd to crash | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Jeff Tan <jeffetan> |
| Component: | slurmdbd | Assignee: | Danny Auble <da> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | brian, da |
| Version: | 14.11.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | VLSCI | Alineos Sites: | --- |
| Version Fixed: | 14.11.10 15.08.2 16.05.0-pre1 | Target Release: | --- |
Description
Jeff Tan
2015-10-04 14:35:59 MDT

I cannot reproduce the core dump in either 14.03 or 14.11.8. Perhaps the memory issue is somewhere else. Could you append your slurmdbd.conf? 14.11.9 does indeed print all jobs, but this problem appears to be fixed in 15.08.

David

My guess is that you have many jobs in your system. You might want to consider using the purging/archiving functionality of the DBD: http://slurm.schedmd.com/slurmdbd.conf.html

I can reproduce the issue of all jobs being returned. I made a commit, 2646e7615885ad4, that will fix scenarios like `sacct -X -j 132423,` (note the trailing comma). You will need to upgrade to 15.08 for `sacct -X -j,` (a comma with no job ID) to be fixed, though. The real fix has to be made in sacct, so any older version of the code will have this anomaly.

FYI, in 15.08 `sacct -X -j 132423,` will be rejected with

sacct: fatal: Bad job/step specified

We can probably change that to reject only the empty job list, which would probably be better. I'll see if I can alter that in 15.08.

Thanks, David and Danny. I'm guessing David is unable to replicate this behavior because our database has never been purged. I hesitate to open up the job tables via mysql directly these days, but these jobs go way back.

We'll give commit 2646e7615885ad4 a go and perhaps craft something extra for the empty joblist with just the comma given. An upgrade to 15.08 is probably not happening for us until January.

Thanks again!

Regards
Jeff

Author: Danny Auble <da@schedmd.com>
Date: Mon Oct 5 16:50:43 2015 -0700

Fix sacct to not return all jobs if the -j option is given with a trailing ','.

David

I would still like to look at this more from the sacct side.

Ah ok, you mean fix the syntax on the sacct side.

David

This is now fixed in commit 2dcc2732c1bca for 15.08. I also added commit d5979ef68c24 to 14.11, which will fix sacct in 14.11 if you are interested in it there. It will be in 14.11.10 if that ever gets tagged.
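
For readers following the thread, the behavior discussed above (ignore a trailing comma, reject a job list that is empty once the commas are removed) can be sketched roughly as follows. This is only an illustration in Python, not the code from the commits mentioned; the helper name `parse_job_list` is hypothetical, and the error text is borrowed from the 15.08 message quoted in the thread.

```python
from typing import List


def parse_job_list(spec: str) -> List[str]:
    """Split a comma-separated job list, dropping empty entries.

    Hypothetical helper for illustration only; it is not part of Slurm
    and does not reproduce sacct's real parsing code.
    """
    # A trailing comma produces an empty token after the split; discard it
    # so that "132423," selects only job 132423 rather than every job.
    jobs = [tok.strip() for tok in spec.split(",")]
    jobs = [tok for tok in jobs if tok]

    # A spec such as "," leaves nothing behind. Treat that as an error
    # instead of falling through to an unfiltered "all jobs" query.
    if not jobs:
        raise ValueError("Bad job/step specified")
    return jobs


if __name__ == "__main__":
    print(parse_job_list("132423,"))
    try:
        parse_job_list(",")
    except ValueError as err:
        print("rejected:", err)
```

The design point is that an empty filter must never be interpreted as "no filter", since on a database that has never been purged an unfiltered query returns the entire job history.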