Ticket 1659

Summary: sacctmgr load fail
Product: Slurm
Reporter: Yann <yann.sagon>
Component: Accounting
Assignee: David Bigagli <david>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
CC: brian, da
Version: 14.11.6
Hardware: Linux
OS: Linux
Site: Université de Genève
Attachments: purge file not working

Description Yann 2015-05-09 04:20:20 MDT
Dear team,

I'm trying to reload some job data that was available through sacct before the automatic purge.

The files are there:

ls -la /var/lib/slurmdbd/

-rw-------   1 slurm slurm   149272 Jun  1  2013 baobab_step_archive_2013-04-01T00:00:00_2013-04-30T23:59:59
-rw-------   1 slurm slurm   226501 Jul  1  2013 baobab_step_archive_2013-05-01T00:00:00_2013-05-31T23:59:59
-rw-------   1 slurm slurm 11139245 Aug  1  2013 baobab_step_archive_2013-06-01T00:00:00_2013-06-30T23:59:59
[...]

I'm loading the file like this:

$ sacctmgr load baobab_step_archive_2013-04-01T00:00:00_2013-04-30T23:59:59
 Nothing after object name '2'. line(10)
 Problem with requests: Unspecified error

The error differs from file to file, but loading fails for all of them.
Comment 1 David Bigagli 2015-05-11 08:52:01 MDT
Hi,
   which version of Slurm were you running when these files were produced? Could you send us one of them?

David
Comment 2 Yann 2015-05-12 02:17:15 MDT
Created attachment 1875
purge file not working
Comment 3 Yann 2015-05-12 02:19:31 MDT
We were using various Slurm versions (over the period from 2013 to 2015).
Comment 4 David Bigagli 2015-05-12 06:18:37 MDT
Hi,
   the command 'sacctmgr load <file>' loads a configuration file. The command
you are looking for is 'sacctmgr archive load'. The documentation does not make this clear, so we are going to update it.

David
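
For readers following the thread, the distinction above can be sketched as a dry run. This is only a sketch under stated assumptions: the helper name print_archive_load_cmds is hypothetical, and the file= option is taken from the sacctmgr man page rather than from this ticket. The function prints the commands for review and does not run them.

```shell
# Hypothetical helper: print (do not run) one "sacctmgr archive load"
# command per archive file found in the given directory.
# Usage: print_archive_load_cmds /var/lib/slurmdbd
print_archive_load_cmds() {
    dir=$1
    for f in "$dir"/*_archive_*; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        # Assumption: the documented file=<path> form of "archive load".
        printf 'sacctmgr archive load file=%s\n' "$f"
    done
}
```

Printing first makes it easy to check which files would be loaded before touching the accounting database. Whether job and step archives need to be loaded in a particular order is not covered in this ticket.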
Comment 5 Yann 2015-05-12 19:03:51 MDT
Thanks for that information.

I tried one or two files at random:

sacctmgr archive load baobab_job_archive_2013-11-01T00:00:00_2013-11-30T23:59:59
sacctmgr: error: Error with request.
 Problem loading archive file: Unspecified error
Comment 6 Yann 2015-05-12 19:15:25 MDT
After looking at slurmdbd.log, the cause is obvious.

[2015-05-13T09:05:09.144] error: mysql_query failed: 1153 Got a packet bigger than 'max_allowed_packet' bytes

I just set in my.cnf:

max_allowed_packet = 1G

and it seems to be working fine!

The "Unspecified error" message could probably be improved in this case, since the actual error is caught by slurmdbd.

Thanks
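
Since sacctmgr only reports "Unspecified error" here, the underlying cause has to be read from the slurmdbd log. A minimal sketch of that check follows. The helper name last_dbd_mysql_errors is hypothetical, and the log path is site-specific (it is whatever LogFile points to in slurmdbd.conf), so it is passed as an argument.

```shell
# Hypothetical helper: show the most recent MySQL query failures
# recorded by slurmdbd.
# $1: path to the slurmdbd log (site-specific assumption)
# $2: number of matching lines to show (default 5)
last_dbd_mysql_errors() {
    log=$1
    count=${2:-5}
    # Match the fixed string seen in the log excerpt above.
    grep -F 'error: mysql_query failed' "$log" | tail -n "$count"
}
```

If the output shows error 1153 as above, the max_allowed_packet setting normally goes in the [mysqld] section of my.cnf and takes effect once the database server is restarted. The active value can be checked from a MySQL client with SHOW VARIABLES LIKE 'max_allowed_packet'.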
Comment 7 David Bigagli 2015-05-13 07:01:25 MDT
We can look at improving the message in the next release, 15.08, since it
requires a change in the protocol between slurmdbd and sacctmgr.

David