Ticket 1659 - sacctmgr load fail
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 14.11.6
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: David Bigagli
Reported: 2015-05-09 04:20 MDT by Yann
Modified: 2015-05-13 07:01 MDT
Site: Université de Genève

Attachments
purge file not working (96.42 KB, application/octet-stream)
2015-05-12 02:17 MDT, Yann

Description Yann 2015-05-09 04:20:20 MDT
Dear team,

I'm trying to reload some job data that was available through sacct before the automatic purge.

The files are there:

ls -la /var/lib/slurmdbd/

-rw-------   1 slurm slurm   149272 Jun  1  2013 baobab_step_archive_2013-04-01T00:00:00_2013-04-30T23:59:59
-rw-------   1 slurm slurm   226501 Jul  1  2013 baobab_step_archive_2013-05-01T00:00:00_2013-05-31T23:59:59
-rw-------   1 slurm slurm 11139245 Aug  1  2013 baobab_step_archive_2013-06-01T00:00:00_2013-06-30T23:59:59
[...]

I'm loading a file like this:

$ sacctmgr load baobab_step_archive_2013-04-01T00:00:00_2013-04-30T23:59:59
 Nothing after object name '2'. line(10)
 Problem with requests: Unspecified error

The error is different for every file, but the load fails for all of them.
Comment 1 David Bigagli 2015-05-11 08:52:01 MDT
Hi,
   which version of Slurm were you running when these files were produced? Could you send us one of them?

David
Comment 2 Yann 2015-05-12 02:17:15 MDT
Created attachment 1875 [details]
purge file not working
Comment 3 Yann 2015-05-12 02:19:31 MDT
We were using various SLURM versions (ranging from 2013 to 2015).
Comment 4 David Bigagli 2015-05-12 06:18:37 MDT
Hi,
   the command 'sacctmgr load <file>' loads a configuration file. The command
you are looking for is 'sacctmgr archive load'. This is indeed not very clear from the documentation, so we are going to update it.

David
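
The distinction above can be sketched as a small dry-run script that prints the 'sacctmgr archive load' command for each archive file in a directory. This is only an illustration: the function name load_archives and the variables ARCHIVE_DIR and DRY_RUN are hypothetical names introduced here, and the directory default is taken from the listing in the description. The file=<path> form is the one given in the sacctmgr man page.

```shell
#!/bin/sh
# Dry-run sketch: show the `sacctmgr archive load` command for each archive file.
# load_archives, ARCHIVE_DIR and DRY_RUN are hypothetical names for this example.

load_archives() {
    dir=$1
    for f in "$dir"/*_archive_*; do
        [ -f "$f" ] || continue          # glob matched nothing, or not a regular file
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Default: only print what would be run
            echo "sacctmgr archive load file=$f"
        else
            # Report failures but keep going; details end up in slurmdbd.log
            sacctmgr archive load file="$f" || echo "load failed: $f" >&2
        fi
    done
}

load_archives "${ARCHIVE_DIR:-/var/lib/slurmdbd}"
```

Running it with DRY_RUN=0 would actually execute the loads, so it is probably worth reviewing the printed commands first.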
Comment 5 Yann 2015-05-12 19:03:51 MDT
Thanks for that information.

I have tried one or two files at random:

sacctmgr archive load baobab_job_archive_2013-11-01T00:00:00_2013-11-30T23:59:59
sacctmgr: error: Error with request.
 Problem loading archive file: Unspecified error
Comment 6 Yann 2015-05-12 19:15:25 MDT
After having a look at slurmdbd.log, the cause is obvious:

[2015-05-13T09:05:09.144] error: mysql_query failed: 1153 Got a packet bigger than 'max_allowed_packet' bytes

I just set in my.cnf:

max_allowed_packet = 1G

and it seems to be working fine!

The "Unspecified error" message could probably be improved in this case, since the error is caught by slurmdbd.

Thanks
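
The diagnosis above can be sketched as a quick check of the slurmdbd log for MySQL error 1153. This is a minimal sketch under assumptions: the function name has_packet_error, the SLURMDBD_LOG variable, and the default log path are hypothetical and will differ per site. The max_allowed_packet setting is a server variable, so it would typically go under the [mysqld] section of my.cnf, and the database server needs a restart for a my.cnf change to take effect.

```shell
#!/bin/sh
# Sketch: detect the 'max_allowed_packet' failure in a slurmdbd log.
# has_packet_error and SLURMDBD_LOG are hypothetical names; the log path is an assumption.

# has_packet_error LOGFILE: succeeds if the log contains MySQL error 1153
has_packet_error() {
    grep -q "mysql_query failed: 1153" "$1"
}

LOG="${SLURMDBD_LOG:-/var/log/slurm/slurmdbd.log}"

if [ -r "$LOG" ] && has_packet_error "$LOG"; then
    echo "slurmdbd hit MySQL error 1153: raise max_allowed_packet in my.cnf"
    # To see the value the server is currently using (needs DB credentials):
    #   mysql -e "SHOW VARIABLES LIKE 'max_allowed_packet';"
fi
```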
Comment 7 David Bigagli 2015-05-13 07:01:25 MDT
We can look at improving the message in the next release, 15.08, as it needs
a change in the protocol between slurmdbd and sacctmgr.

David