I was upgrading slurmdbd to 17.11.4, and when I tried to start slurmdbd I got the following errors:

[2018-03-12T11:03:39.759] error: mysql_query failed: 1206 The total number of locks exceeds the lock table size
update "brc_job_table" as job left outer join ( select job_db_inx, SUM(consumed_energy) 'sum_energy' from "brc_step_table" where id_step >= 0 and consumed_energy != 18446744073709551614 group by job_db_inx ) step on job.job_db_inx=step.job_db_inx set job.tres_alloc=concat(job.tres_alloc, concat( ',3=', case when step.sum_energy then step.sum_energy else 18446744073709551614 END)) where job.tres_alloc != '' && job.tres_alloc not like '%,3=%';
[2018-03-12T11:03:39.759] error: Can't convert brc_job_table info: Unknown error 1206
[2018-03-12T11:03:39.759] error: issue converting tables after create
[2018-03-12T11:03:39.759] Accounting storage MYSQL plugin failed
[2018-03-12T11:03:39.768] error: mysql_query failed: 1062 Duplicate entry '5' for key 'PRIMARY'
update tres_table set id=5 where id=1001;
[2018-03-12T11:03:39.816] error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
[2018-03-12T11:03:39.819] error: cannot create accounting_storage context for accounting_storage/mysql
[2018-03-12T11:03:39.819] fatal: Unable to initialize accounting_storage/mysql accounting storage plugin

I restarted the process; it has been running since 11:37 this morning and is still running.

[2018-03-12T11:37:27.102] Warning: Note very large processing time from make table current "brc_job_table": usec=595063992 began=11:27:32.038
[2018-03-12T11:37:27.506] adding column pack_job_id after id_group in table "master_job_table"
[2018-03-12T11:37:27.506] adding column pack_job_offset after pack_job_id in table "master_job_table"
[2018-03-12T11:37:27.506] adding column mcs_label after kill_requid in table "master_job_table"
[2018-03-12T11:37:27.506] adding column work_dir after wckey in table "master_job_table"
[2018-03-12T11:37:27.506] adding key old_tuple (id_job, id_assoc, time_submit) to table "master_job_table"
[2018-03-12T11:37:27.507] adding key pack_job (pack_job_id) to table "master_job_table"
[2018-03-12T11:37:27.732] adding column pack_job_id after id_group in table "master.brc_job_table"
[2018-03-12T11:37:27.732] adding column pack_job_offset after pack_job_id in table "master.brc_job_table"
[2018-03-12T11:37:27.732] adding column mcs_label after kill_requid in table "master.brc_job_table"
[2018-03-12T11:37:27.732] adding column work_dir after wckey in table "master.brc_job_table"
[2018-03-12T11:37:27.732] adding key old_tuple (id_job, id_assoc, time_submit) to table "master.brc_job_table"
[2018-03-12T11:37:27.732] adding key pack_job (pack_job_id) to table "master.brc_job_table"
[2018-03-12T11:37:27.948] converting step table for 0-a0-d1-ec-bc-c
[2018-03-12T11:37:27.948] converting job table for 0-a0-d1-ec-bc-c
[2018-03-12T11:37:28.020] converting resv table for 0-a0-d1-ec-bc-c
[2018-03-12T11:37:28.020] converting cluster tables for 0-a0-d1-ec-bc-c
[2018-03-12T11:37:28.020] converting assoc table for 0-a0-d1-ec-bc-c
[2018-03-12T11:37:28.020] converting step table for brc
[2018-03-12T11:37:45.666] converting job table for brc

It is now 20:29, more than 8 hours since it started running. I mistakenly did not run it with the -D option, so I can't tell what is happening. Do you think there is an issue with the upgrade or the database conversion?
Or is this normal? Thanks Jackie
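
For readers triaging a similar log, here is a minimal Python sketch that pulls the MySQL error numbers out of slurmdbd log text and converts the usec figure from the warning into minutes. The function names and regular expressions are illustrative assumptions based only on the log lines quoted above; this is not part of Slurm.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp that starts each log entry quoted above.
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})\]")
# Matches the MySQL error number reported after "mysql_query failed:".
MYSQL_ERROR = re.compile(r"mysql_query failed: (\d+) ")


def mysql_error_codes(log_text):
    """Return the MySQL error numbers found in the log text, in order."""
    return [int(code) for code in MYSQL_ERROR.findall(log_text)]


def log_span_seconds(log_text):
    """Return the seconds between the first and last timestamp in the text."""
    stamps = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f")
              for s in TIMESTAMP.findall(log_text)]
    if len(stamps) < 2:
        return 0.0
    return (stamps[-1] - stamps[0]).total_seconds()


# Two entries copied from the report, shortened to the part that matters.
sample = (
    "[2018-03-12T11:03:39.759] error: mysql_query failed: 1206 The total "
    "number of locks exceeds the lock table size\n"
    "[2018-03-12T11:03:39.768] error: mysql_query failed: 1062 Duplicate "
    "entry '5' for key 'PRIMARY'\n"
)

print(mysql_error_codes(sample))  # [1206, 1062]

# The "very large processing time" warning reports microseconds.
usec = 595063992
print(round(usec / 1_000_000 / 60, 1), "minutes")
```

The usec value works out to roughly ten minutes, which matches the gap between began=11:27:32.038 and the 11:37:27.102 timestamp on the warning line.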
Hey Jackie. We’ve traced this to the version of MySQL. See Bug 4877. Can you confirm that you are using MySQL 5.1?
We’re running MySQL version 5.1. Thanks, Jackie Scoggins
Do you have a backup of the database from before the upgrade? If so, I would consider upgrading MySQL, restoring the backup, and restarting SlurmDBD, as they did in Bug 4877 Comment 16. FYI, I still have a 5.1 upgrade running from last Friday.
Yes, we do have a backup from this morning. I don't know if we can upgrade MySQL, since it is not used only for Slurm. I will have to confer with my team to see whether we can upgrade it. What version of MySQL do you suggest?
OK. I have MySQL 5.7, and Wyoming said 5.7 worked for them, so I think that's a safe choice. Let us know what you decide in the morning.
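
For anyone checking a server before a similar upgrade, a minimal Python sketch of the version comparison follows. It assumes the version string comes from something like `SELECT VERSION();`, and it uses 5.7 as the threshold only because that is the version reported to work in this thread, not because it is an official minimum. The function names and sample version strings are hypothetical.

```python
import re

# Assumption: 5.7 is the threshold only because it is the version reported
# to work in this thread; it is not an official minimum requirement.
KNOWN_GOOD = (5, 7)


def parse_mysql_version(version_string):
    """Turn a string such as '5.1.73-log' into a (major, minor) tuple."""
    match = re.match(r"(\d+)\.(\d+)", version_string.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least_known_good(version_string, minimum=KNOWN_GOOD):
    """Compare a MySQL version string against the (major, minor) threshold.

    Only meaningful for MySQL version strings; other servers number their
    releases differently.
    """
    return parse_mysql_version(version_string) >= minimum


# Hypothetical version strings, for illustration only.
for candidate in ("5.1.73", "5.7.21"):
    print(candidate, is_at_least_known_good(candidate))
```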
We’re not running SL7, and 5.7 isn’t part of the SL6 repos, so we’re looking into building it or getting a build for SL6. Thanks, Jackie Scoggins
The MySQL upgrade worked. One question: are there any concerns with using TRES when upgrading from 16.x to 17.11.4? Just checking whether the following is an issue:

[2018-03-12T23:12:01.055] error: _handle_qos_tres_run_secs: job 2176025: QOS acrb_gpu2_normal TRES billing grp_used_tres_run_secs underflow, tried to remove 600 seconds when only 0 remained.
We got MySQL 5.7 installed and restored the database from the dump taken that morning before the upgrade, and we’re up and running now. Thanks, Jackie Scoggins
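
The thread does not say how the dump was taken or restored. As a rough sketch under stated assumptions, the Python below only builds and prints the kind of mysqldump and mysql command lines such a dump and restore might use; it does not run them. The database name, dump path, and the assumption that credentials come from a client option file are all placeholders, not details from this ticket.

```python
import shlex

# Assumptions (not from the thread): the database name and dump path are
# placeholders, and credentials are expected to come from an option file.
DATABASE = "slurm_acct_db"
DUMP_PATH = "/tmp/slurm_acct_db.sql"  # hypothetical location


def dump_command(database=DATABASE):
    """Build a mysqldump invocation; its stdout would go to the dump file."""
    # --databases makes the dump include the CREATE DATABASE and USE
    # statements, so the restore does not need to name the database.
    return ["mysqldump", "--single-transaction", "--databases", database]


def restore_command():
    """Build the mysql client invocation; the dump file is fed on stdin."""
    return ["mysql"]


def describe(argv, redirect):
    """Render an argv list plus a shell redirection as one printable line."""
    return f"{shlex.join(argv)} {redirect}"


# Print the commands for review instead of executing them.
print(describe(dump_command(), f"> {DUMP_PATH}"))
print(describe(restore_command(), f"< {DUMP_PATH}"))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere. slurmdbd would need to be stopped before either step on a real system.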
Glad to hear that the upgrade was successful. A fix for the TRES issue you saw is slated for 17.11.5. Let us know if you have any questions. Thanks, Brian