Created attachment 6303 [details] slurmdbd.log

We'd been apprehensive about upgrading the database given the conversations on the slurm-users mailing list, but we believed our database was fairly small, and slurmdbd runs on a real machine with plenty of hardware cycles to spare, plenty of memory, etc. However, we started at 1:45 PM MST yesterday and, as of 8:00 AM MST the next morning, the database upgrade is still running. Output of the log is attached. The SQL process that seems to be stuck is as follows:

mysql> show full processlist;

Id: 443756   User: slurmacct   Host: mmmmgt1.cluster:39050   db: slurm_acct_db
Command: Query   Time: 65729   State: Sending data
Info:
update "mtmoran_job_table" as job
  left outer join (
    select job_db_inx, SUM(consumed_energy) 'sum_energy'
    from "mtmoran_step_table"
    where id_step >= 0 and consumed_energy != 18446744073709551614
    group by job_db_inx
  ) step on job.job_db_inx = step.job_db_inx
set job.tres_alloc = concat(job.tres_alloc,
      concat(',3=', case when step.sum_energy then step.sum_energy
                         else 18446744073709551614 END))
where job.tres_alloc != '' && job.tres_alloc not like '%,3=%'

Id: 444263   User: root   Host: localhost   db: NULL
Command: Query   Time: 0   State: NULL
Info: show full processlist
Hey Jared,

Our recommendation is to let it keep going for now. We've seen that the rollback will take a long time if you stop it. That SQL statement is expected. A few other questions:

1. How many job records are in the database?
2. How many step records are in the database?
3. What type of hard drive are you using?
4. What type of CPU do you have?
5. Does top show that mysqld is working?

Thanks,
Brian
We have 1.4M job records and 2.2M step records, on a dedicated HDD mirror, with dual-socket 2.6GHz Xeon Sandy Bridge processors and 64 GB of memory. top shows mysqld running at about 99% of a single CPU. The goal here is to pre-migrate (upgrade) the database before moving it to new hardware.
Just FYI: "As a reference point, one of our databases with 3M jobs on an nvme took 12 minutes to finish the upgrade (Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz). The exact same database on different hardware took around 45 minutes, so hardware matters a lot. Another reference: a customer with 5M records took about 2 hours." (Bug 4846)

Also, what version of mysql/mariadb are you using? It's good to see that mysqld is busy.
We are using mysql-5.1.73-8.el6_8.x86_64. We are in the process of moving to MariaDB on RHEL 7.4 in the next week or so. I'm assuming that the preferred version is the one delivered by Red Hat?
brian@lappy:~$ docker run centos:7 yum -q info mariadb | grep Version
Version     : 5.5.56

That version is fine. In reference to the 3M-records-in-12-minutes instance: that system is using MariaDB 10.2. It's unfounded, but we wonder whether that contributed to the speediness of the upgrade as well.

The best thing to do is to wait for the dbd to finish the upgrade. Let us know when it finishes. The one concern with the upgrade taking a long time is the records intended for the dbd building up. Do you see either of these messages in the slurmctld logs?

error("slurmdbd: agent queue filling (%d), RESTART SLURMDBD NOW", cnt);
error("slurmdbd: agent queue is full (%u), discarding %s:%u request", cnt,

The slurmctld will hold on to the records until the slurmdbd is back up. It will hold up to

MAX(10000, ((MaxJobCount * 2) + (node_record_count * 4)))

records before it starts discarding them. When the queue is more than half full, the "RESTART SLURMDBD NOW" message will appear. If needed, MaxJobCount could be increased to give it more buffer.
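The queue-limit formula above can be sketched numerically; the MaxJobCount and node-count values below are hypothetical, chosen only to illustrate the arithmetic:

```python
def dbd_agent_queue_limit(max_job_count, node_record_count):
    """Records slurmctld will buffer while slurmdbd is down, per
    MAX(10000, (MaxJobCount * 2) + (node_record_count * 4))."""
    return max(10000, (max_job_count * 2) + (node_record_count * 4))

# Hypothetical cluster: MaxJobCount=10000, 500 nodes.
limit = dbd_agent_queue_limit(10000, 500)
print(limit)       # 22000 records buffered before discarding begins
print(limit // 2)  # past this point the "RESTART SLURMDBD NOW" warning appears
```

Note the 10000 floor: on a small cluster with a small MaxJobCount, the limit never drops below 10000 records.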
I don't see any of those messages in the logs, but the upgrade is still running as the same SQL process.
Good to hear that you aren't seeing those messages. Do you happen to have a backup of the database? Would you be willing to send it to me so I can try it locally? I can make the bug private or you can email it to me directly if you like. I created a database yesterday with 1 million job records with 2 steps each and the conversion took 96 seconds. I'm attempting the conversion on a couple of other machines too.
Here are some results from using your database (mysql 5.7.21):

CPU: i7-7700HQ @ 2.80GHz   Disk: nvme   Time: 195518083 usecs
CPU: i7-2600K @ 3.40GHz    Disk: ssd    Time: 402331241 usecs
CPU: i7-2600K @ 3.40GHz    Disk: hdd (database symlinked to filesystem on hdd)   Time: 402331241 usecs

I uploaded (drive) the converted database if you want to look at it. Maybe the mirror is causing it to be slow?
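For easier comparison, the two distinct microsecond timings reported above convert to minutes with simple arithmetic:

```python
# Convert the reported upgrade times from microseconds to minutes.
for usecs in (195518083, 402331241):
    minutes = usecs / 60_000_000  # 60 million microseconds per minute
    print(f"{usecs} usecs = {minutes:.1f} minutes")
# The nvme run is roughly 3.3 minutes; the ssd/hdd runs roughly 6.7 minutes.
```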
This is still going. Is there a parameter within slurm.conf that is adversely affecting this? The mailing list referred to AccountingStorageTRES, which in our environment is set to cpu,mem,energy,node (the default). We don't really care about energy consumption yet, but if that's the element in question, then should it be removed from the default if an update takes this long?
You actually can't remove the energy TRES -- it's on no matter what (Billing, CPU, Energy, Memory, and Node are always tracked).

Some background: prior to 17.11, the slurmdbd rollups and sacct would get the energy from the steps with a join on the job and step tables, but we found this to be pretty expensive on large tables. In 17.11, the energy is summed up at the end of the job and recorded on the job record, so the database no longer needs to look at the steps for energy. It's a big gain in performance after the upgrade, but the upgrade is being troublesome for a few systems.

I'm still investigating what's causing the slowness. My next thing to try is using MySQL 5.1. If you're willing, you can use the upgraded database that I uploaded -- I just uploaded a new one (wyoming2.sql.gz), as the first one might have done some purging. Steps:

1. Stop the dbd.
2. Kill the SQL query (ID = 443756?).
3. Stop the mysqld -- to stop the rollback.
4. Drop the current database (or create a new one next to it).
5. Create the database.
6. Import the database.
7. Start the slurmdbd.

Let me know if you do this or not.
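Sketched as shell commands, the steps above might look like the following. This is a sketch, not a tested procedure: the systemd unit names, the mysqld restart between steps 3 and 4, and the zcat import of the uploaded dump are assumptions -- adjust them for your site:

```shell
DB=slurm_acct_db
DUMP=wyoming2.sql.gz

systemctl stop slurmdbd                    # 1. stop the dbd
mysql -e 'KILL 443756;'                    # 2. kill the stuck query (use your ID)
systemctl stop mysqld                      # 3. stop mysqld to stop the rollback
systemctl start mysqld                     #    (assumed: restart before re-importing)
mysql -e "DROP DATABASE IF EXISTS $DB;"    # 4. drop the current database
mysql -e "CREATE DATABASE $DB;"            # 5. recreate it
zcat "$DUMP" | mysql "$DB"                 # 6. import the uploaded dump
systemctl start slurmdbd                   # 7. start slurmdbd (it re-runs the upgrade)
```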
Thanks,
Brian

Here's some output from my import of your database:

Before upgrade:

mysql> select count(*) from mtmoran_job_table;
+----------+
| count(*) |
+----------+
|  1417646 |
+----------+
1 row in set (0.21 sec)

mysql> select count(*) from mtmoran_step_table;
+----------+
| count(*) |
+----------+
|  2142859 |
+----------+
1 row in set (0.62 sec)

mysql> select tres_alloc from mtmoran_job_table order by job_db_inx limit 1;
+------------+
| tres_alloc |
+------------+
| 1=0        |
+------------+
1 row in set (0.00 sec)

After upgrade:

mysql> select count(*) from mtmoran_job_table;
+----------+
| count(*) |
+----------+
|  1417646 |
+----------+
1 row in set (0.19 sec)

mysql> select count(*) from mtmoran_step_table;
+----------+
| count(*) |
+----------+
|  2142859 |
+----------+
1 row in set (0.66 sec)

mysql> select tres_alloc from mtmoran_job_table order by job_db_inx limit 1;
+----------------------------+
| tres_alloc                 |
+----------------------------+
| 1=0,3=18446744073709551614 |
+----------------------------+
1 row in set (0.00 sec)
I'm currently upgrading your database on a 5.1 mysql server and it's been running for 25+ minutes and is still not done. This is on my system with the nvme. I also started an upgrade using an hdd as well. I'll let you know when it finishes. But I'm thinking that 5.1 is the culprit.
If that query is still running, will you grab the output of the following?

show engine innodb status;
Created attachment 6329 [details] innodb status
I was curious what your buffer pool size looked like. It looks pretty small. My 5.1 instance was set the same way as well:

Buffer pool size   512
Free buffers       0
Database pages     200

The 5.1 upgrade was still running (after ~5 hours), so I stopped it and increased the buffer pool size to 8G in hopes that the upgrade would go a lot quicker. While I'm guessing it will still help, the upgrade is still running after 40 minutes. e.g.

Buffer pool size   524288
Free buffers       370491
Database pages     152798

mysql> show variables like 'innodb_buffer_pool_%';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 8589934592 |
+-------------------------+------------+
1 row in set (0.01 sec)

My 5.7 MySQL buffer_pool_size is actually a lot smaller and still outperforms 5.1:

mysql> show variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
1 row in set (0.02 sec)
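A note on the numbers above: the innodb status "Buffer pool size" is counted in pages (16 KiB by default), so 524288 pages x 16384 bytes = 8589934592 bytes = 8 GiB, matching the variable. The setting itself lives in the server config and requires a mysqld restart; a minimal sketch, assuming the stock config path:

```ini
# /etc/my.cnf -- assumed location; restart mysqld after editing
[mysqld]
innodb_buffer_pool_size = 8G
```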
Have you considered using the database I sent you? The main concern is losing records in the slurmctld while the upgrade is still happening. Just keep watching for the agent queue messages.
We had thought about it. Since you mentioned the MySQL 5.1 issues, we upgraded to MySQL 5.7 later this afternoon, dropped the slurm accounting database, restored from our earlier sqldump, and then started the upgrade process again. It took ~16 minutes to complete, and we're back to a fully functional state.
Awesome! Glad you were able to get it up and going. Thanks for your help and patience. Let us know if you need anything else.
Thanks again for your help. I'm going to mark this as resolved. Let us know if anything else comes up. FYI, my 5.1 databases are still upgrading.