Ticket 5824

Summary: Slurmdbd hangs - DETECTED DEADLOCK
Product: Slurm Reporter: Damien <damien.leong>
Component: slurmdbdAssignee: Marshall Garey <marshall>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 3 - Medium Impact    
Priority: ---    
Version: 17.11.4   
Hardware: Linux   
OS: Linux   
Site: Monash University Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Damien 2018-10-08 18:40:00 MDT
Good Afternoon Slurm Support

Last week, there was a case where the slurmdbd process hung and there were deadlocks in the corresponding MariaDB-SQL.

We were trying to mass-change(using sacctmgr) a whole lot of user-accounts to a different default-accounts and remove their account=default.

Our primary slurmdbd is at m3-mgmt2, while the backup slurmdbd is at m3-mgmt1


These are the observations: 

180925 15:46:22
*** (1) TRANSACTION:
TRANSACTION 265DBE7, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1248, 5 row lock(s), undo log entries 3
MySQL thread id 436243, OS thread handle 0x7f965fcdc700, query id 63356859 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb Updating
UPDATE "m3_assoc_table" SET rgt = rgt+2 WHERE rgt > 437 && deleted < 2
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 723 n bits 256 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE7 lock_mode X waiting
*** (2) TRANSACTION:
TRANSACTION 265DBE8, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
16 lock struct(s), heap size 3112, 1787 row lock(s), undo log entries 1560
MySQL thread id 436244, OS thread handle 0x7f96b8285700, query id 63356858 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb Updating
UPDATE "m3_assoc_table" SET rgt = rgt+2 WHERE rgt > 437 && deleted < 2
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 723 n bits 256 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE8 lock_mode X
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 1284 n bits 264 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE8 lock_mode X waiting
*** WE ROLL BACK TRANSACTION (1)
------------


Slurmdbd stuck:

| 713991 | slurmdb | m3-mgmt2.massive.org.au:48916 | slurm_acct_db | Query   | 6351 | Waiting for table metadata lock | alter table "m3_assoc_table" AUTO_INCREMENT=0                                                        |    0.000 |
| 713993 | slurmdb | m3-mgmt2.massive.org.au:48924 | slurm_acct_db | Sleep   | 6350 |                                 | NULL                                                                                                 |    0.000 |
| 713995 | slurmdb | m3-mgmt2.massive.org.au:48928 | slurm_acct_db | Query   | 6350 | Waiting for table metadata lock | select distinct t1.lft, t1.rgt from "m3_assoc_table" as t1 where t1.deleted=0 && (t1.acct='default') |    0.000 |

MariaDB [(none)]> SHOW ENGINE INNODB STATUS;
+--------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| InnoDB |      |
=====================================
181001 17:57:57 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 51 seconds
-----------------

LOG
---
Log sequence number 37410223571
Log flushed up to   37410223571
Last checkpoint at  37410219174
Max checkpoint age    7782360
Checkpoint age target 7539162
Modified age          4397
Checkpoint age        4397
0 pending log writes, 0 pending chkp writes
3075173 log i/o's done, 0.53 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137756672; in additional pool allocated 0
Total memory allocated by read views 656
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 2807408 	(2213368 + 594040)
    Page hash           139112 (buffer pool 0 only)
    Dictionary cache    890448 	(554768 + 335680)
    File system         83536 	(82672 + 864)
    Lock system         337488 	(332872 + 4616)
    Recovery system     0 	(0 + 0)
Dictionary memory allocated 335680
Buffer pool size        8191
Buffer pool size, bytes 134201344
Free buffers            1
Database pages          8154
Old database pages      2989
Modified db pages       8
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 10918339, not young 0
0.04 youngs/s, 0.00 non-youngs/s
Pages read 10959892, created 91918, written 7785310
0.00 reads/s, 0.00 creates/s, 2.37 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 1 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 8154, unzip_LRU len: 0
I/O sum[121]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
2 read views open inside InnoDB
1 transactions active inside InnoDB
1 out of 1000 descriptors used
---OLDEST VIEW---
Normal read view
Read view low limit trx n:o 276C905
Read view up limit trx id 276C8FF
Read view low limit trx id 276C905
Read view individually stored trx ids:
Read view trx id 276C904
Read view trx id 276C903
Read view trx id 276C8FF
-----------------
Main thread process no. 1036, id 140284856301312, state: flushing log
Number of rows inserted 1575199, updated 3041895, deleted 0, read 2000002210
0.14 inserts/s, 0.61 updates/s, 0.00 deletes/s, 1.04 reads/s
------------------------
LATEST DETECTED DEADLOCK
------------------------
180925 15:46:22
*** (1) TRANSACTION:
TRANSACTION 265DBE7, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1248, 5 row lock(s), undo log entries 3
MySQL thread id 436243, OS thread handle 0x7f965fcdc700, query id 63356859 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb Updating
UPDATE "m3_assoc_table" SET rgt = rgt+2 WHERE rgt > 437 && deleted < 2
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 723 n bits 256 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE7 lock_mode X waiting
*** (2) TRANSACTION:
TRANSACTION 265DBE8, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
16 lock struct(s), heap size 3112, 1787 row lock(s), undo log entries 1560
MySQL thread id 436244, OS thread handle 0x7f96b8285700, query id 63356858 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb Updating
UPDATE "m3_assoc_table" SET rgt = rgt+2 WHERE rgt > 437 && deleted < 2
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 723 n bits 256 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE8 lock_mode X
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 1284 n bits 264 index "PRIMARY" of table "slurm_acct_db"."m3_assoc_table" trx id 265DBE8 lock_mode X waiting
*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
Trx id counter 276E788
Purge done for trx's n:o < 276C905 undo n:o < 0
History list length 3391
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 714203, OS thread handle 0x7f965e57a700, query id 100920110 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 276C8FF, not started
MySQL thread id 713991, OS thread handle 0x7f965fc01700, query id 100896405 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb Waiting for table metadata lock
alter table "m3_assoc_table" AUTO_INCREMENT=0
---TRANSACTION 276E786, not started
MySQL thread id 683180, OS thread handle 0x7f96b8161700, query id 100920109 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb
---TRANSACTION 275BB5E, not started
MySQL thread id 683178, OS thread handle 0x7f965fd6e700, query id 100915512 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb
---TRANSACTION 276C904, ACTIVE 6434 sec
13 lock struct(s), heap size 3112, 7 row lock(s), undo log entries 3
MySQL thread id 713993, OS thread handle 0x7f965ff6d700, query id 100896484 m3-mgmt2.massive.org.au 172.16.200.79 slurmdb
Trx read view will not see trx with id >= 276C905, sees < 276C8FF
----------------------------
END OF INNODB MONITOR OUTPUT
============================
 |
+--------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)



Stop slurmdbd:

MariaDB [(none)]> SHOW PROCESSLIST;
+--------+------+-----------+------+---------+------+-------+------------------+----------+
| Id     | User | Host      | db   | Command | Time | State | Info             | Progress |
+--------+------+-----------+------+---------+------+-------+------------------+----------+
| 714204 | root | localhost | NULL | Query   |    0 | NULL  | SHOW PROCESSLIST |    0.000 |
+--------+------+-----------+------+---------+------+-------+------------------+----------+
1 row in set (0.00 sec)


Start slurmdbd:

MariaDB [(none)]> SHOW PROCESSLIST;
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
| Id     | User    | Host                          | db            | Command | Time | State | Info             | Progress |
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
| 714204 | root    | localhost                     | NULL          | Query   |    0 | NULL  | SHOW PROCESSLIST |    0.000 |
| 714206 | slurmdb | m3-mgmt1.massive.org.au:41686 | slurm_acct_db | Sleep   |   18 |       | NULL             |    0.000 |
| 714208 | slurmdb | m3-mgmt1.massive.org.au:41698 | slurm_acct_db | Query   |    0 | NULL  | commit           |    0.000 |
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
3 rows in set (0.00 sec)


MariaDB [(none)]> SHOW PROCESSLIST;
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
| Id     | User    | Host                          | db            | Command | Time | State | Info             | Progress |
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
| 714204 | root    | localhost                     | NULL          | Query   |    0 | NULL  | SHOW PROCESSLIST |    0.000 |
| 714206 | slurmdb | m3-mgmt1.massive.org.au:41686 | slurm_acct_db | Sleep   |   44 |       | NULL             |    0.000 |
| 714210 | slurmdb | m3-mgmt2.massive.org.au:57842 | slurm_acct_db | Sleep   |   45 |       | NULL             |    0.000 |
| 714212 | slurmdb | m3-mgmt2.massive.org.au:57852 | slurm_acct_db | Sleep   |    4 |       | NULL             |    0.000 |
+--------+---------+-------------------------------+---------------+---------+------+-------+------------------+----------+
4 rows in set (0.00 sec)



Kindly advise, or let us if you need clarification.


Thanks.


Damien
Comment 1 Marshall Garey 2018-10-09 09:15:34 MDT
Damien, this sounds like a duplicate of bug 5400 (see bug 5400 comments 15,16,17), particularly since I see it's waiting to set AUTO_INCREMENT=0. This was originally reported on this post:

https://lists.schedmd.com/pipermail/slurm-users/2018-February/000569.html

You may apply this commit (it's only in 18.08, but should apply cleanly to 17.11) to fix it:

https://github.com/schedmd/slurm/commit/4a16541bf0e

Or you may use the workaround described by Jessica: set innodb_autoinc_lock_mode=2

Does this work for you?
Comment 2 Marshall Garey 2018-10-16 10:52:29 MDT
Have you had a chance to try a suggestion in comment 1?
Comment 3 Damien 2018-10-16 19:23:14 MDT
Hi Marshall

Firstly thanks for your replies.

We are leaning toward using "set innodb_autoinc_lock_mode=2"  but unable to find more details on this.

How this mechanism resolve locks ? 



Cheers

Damien
Comment 4 Marshall Garey 2018-10-17 09:38:29 MDT
The mysql documentation describes it very well. (I've linked mysql 5.7 - you'd want to look at whatever version of mysql or mariadb you're using.)

https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_autoinc_lock_mode

"The lock mode to use for generating auto-increment values."

https://dev.mysql.com/doc/refman/5.7/en/innodb-auto-increment-handling.html#innodb-auto-increment-lock-modes

Setting innodb_autoinc_lock_mode to 2 essentially just loosens up locks on auto increment. The document notes that auto increment values may not be consecutive in this mode - that is just fine.

Note that innodb_autoinc_lock_mode was introduced in mysql 5.1. If you're using anything older than that, you'll need to upgrade mysql or mariadb.

Note that you do not need this once you upgrade to 18.08, because of commit 4a16541bf0e.


Can you verify that setting innodb_autoinc_lock_mode to 2 solves the issue for you?
Comment 5 Marshall Garey 2018-10-24 11:43:50 MDT
Have you had a chance to test this?
Comment 6 Damien 2018-10-25 16:27:03 MDT
Hi Marshall

Thanks for your reply.

We will put "set innodb_autoinc_lock_mode=2" in mariaDB. So far, this problem has not reoccurred. 



Cheers

Damien
Comment 7 Marshall Garey 2018-10-25 16:28:03 MDT
Sounds good. I'm closing this ticket as infogiven. Please re-open it or file a new bug if you experience this issue again.