| Summary: | fatal: _mysql_query_internal: unable to resolve deadlock | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Marco Induni <marco.induni> |
| Component: | Database | Assignee: | Tim McMullan <mcmullan> |
| Status: | RESOLVED CANNOTREPRODUCE | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | ||
| Version: | 20.11.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CSCS - Swiss National Supercomputing Centre | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: |
Output of command: show engine innodb status
Active node slurmdbd log
Active node messages log
Backup standby messages log
Configuration file slurmdbd
show variables output |
||
|
Description
Marco Induni
2022-02-24 02:54:38 MST
Created attachment 23612 [details]
Output of command: show engine innodb status
Would you also attach the slurmdbd log from around the time of the fatal (assuming there is anything there), as well as your slurmdbd.conf (but please redact the database password)?
Thanks!
--Tim
Created attachment 23627 [details]
Active node slurmdbd log
Created attachment 23629 [details]
Active node messages log
Created attachment 23630 [details]
Backup standby messages log
Hi Tim,
as requested, please find the log and configuration attached.
Kind regards,
Marco
Created attachment 23631 [details]
Configuration file slurmdbd
Thank you for the additional logging. Unfortunately, what we want to look for in the show engine innodb status output isn't present, so it's less definitive what is going on here. Would you also mind attaching the output of "SHOW VARIABLES;" from the database?
Thanks!
--Tim
Created attachment 23635 [details]
show variables output
Hi Tim,
attached is the output of:
mysql --table -e "show variables" > mysql-show-variables.log
Best regards,
Marco
Thanks for this output Marco,
I've been looking through the logs etc. and it's still not conclusive what caused the hangup, but it looks like archive/purge was running around that time, and it seems to be fairly slow to run. Those operations can hold locks for a while and *might* be related. There are some improvements coming in 22.05 that can help speed these up.
The first thing I would do here is try increasing the deadlock detection timer; it's set at the default of 50 seconds. Would you be able to change it to 100 seconds? It's in microseconds, so the config line would be something like
deadlock_timeout_long=100000000
Thanks!
--Tim
Dear Tim,
as agreed, I've updated the deadlock timeout to
deadlock_timeout_long=100000000
Since the event happened just once, I think we can close this ticket for the moment, and I will reopen it or create a new one if the same problem hits the system again.
Thank you for the support and all the best.
Marco Induni
(In reply to Marco Induni from comment #12)
> Dear Tim,
>
> as agreed I've updated the deadlock timeout
> to deadlock_timeout_long=100000000
>
> Since the event happened just once, I think we can close this ticket for the
> moment and I will reopen it or create a new one in case the same problem
> will hit the system another time.
>
>
> Thank you for the support and all the best.
>
> Marco Induni
Thanks for the update Marco, please let us know if it happens again!
--Tim |
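The unit conversion behind the suggested deadlock_timeout_long value is easy to get wrong by a factor of a thousand, so here is a minimal sketch of the arithmetic. The helper name seconds_to_us is hypothetical and is not part of Slurm or the database tooling. The thread does not say which configuration file the line belongs in, so the sketch only prints the line rather than writing it anywhere.

```shell
# Hypothetical helper: convert whole seconds to microseconds, the unit
# the thread says deadlock_timeout_long is expressed in.
seconds_to_us() {
    echo $(( $1 * 1000000 ))
}

# Default mentioned in the thread: 50 seconds. Proposed value: 100 seconds.
default_us=$(seconds_to_us 50)
proposed_us=$(seconds_to_us 100)

echo "default:  deadlock_timeout_long=${default_us}"
echo "proposed: deadlock_timeout_long=${proposed_us}"

# Assumption: after the change, the value the running server uses could be
# checked with a query along these lines (commented out because it needs a
# live database connection):
#   mysql --table -e "SHOW VARIABLES LIKE 'deadlock_timeout_long'"
```

The proposed line printed by the sketch matches the value Marco reports setting in the thread.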