Ticket 16424

Summary: slurmdbd stuck with large SQL insert
Product: Slurm Reporter: Miguel Esteva <esteva.m>
Component: slurmdbd Assignee: Tim McMullan <mcmullan>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: hinton, marshall, mcmullan
Version: 22.05.7   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=13864
https://bugs.schedmd.com/show_bug.cgi?id=13792
Site: WEHI Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: CentOS
Machine Name: CLE Version:
Version Fixed: 23.02.2 23.11.0rc1 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: extract from slurmdbd.log
bug16424 tmp fix
mysql vars

Description Miguel Esteva 2023-04-02 23:04:38 MDT
Created attachment 29636 [details]
extract from slurmdbd.log

Hi!

Unfortunately one of our users submitted a massive srun line that is causing issues with slurmdbd. By the looks of it, slurmctld is repeatedly trying to insert it into the mysql db and failing.

It just errors with:

[2023-04-03T03:45:03.250] error: mysql_query failed: 2006 MySQL server has gone away
insert into "milton_step_table" (job_db_inx, id_step, step_het_comp, time_start, step_name, state, tres_alloc, nodes_alloc, task_cnt, nodelist, node_inx, task_dist, req_cpufreq, req_cpufreq_min, req_cpufreq_gov, submit_line) values (13692235, 0, 4294967294, 1680418048, 'paste', 1, '1=2,2=16384,4=1', 1, 1, 'sml-n16', '65', 1, 4294967294, 4294967294, 4294967294, 'srun paste -d "&" /

... Skipping massive output visible in the logs ...

Meanwhile, the DBD agent queue size just keeps growing:

DBD Agent queue size: 92999 

Is there a non-disruptive approach to deal with this?

Cheers,

Miguel
Comment 4 Tim McMullan 2023-04-03 09:32:47 MDT
Hi Miguel,

I've attached a patch here that should truncate the submit line.  I think the problem is that the database field only supports 64KiB of data, but it looks like it's trying to insert closer to 1MiB.

The attached patch will require you to rebuild/install Slurm on the slurmdbd node and restart slurmdbd.  It should truncate any job message with a submit line over that 64KiB limit so it actually fits in the field before being inserted.

It's unlikely that this will be the final form of the fix, but since the message already exists and it's unknown how many are already queued in the slurmctld, I think this is the way to get your slurmdbd rolling again.
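
For illustration only (the real fix lives in slurmdbd's C code; the variable names here are invented), the truncation logic amounts to capping the string at the 64KiB the column can hold before running the insert:

```shell
# Sketch of the truncation the patch performs -- illustrative names only.
# Cap an oversized submit line at the 64 KiB the database column can hold.
LIMIT=$((64 * 1024))

# Stand-in for the giant srun line: 100 KB of 'x' characters.
submit_line=$(head -c 100000 /dev/zero | tr '\0' 'x')

if [ "${#submit_line}" -gt "$LIMIT" ]; then
    echo "error: submit_line for step too long, truncating." >&2
    submit_line=${submit_line:0:$LIMIT}
fi

echo "${#submit_line}"    # 65536
```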

Please update us after the patch is applied to let us know if the dbd agent queue starts getting processed.

Thanks!
--Tim
Comment 7 Miguel Esteva 2023-04-03 14:01:34 MDT
Hi Tim,

Thanks for the response.

Applied the patch and rebuilt.

Worked perfectly:

[2023-04-04T05:56:08.498] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.532] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.543] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.547] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.552] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.556] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.568] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.573] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.578] error: submit_line for step too long, truncating.
[2023-04-04T05:56:08.607] error: submit_line for step too long, truncating.

DBD queue size back to 0.

Many thanks and cheers,

Miguel
Comment 8 Tim McMullan 2023-04-04 06:42:41 MDT
Hi Miguel,

I'm very glad to hear that got things going again!

I'd like to ask whether this massive srun line was intentional or not.  It seems this problem has come up before, and I'm trying to land on a more permanent solution to prevent it from happening again.

This solution works, but I don't know that it's the ideal one going forward.

I re-opened the ticket and reduced the severity, since we are just exploring fixes now that things are running correctly.

Thanks!
--Tim
Comment 9 Miguel Esteva 2023-04-04 20:23:40 MDT
Hi Tim,

The srun line was unintentional; we confirmed with the user.  This is the first time we have seen this happen.  This was the exact situation:

The user had a file containing 1,000 commands, each 1,000 characters long.  sed was used to pull one line from that file for each array task, but the script was accidentally submitted as a non-array job.  This caused sed to pull the entire 1,000,000-character file into a bash variable, which was then passed to srun as an argument.
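
As a hypothetical reconstruction (the file and variable names below are invented, not taken from the user's script), the failure mode is easy to reproduce: with SLURM_ARRAY_TASK_ID unset, the sed address disappears and a bare "sed -n p" prints the entire file into the variable:

```shell
# Hypothetical reconstruction of the accident (invented file/variable names).
# Build a small stand-in for the user's command file: 5 one-line commands.
printf 'echo task %d\n' 1 2 3 4 5 > commands.txt

# Inside an array job, the script selected one line per task.  Submitted as a
# NON-array job, SLURM_ARRAY_TASK_ID is unset, the sed address vanishes, and
# "sed -n p" prints every line of the file into the variable.
unset SLURM_ARRAY_TASK_ID
cmd=$(sed -n "${SLURM_ARRAY_TASK_ID}p" commands.txt)
echo "$cmd" | wc -l    # 5 -- the whole file, not one line

# With the task ID set, only the matching line is selected, as intended.
SLURM_ARRAY_TASK_ID=3
one_line=$(sed -n "${SLURM_ARRAY_TASK_ID}p" commands.txt)
echo "$one_line"       # echo task 3
```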

We do not anticipate other users intentionally running such long srun commands.

Perhaps an optional solution could be a config option that, if enabled, makes slurmctld reject the submission when the submit line is far too long (I can't recall whether the lua plugin exposes the submit line in job_desc, but it could also be checked there).  That would make it easier for some admins to debug, though it might be a bit harsh if users are running something legitimate.  The patch you kindly provided seems to be a good solution on our end, since this was an edge case.

Please let us know if we can help in any way or if you require more information.

Cheers,

Miguel
Comment 10 Tim McMullan 2023-04-05 12:00:20 MDT
Thanks Miguel!

It's helpful to know it was unintentional!  It is likely that we will add a parameter to give more flexibility in a future version, but for now I'm looking at rejecting the submission when this hard limit is exceeded.

For the moment I intend to leave this ticket open until we have a fix, to keep track of its progress.  Let me know if that's alright with you!

Thanks!
--Tim
Comment 13 Miguel Esteva 2023-04-05 17:00:32 MDT
Thanks Tim.

No problem, we can keep this one open. 

Can the patch you provided be applied to 23.02.1?

Kind regards,

Miguel
Comment 14 Tim McMullan 2023-04-06 07:04:08 MDT
(In reply to Miguel Esteva from comment #13)
> Thanks Tim.
> 
> No problem, we can keep this one open

Thank you!

> Can the patch you provided be applied to 23.02.1?

As it sits, it would need modification to apply to 23.02.1.  I have mock-ups of a couple of variations on a fix for 23.02; if you can hold off for a little while, I can get a more fleshed-out patch ready and point you toward our eventual solution.

Thanks!
--Tim
Comment 15 Tim McMullan 2023-04-06 07:14:07 MDT
Hi Miguel!

Sorry for the double e-mail, but I was also wondering which version of MySQL/MariaDB you are currently running, as well as the output of "show variables;" from the database?

In my work on the 23.02 patch I attempted to reproduce the problem, and the database truncated the field for me.  Clearly that's not how every database behaves, but I'd like to see if I can get mine to behave the same way yours did.

Thanks!
--Tim
Comment 19 Miguel Esteva 2023-04-06 17:19:52 MDT
Created attachment 29751 [details]
mysql vars

Hi Tim,

Thank you for the replies.

We are in the process of updating to 23.02.1 this weekend.  We will compile with no modifications for now; we do not expect any users to run similar commands.

Currently slurmdbd is already on 23.02.1. The server is CentOS 7.9.2009 with MariaDB 5.5.68-1. The variables are included as an attachment.

Thank you very much,

Miguel
Comment 20 Tim McMullan 2023-04-06 18:20:54 MDT
Thank you for this Miguel!

(In reply to Miguel Esteva from comment #19) 
> We are on the process of updating to 23.02.1 this weekend. We will compile
> with no modifications for now. We do not expect any users to run any similar
> commands.

Sounds good!

> Currently slurmdbd is already on 23.02.1. The server is CentOS 7.9.2009 with
> MariaDB 5.5.68-1. The variables are included as an attachment.

I want to put this information out there for you, but please take it with a grain of salt, as I haven't yet tested any of this on the older MariaDB versions (I'm currently running 10.11.2 in my dev environment).

The "max_allowed_packet" variable is an important one to note.  It's currently set to 1MiB, and the query we were sending was larger than that.  In later MariaDB versions the default was (briefly) changed to 4MiB, and it has been 16MiB since MariaDB 10.2.4.  Increasing this value will allow queries that failed before to succeed; it only matters in the context of large fields.  My best example would be AccountingStoreFlags=job_script or job_env: storing a 1MiB+ job script with the job_script option would cause the same kind of trouble that had you open this ticket.  The database schema can store more than we allow a job script to be, but the query would still fail because of the "max_allowed_packet" limit.

"innodb_strict_mode" may also play an interesting role here.  It's currently set to off, but if enabled, part of what it does (at least in recent versions) is automatically truncate fields when an insert contains too much data.

Either may be worth looking into, particularly if you want to store potentially large things in the database going forward.  Both are things I'm considering as part of whatever patch finally makes it into future Slurm versions :)
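
As a concrete reference point (a sketch of the server-side setting, not a tuning recommendation), the two variables can be inspected from the client with `SHOW VARIABLES LIKE 'max_allowed_packet';` and `SHOW VARIABLES LIKE 'innodb_strict_mode';`, and the packet limit can be raised persistently with a my.cnf fragment along these lines:

```
[mysqld]
# 16M has been the MariaDB default since 10.2.4; 5.5 shipped with 1M.
max_allowed_packet = 16M
```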

Thanks!
--Tim
Comment 21 Miguel Esteva 2023-04-06 19:37:09 MDT
Hi Tim,

Thanks for the quick response and for the information.  Our MariaDB config is mostly default, so updating to version 10+ is probably a good idea, especially since we are ditching CentOS 7 in the near future.  I just recall some bugs about migrating from v5 to v10 before, like:

https://bugs.schedmd.com/show_bug.cgi?id=13562#c21

That said, I just updated MariaDB to v10.11.2 on our test cluster and it went without issues.  Doing the same in the prod cluster will probably be enough to cover us for now without having to modify Slurm 23.02.1.

Thanks again,

Miguel
Comment 22 Tim McMullan 2023-04-07 06:04:33 MDT
(In reply to Miguel Esteva from comment #21)
> Hi Tim,
> 
> Thanks for the quick response and thanks for the information. Our MariaDB
> config is mostly default so probably updating to version 10+ is a good idea,
> especially since we are ditching CentOS 7 in the near future. I just recall
> some bugs about migrating from v5 to v10 before like: 
> 
> https://bugs.schedmd.com/show_bug.cgi?id=13562#c21
>
> That said, just updated MariaDB to v10.11.2 on our test cluster and it went
> with no issues. Doing the same in the prod cluster will probably be enough
> to cover us for now having to modify Slurm 23.02.1.

Very true, we did end up patching that in 22.05.7, and it should fix things if you change MariaDB versions.  Upgrading the database if you are willing is great; there have been a lot of improvements in MariaDB since 5.5!

Thanks!
--Tim
Comment 23 Miguel Esteva 2023-04-07 06:13:53 MDT
Hi Tim,

From what I have seen playing around with some jobs and the rate limiter on our test cluster, the migration to MariaDB 10 seems to have gone well.

I was just a bit concerned about the statement in the documentation:

"NOTE: If you have an existing Slurm accounting database and plan to upgrade your database server to MariaDB 10.2.1 (or newer) from a pre-10.2.1 version or from any version of MySQL, please contact SchedMD for assistance."

All I did was update MariaDB to 10.11.2 (as per their instructions) and recompile Slurm.  Are there any additional steps that have to be taken (i.e. updating MariaDB libraries on nodes/submit hosts, etc.)?

That seems like a good potential solution to keep us covered from large database entries in production for the meantime.

Thank you so much for the help so far!

Cheers,

Miguel
Comment 24 Tim McMullan 2023-04-07 06:24:32 MDT
(In reply to Miguel Esteva from comment #23)
> Hi Tim,
> 
> From what I have seen playing around with some jobs and the rate limiter in
> our test cluster, the migration to MariaDB 10 seem to have gone well.
> 
> I was just a bit concerned about the statement in the documentation:
> 
> "NOTE: If you have an existing Slurm accounting database and plan to upgrade
> your database server to MariaDB 10.2.1 (or newer) from a pre-10.2.1 version
> or from any version of MySQL, please contact SchedMD for assistance."

We added that note to the docs as a stopgap before the final fix had been worked out.  There were a couple of possible ways to get tripped up by the issue, and we wanted at least some warning that you might run into a problem.  I believe there is a ticket to remove/edit that note in a future version, since the problem now has a proper fix.

> All I did was update MariaDB to 10.11.2 (as per their instructions) and
> recompile Slurm. Are there any additional steps that have to be taken (i.e.
> update MariaDB  libraries in nodes/submit hosts, etc)?

Nope, as long as you are on 22.05.7+ at the time of the mariadb upgrade, we don't expect a problem anymore.

> That seems like a good potential solution to keep us covered from large
> database entries in production for the meantime.

Yes, I think it will as well, but I'll let you know if I find anything that would suggest otherwise.

> Thank you so much for the help so far!

Thanks for working with me while I've been investigating this!

--Tim
Comment 25 Miguel Esteva 2023-04-07 07:04:29 MDT
Hi Tim,

Thanks for the confirmation!

(In reply to Tim McMullan from comment #24)

> Nope, as long as you are on 22.05.7+ at the time of the mariadb upgrade, we
> don't expect a problem anymore.

Guess I should double-check, since I just pushed the updates to the whole cluster:

The only necessary step would be to update MariaDB to 10.11.2 on the server running slurmdbd?

Does Slurm 23.02.1 have to be recompiled at all after the update?  Or does it only have to be recompiled on the server running slurmdbd?

Thank you!

Miguel
Comment 26 Tim McMullan 2023-04-07 07:48:31 MDT
My general expectation is that the older 5.5 client should still be able to work with a 10.11 server, though ultimately I think it would be best to end up on the same version on both ends.

There are only 2 plugins that need SQL to work: accounting_storage/mysql and jobcomp/mysql.  If you are not using jobcomp/mysql, I would only expect the slurmdbd host to need it, and I would definitely suggest recompiling with the new MariaDB client installed.

If you do use the jobcomp/mysql plugin and did recompile, you would need to push the new MariaDB client out anywhere sacct might be run, since the library name changed from libmysql to libmariadb.

I hope this helps!
--Tim
Comment 27 Miguel Esteva 2023-04-07 08:02:59 MDT
Hi Tim,

That definitely helps a lot, and it is spot on (from our dev submit host):

ldd jobcomp_mysql.so
	linux-vdso.so.1 =>  (0x00007ffddef41000)
	libmariadb.so.3 => not found
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00002adf403f0000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00002adf4060c000)
	librt.so.1 => /lib64/librt.so.1 (0x00002adf40810000)
	libm.so.6 => /lib64/libm.so.6 (0x00002adf40a18000)
	libresolv.so.2 => /lib64/libresolv.so.2 (0x00002adf40d1a000)
	libc.so.6 => /lib64/libc.so.6 (0x00002adf40f34000)
	/lib64/ld-linux-x86-64.so.2 (0x00002adf3ffc3000)

Easily fixed:

curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup |bash
yum -y install MariaDB-shared

Thanks for the help and cheers,

Miguel
Comment 40 Tim McMullan 2023-04-24 15:11:14 MDT
Thanks Miguel for your patience and letting me work this issue through this ticket!

To update you on what we've changed:

For 23.02.2 there will be a warning if max_allowed_packet is set below 16MB (the current default for MariaDB).  While looking through all the information, I found that we are actually running in a mode that should automatically truncate the oversized strings, but the max_allowed_packet limit of the older MariaDB was likely the cause of the problem.

This is done with these commits:
https://github.com/SchedMD/slurm/commit/f1a44ddb43
https://github.com/SchedMD/slurm/commit/dc8f9f760e

For 23.11, we are changing the field in the database so it can store much longer submit lines, and we are also adding an option, max_submit_line_size, so you can reject jobs whose submit line is longer than you feel is reasonable.

Thanks again!  Since you have long since resolved the issue, I'm going to mark this ticket as fixed, but if you have any other questions about it, just let me know!

Thanks!
--Tim
Comment 41 Marshall Garey 2023-04-24 16:36:07 MDT
*** Ticket 13864 has been marked as a duplicate of this ticket. ***
Comment 42 Miguel Esteva 2023-04-24 20:41:45 MDT
Hi Tim,

Thank you for following up and for all the help.

We updated to MariaDB v10.11.2 when we updated Slurm to v23.02.1, so we do not expect to see the issue again.  We will keep an eye out in case other large submit lines come along; that will probably be useful info for adjusting max_submit_line_size once we start testing 23.11.

Again, thanks for the help, we appreciate it a lot.

Cheers,

Miguel