Ticket 3532

Summary:	slurmdbd and systemd and database conversions make sad administrators
Product:	Slurm	Reporter:	Doug Jacobsen <dmjacobsen>
Component:	slurmdbd	Assignee:	Unassigned Developer <dev-unassigned>
Status:	RESOLVED DUPLICATE	QA Contact:
Severity:	5 - Enhancement
Priority:	---	CC:	sts
Version:	17.02.1
Hardware:	Cray XC
OS:	Linux
Site:	NERSC

Description Doug Jacobsen 2017-03-03 11:02:48 MST

Hello,

I made a backup copy of my database, installed slurmdbd 17.02.1, and ran:

service slurmdbd start

I then started tailing the slurmdbd log in another window (because "service slurmdbd start" was blocking).

After about 2 minutes systemd got anxious, decided the service had timed out (since slurmdbd didn't daemonize during the database conversion), and then sent SIGTERM to slurmdbd.  This left our job table half converted, which led to conversion errors later.

I restored the database and instead ran /usr/sbin/slurmdbd directly (as I had during our practice run earlier this week), and that seems to be running fine.

I'd suggest that it might be better for slurmdbd to daemonize more quickly so that systemd isn't tempted to time it out and sigterm it.

Thanks,
-Doug
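
For readers following along, the workaround described above (back up the database, then run the conversion where systemd's start timeout cannot interrupt it) might be sketched as below. This is an illustration under assumptions, not a procedure taken from the ticket: the helper names `run` and `convert_in_foreground`, the `DRY_RUN` switch, and the use of `mysqldump --single-transaction` for the backup are hypothetical choices. Only /usr/sbin/slurmdbd and the -Dvvv foreground flags appear in this ticket.

```shell
# Sketch only. run: echo the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# convert_in_foreground DB BACKUP_FILE (hypothetical helper)
# 1. stop the unit so systemd is not supervising the start-up
# 2. dump the accounting database (assumes InnoDB tables and client
#    credentials already configured, e.g. in ~/.my.cnf)
# 3. run slurmdbd in the foreground so no start timeout applies
convert_in_foreground() {
    db="$1"
    backup="$2"
    run systemctl stop slurmdbd &&
    run mysqldump --single-transaction --result-file="$backup" "$db" &&
    run /usr/sbin/slurmdbd -Dvvv
}
```

With DRY_RUN=1 set, `convert_in_foreground cori_slurm_acct_db /var/tmp/slurm_acct_backup.sql` only prints the three commands, which makes it easy to review them first. The backup path here is an arbitrary example. Once the log shows the conversion has finished, the foreground slurmdbd can be stopped and the service started normally.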

Comment 1 Danny Auble 2017-03-03 11:11:18 MST

Hey Doug, I can understand your frustration.  Perhaps I can look into making the conversion process its own thread to allow systemd to be happy.  In the meantime perhaps we can update the quickstart guide to note this.  As this only happens once per major release, hopefully it isn't that painful.

I will note that if we do spin this off onto its own thread and the conversion runs into an issue, slurmdbd will just die, but systemd will be happy, I suppose :).

I would have expected the conversion to be done inside a transaction, so that it could be attempted many times with no ill effects if it doesn't finish, just wasted cycles.  You note, though, that this wasn't the case for you.  Could you elaborate on the "errors later" you had?

Comment 2 Doug Jacobsen 2017-03-03 11:30:54 MST

Hi Danny,

Following the failed conversion (by systemd death), I ran it manually:

corique01:/var/tmp/slurm # /usr/sbin/slurmdbd -Dvvv
slurmdbd: debug:  Munge authentication plugin loaded
slurmdbd: debug2: mysql_connect() called for db cori_slurm_acct_db
slurmdbd: debug2: It appears the table conversions have already taken place, hooray!
slurmdbd: adding column admin_comment after account in table "cori_job_table"
slurmdbd: debug:  Table "cori_job_table" has changed.  Updating...
slurmdbd: error: mysql_query failed: 1060 Duplicate column name 'admin_comment'
alter table "cori_job_table" modify `job_db_inx` bigint unsigned not null
auto_increment, modify `mod_time` bigint unsigned default 0 not null,
modify `deleted` tinyint default 0 not null, modify `account` tinytext, add
`admin_comment` text after account, modify `array_task_str` text, modify
`array_max_tasks` int unsigned default 0 not null, modify
`array_task_pending` int unsigned default 0 not null, modify `cpus_req` int
unsigned not null, modify `derived_ec` int unsigned default 0 not null,
modify `derived_es` text, modify `exit_code` int unsigned default 0 not
null, modify `job_name` tinytext not null, modify `id_assoc` int unsigned
not null, modify `id_array_job` int unsigned default 0 not null, modify
`id_array_task` int unsigned default 0xfffffffe not null, modify `id_block`
tinytext, modify `id_job` int unsigned not null, modify `id_qos` int
unsigned default 0 not null, modify `id_resv` int unsigned not null, modify
`id_wckey` int unsigned not null, modify `id_user` int unsigned not null,
modify `id_group` int unsigned not null, modify `kill_requid` int default
-1 not null, modify `mem_req` bigint unsigned default 0 not null, modify
`nodelist` text, modify `nodes_alloc` int unsigned not null, modify
`node_inx` text, modify `partition` tinytext not null, modify `priority`
int unsigned not null, modify `state` int unsigned not null, modify
`timelimit` int unsigned default 0 not null, modify `time_submit` bigint
unsigned default 0 not null, modify `time_eligible` bigint unsigned default
0 not null, modify `time_start` bigint unsigned default 0 not null, modify
`time_end` bigint unsigned default 0 not null, modify `time_suspended`
bigint unsigned default 0 not null, modify `gres_req` text not null default
'', modify `gres_alloc` text not null default '', modify `gres_used` text
not null default '', modify `wckey` tinytext not null default '', modify
`track_steps` tinyint not null, modify `tres_alloc` text not null default
'', modify `tres_req` text not null default '', drop primary key, add
primary key (job_db_inx), drop index id_job, add unique index (id_job,
id_assoc, time_submit), drop key rollup, add key rollup (time_eligible,
time_end), drop key rollup2, add key rollup2 (time_end, time_eligible),
drop key nodes_alloc, add key nodes_alloc (nodes_alloc), drop key wckey,
add key wckey (id_wckey), drop key qos, add key qos (id_qos), drop key
association, add key association (id_assoc), drop key array_job, add key
array_job (id_array_job), drop key reserv, add key reserv (id_resv), drop
key sacct_def, add key sacct_def (id_user, time_start, time_end), drop key
sacct_def2, add key sacct_def2 (id_user, time_end, time_eligible);
slurmdbd: Accounting storage MYSQL plugin failed
slurmdbd: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
slurmdbd: error: cannot create accounting_storage context for accounting_storage/mysql
slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting storage plugin
Aborted (core dumped)
corique01:/var/tmp/slurm #

I have that core dump, and once I have a bit more free time I can possibly send data from it.

I restored the database from backup and then slurmdbd (manually run) worked fine.

-Doug

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
dmjacobsen@lbl.gov




Comment 3 Danny Auble 2017-03-03 12:00:09 MST

Yeah, I can see how that happened outside a transaction.  Good thing you had a backup :), though we could probably have worked around it if you hadn't.  I'll see what we can do.

The core is likely from the change Tim gave you that makes fatals abort() instead of exit(1).  So you can just throw that one away ;).  Good to know the patch works as expected.
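
A note on why the half-finished conversion could not simply be retried: in MySQL, DDL statements such as ALTER TABLE cause an implicit commit and cannot be rolled back as part of a surrounding transaction, so an interrupted conversion can leave a column like admin_comment already in place. One way to check for that state before restarting slurmdbd is to ask information_schema whether the column exists. The sketch below only builds the query. The helper name `column_check_sql` is hypothetical, and its arguments are assumed to be trusted names (no escaping is done).

```shell
# column_check_sql DB TABLE COLUMN (hypothetical helper)
# Prints a query that returns 1 if the column already exists, 0 otherwise.
column_check_sql() {
    printf "SELECT COUNT(*) FROM information_schema.columns WHERE table_schema='%s' AND table_name='%s' AND column_name='%s';\n" "$1" "$2" "$3"
}
```

Using the names from the log in comment 2, `column_check_sql cori_slurm_acct_db cori_job_table admin_comment | mysql -N` would print the count. A result of 1 on a database that slurmdbd still tries to convert suggests a partial conversion, in which case restoring from backup, as Doug did, is the safer path.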

Comment 4 Doug Jacobsen 2018-01-08 23:59:24 MST

Perhaps the timeout logic could just be disabled in the slurmdbd service file.

From systemd.service(5):

"""
       TimeoutStartSec=
           Configures the time to wait for start-up. If a daemon service does not signal start-up completion within the configured time, the service will be
           considered failed and will be shut down again. Takes a unit-less value in seconds, or a time span value such as "5min 20s". Pass "0" to disable the
           timeout logic. Defaults to DefaultTimeoutStartSec= from the manager configuration file, except when Type=oneshot is used, in which case the timeout is
           disabled by default (see systemd-system.conf(5)).
"""
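
As a sketch of that suggestion, the timeout can be overridden locally with a drop-in instead of editing the packaged unit file. The helper name `write_dropin` and the file name timeout.conf are hypothetical, and the unit is assumed to be named slurmdbd.service. The value 0 follows the man page text quoted above; newer systemd.service man pages describe "infinity" as the value that disables the timeout, so it is worth checking the installed version.

```shell
# write_dropin DIR (hypothetical helper)
# Writes a drop-in that disables the start-up timeout for the unit whose
# drop-in directory is DIR.
write_dropin() {
    dir="$1"
    mkdir -p "$dir" &&
    printf '%s\n' '[Service]' 'TimeoutStartSec=0' > "$dir/timeout.conf"
}
```

On a typical systemd host this would be run as root with `write_dropin /etc/systemd/system/slurmdbd.service.d`, followed by `systemctl daemon-reload` so that systemd picks up the override.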

Comment 5 Doug Jacobsen 2018-01-09 05:58:09 MST

Or possibly allow slurmdbd to daemonize and fully start up from systemd's perspective, and then start the conversion.  I guess that would break with the current convention.  But if there are potentially long-running tasks that, if terminated early, might cause the database to be unusable on future boots, it seems like allowing a mechanism for systemd to automatically terminate slurmdbd is probably a bad thing.

Comment 6 Tim Wickberg 2019-10-25 16:00:15 MDT

(In reply to Doug Jacobsen from comment #5)
> or possibly allow slurmdbd to daemonize and fully startup from systemd's
> perspective, and then start the conversion.  I guess that would break with
> the current convention.  But if there are potentially long running tasks
> that, if terminated early might cause the database to be unusable on future
> boots, it seems like allowing a mechanism for systemd to automatically
> terminate slurmdbd is probably a bad thing.

This was done ahead of the 18.08 release. Marking closed as a duplicate of an internal ticket that tracked that change.

*** This ticket has been marked as a duplicate of ticket 5247 ***