Ticket 3131

Summary: Upgrading the Slurm system from 14.11.6 to 15.08.12
Product: Slurm Reporter: Hjalti Sveinsson <hjalti.sveinsson>
Component: Other Assignee: Danny Auble <da>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: kolbeinn.josepsson
Version: 16.05.5   
Hardware: Linux   
OS: Linux   
Site: deCODE Slinky Site: ---
Alineos Sites: --- Atos/Eviden Sites: ---
Confidential Site: --- Coreweave sites: ---
Cray Sites: --- DS9 clusters: ---
Google sites: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- NoveTech Sites: ---
Nvidia HWinf-CS Sites: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Tzag Elita Sites: ---
Linux Distro: --- Machine Name:
CLE Version: Version Fixed:
Target Release: --- DevPrio: ---
Emory-Cloud Sites: ---
Attachments: slurmdbd log file

Description Hjalti Sveinsson 2016-09-29 08:28:55 MDT
Hi,

We are planning to upgrade from 14.11.6 to 15.08.12 next Monday, and to upgrade the OS on the machine from RHEL6 to RHEL7.

We also want to add SlurmDBD during the upgrade, so I have put together an action plan that I wanted to ask you to check.

Here are the actions I am planning to take (a rough command sketch follows the list):

1.	Create RPM packages from the slurm-15.08.12 source tarball on a RHEL7 machine (make sure all development libraries are installed).
2.	Shut down the Slurm services on lhpc-head (head node)
3.	Recursively copy /etc/slurm/* to an NFS mount
4.	Copy /etc/munge/munge.key to an NFS mount
5.	Backup mysql slurm_acc database to an NFS mount using mysqldump
6.	Recursively copy /var/lib/slurm/* to an NFS mount
7.	Recursively copy /var/log/slurm/* to an NFS mount
8.	Deploy RHEL7 template on Red Hat Satellite server for lhpc-head server
9.	Change lhpc-head network to the PXE network
10.	Shutdown lhpc-head
11.	Move lhpc-head from chassis 18 to chassis 3
12.	Power on lhpc-head and let the RHEL7 OS installation install on the server
13.	Install the slurm-15.08.12 rpm packages to lhpc-head after OS installation completes (slurm, slurm-plugins, slurm-munge, slurm-slurmdbd, slurm-sql, slurm-perlapi, slurm-sjstat)
14.	Recursively copy /etc/slurm/* from the NFS mountpoint to lhpc-head:/etc/slurm/
15.	Copy /etc/munge/munge.key from the NFS mountpoint to lhpc-head:/etc/munge/munge.key
16.	Create the group slurm with id 598
17.	Create the user slurm with id 598
18.	Create directories /var/{lib,log}/slurm/
19.	Change the owner:group on these directories to slurm:slurm
20.	Recursively copy /var/lib/slurm/* from the NFS mountpoint to lhpc-head:/var/lib/slurm/
21.	Recursively copy /var/log/slurm/* from the NFS mountpoint to lhpc-head:/var/log/slurm/
22.	Restore the slurm_acc database into the lhpc-head server from the NFS mountpoint
23.	Change slurm.conf so it uses the SlurmDBD service?
24.	Start the slurm services
25.	Check if everything works
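
As mentioned above, here is a rough command-level sketch of steps 1-7 and 13-22. This is only illustrative: the /mnt/nfs path, the RPM file names and the MySQL credentials are placeholders; slurm_acc is the database name from step 5.

# Step 1: build the 15.08.12 RPMs on a RHEL7 machine
rpmbuild -ta slurm-15.08.12.tar.bz2

# Steps 2-7: back up on the old RHEL6 head node (add MySQL credentials as needed)
service slurm stop
mkdir -p /mnt/nfs/{etc-slurm,var-lib-slurm,var-log-slurm}
cp -a /etc/slurm/. /mnt/nfs/etc-slurm/
cp /etc/munge/munge.key /mnt/nfs/
mysqldump slurm_acc > /mnt/nfs/slurm_acc.sql
cp -a /var/lib/slurm/. /mnt/nfs/var-lib-slurm/
cp -a /var/log/slurm/. /mnt/nfs/var-log-slurm/

# Steps 13-22: restore on the reinstalled RHEL7 node
yum localinstall slurm-*.rpm     # slurm, slurm-plugins, slurm-munge, slurm-slurmdbd,
                                 # slurm-sql, slurm-perlapi, slurm-sjstat
groupadd -g 598 slurm
useradd -u 598 -g slurm slurm
mkdir -p /var/lib/slurm /var/log/slurm
chown -R slurm:slurm /var/lib/slurm /var/log/slurm
cp -a /mnt/nfs/etc-slurm/. /etc/slurm/
cp /mnt/nfs/munge.key /etc/munge/munge.key
chown munge:munge /etc/munge/munge.key && chmod 400 /etc/munge/munge.key
cp -a /mnt/nfs/var-lib-slurm/. /var/lib/slurm/
cp -a /mnt/nfs/var-log-slurm/. /var/log/slurm/
mysql -e 'CREATE DATABASE IF NOT EXISTS slurm_acc'
mysql slurm_acc < /mnt/nfs/slurm_acc.sql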

If something is not correct here, can you please provide me with information on what I need to change in my action plan?

regards,
Hjalti Sveinsson
Comment 1 Tim Wickberg 2016-09-29 08:47:11 MDT
Looks generally correct.

If possible, I'd suggest getting your slurmdbd server up and running ahead of time. You'll also need to add the cluster to the slurmdbd with sacctmgr before you start slurmctld.
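
As a rough sketch of that ordering on the new node (assuming the systemd units shipped with the RPMs, and using the ClusterName value from your slurm.conf):

systemctl start munge
systemctl start slurmdbd               # accounting daemon up first
sacctmgr add cluster <ClusterName>     # must match ClusterName= in slurm.conf
systemctl start slurmctld              # only then start the controller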

I'd also strongly encourage you to look at using 16.05 rather than 15.08; the 16.05.5 release is due out this afternoon and includes some important changes with respect to how cgroups behave with RHEL7.

Mainly, you will no longer need the ReleaseAgent setting; systemd has occasionally removed slurmd's release_agent mount option, which can then leave a lot of stray entries under the cpuset and device cgroup controllers. This is resolved by patches included in 16.05.5 and above.
Comment 2 Hjalti Sveinsson 2016-09-29 09:55:33 MDT
Thank you for your prompt response.

If I do use 16.05.5, won't I lose jobs or other state information, since I am upgrading from 14.11.6?

I will see if I can find new hardware that I can install the OS on and set up slurmdbd on beforehand.

How do I add the cluster to the slurmdbd with sacctmgr? Can I find some documentation regarding that?

The slurmdbd uses the same Slurm database as slurmctld, right? So I would restore my current Slurm database onto the new server, start slurmdbd on that machine, and then add the cluster with sacctmgr?

regards,
Hjalti Sveinsson
Comment 3 Tim Wickberg 2016-09-29 10:21:27 MDT
(In reply to Hjalti Sveinsson from comment #2)
> Thank you for your prompt response.
> 
> If I do use the 16.05.5 won't I lose jobs or other state information since I
> am upgrading from 14.11.6?

No. Each release can be upgraded to from the two prior versions, so 14.11 is fine. (Releases were 14.11, 15.08, 16.05.)

> I will see if I can find a new hardware that I can install the OS on and
> install slurmdbd in beforehand. 
> 
> How do I add the cluster to the slurmdbd with sacctmgr? Can I find some
> documentation regarding that?

'sacctmgr add cluster foo'

Please see the 'Database Configuration' section of http://slurm.schedmd.com/accounting.html ; I'd suggest reviewing that before your upgrade.
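
To illustrate (not a complete configuration; the hostnames, user and password below are placeholders), the switch boils down to pointing slurm.conf at slurmdbd and pointing slurmdbd.conf at your existing MySQL database:

# slurm.conf (on the controller)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=lhpc-head          # host running slurmdbd

# slurmdbd.conf (on the slurmdbd host)
AuthType=auth/munge
DbdHost=lhpc-head
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=change_me
StorageLoc=slurm_acc                     # existing accounting database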

> The slurmdbd uses the same slurm database as slurmctld right? So I would
> then restore my current slurm database into the new server and start
> slurmdbd on that machine and the add the cluster with sacctmgr?

I will need to check on this; I have not done that conversion myself.
Comment 4 Hjalti Sveinsson 2016-09-29 10:37:08 MDT
Okay, thank you.

Please let me know when you have the information regarding the database for SlurmDBD: whether it can use the existing Slurm database we have, or whether we need to create a new one.

If we need to create a new database, would we then use that one for all Slurm accounting information and stop using the old one?

kveðja/regards
Hjalti
Comment 6 Hjalti Sveinsson 2016-09-30 08:36:42 MDT
Did you check if the DB conversion was possible, going from AccountingStorageType=accounting_storage/mysql to AccountingStorageType=accounting_storage/slurmdbd and using the existing mysql database?

Will that work, or are we going to lose everything and start from scratch if we go over to SlurmDBD?

regards,
Hjalti Sveinsson
Comment 7 Tim Wickberg 2016-09-30 09:32:18 MDT
(In reply to Hjalti Sveinsson from comment #6)
> Did you check if the DB conversion was possible, going from
> AccountingStorageType=accounting_storage/mysql to
> AccountingStorageType=accounting_storage/slurmdbd and using the existing
> mysql database?
> 
> Will that work, or are we going to lose everything and start from scratch
> if we go over to SlurmDBD?

Yes, your existing data should come across without problems. Sorry for the delay; I just wanted to confirm this before updating you.

Is there anything else I can address?
Comment 8 Hjalti Sveinsson 2016-10-12 04:29:45 MDT
Hi,

I did the upgrade from 14.11.6 to 16.05.5 as you suggested. Everything worked okay except for one problem that came up: when running the sacctmgr command to add the cluster, I used the cluster name from the slurm.conf file.

The ClusterName in slurm.conf was lphc, so I typed in:

'sacctmgr add cluster lphc' 

That resulted in errors and I was unable to use this name, so I decided to use a different cluster name:

'sacctmgr add cluster lhpc' 

That worked, but now we do not have any history when we run the sacct command. The only way for us to get the old information is to go to the old server, start the Slurm service there, and run the command there.

But we would of course like to be able to run the sacct command on our upgraded system and get all the old information as well.

Is there any way for us to run a command on the database that updates the records so the correct cluster name is set, e.g.

'UPDATE table_name SET column1=value1 WHERE some_column=some_value;'

Please let me know if we can somehow fix this.

regards,
Hjalti Sveinsson
Comment 9 Tim Wickberg 2016-10-12 09:34:44 MDT
(In reply to Hjalti Sveinsson from comment #8)
> Hi,
> 
> I did the upgrade from 14.11.6 to 16.05.5 as you suggested. Everything
> worked okay except for one problem that came up: when running the sacctmgr
> command to add the cluster, I used the cluster name from the slurm.conf file.
> 
> Clustername in slurm.conf was lphc so I typed in:
> 
> 'sacctmgr add cluster lphc' 
> 
> That resulted in errors and I was unable to use this name. So I decided to
> use a different cluster name,
> 
> 'sacctmgr add cluster lhpc' 
> 
> That worked but now we do not have any history when we run sacct command.
> The only way for us to get old information is to go to the old server and
> start slurm service there and run the command there. 
>
> But we would of course like to be able to run the sacct command on our
> upgraded system and get all the old information as well.
> 
> Is there any way for us to run a command on the database that updates the
> records so the correct clustername is set, like f.e. 
> 
> 'UPDATE table_name SET column1=value1 WHERE some_column=some_value;'
> 
> Please let me know if we can somehow fix this.

I'm going to look into what may have failed when trying to use the original cluster name, although it sounds like you're past that being an active issue.

If you can live with the 'old' data being split in the database, you can use the -M flag with sacct to query the old cluster's data:

sacct -M lphc <rest of options>

Note that you do not need to have slurmctld running under that old cluster name - sacct queries slurmdbd directly, and you only need a single instance of slurmdbd.
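
If it helps, -M/--clusters should also accept a comma-separated list, so once both cluster names resolve you can query the old and new records together, e.g. (the date range is only an example):

sacct -M lphc,lhpc -a -S 2016-01-01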

Recombining the records into a single cluster is going to be difficult. Job records are stored in independent tables ($clustername_job_table), and each table has its own auto-incrementing primary key (job_db_inx) that would need to be remapped to prevent collision between the old data and new. That key is also referenced on the controller.

It's theoretically possible, but not something I've tried before, and you'd likely need to remove all pending and running jobs from the database to avoid potential accounting data corruption.
Comment 10 Hjalti Sveinsson 2016-10-13 09:24:49 MDT
Hi, thanks for your response.

When I type in "sacct -M lphc" I get this error:

[root@ru-lhpc-head ~]# sacct -M lphc
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
------------ ---------- ---------- ---------- ---------- ---------- -------- 
sacct: error: Unknown error 1054

If we can get that working somehow it would be great, and then we wouldn't need to change the cluster name on all the database records.

However, we saw that the job IDs kept incrementing and did not go back to zero and start again.

regards,
Hjalti Sveinsson
Comment 11 Tim Wickberg 2016-10-13 09:46:56 MDT
(In reply to Hjalti Sveinsson from comment #10)
> Hi, thanks for your response.
> 
> when i type in "sacct -M lphc" I get this error:
> 
> [root@ru-lhpc-head ~]# sacct -M lphc
>        JobID    JobName  Partition    Account  AllocCPUS      State ExitCode 
> ------------ ---------- ---------- ---------- ---------- ---------- -------- 
> sacct: error: Unknown error 1054

Is there anything in the slurmdbd.log corresponding to that query?

Unfortunately I think this is going to require some direct intervention in MySQL to sort out.

Can you run a few queries?

show tables;
select * from cluster_table;
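
For example, run non-interactively against the accounting database (the user and credentials here are placeholders):

mysql -u root -p slurm_acc -e 'show tables; select * from cluster_table;'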

I suspect that lphc is missing from cluster_table, but that its tables should all be in place.

> If we can get that working somehow it would be great and then we don't need
> to change the cluster name of all the database records.
> 
> However we saw that the jobid's kept on rolling and they did not go back to
> zero and start again. 

That's normal - JobIDs come from slurmctld, and aren't directly linked to the job_db_inx values which the MySQL database auto-generates.
Comment 12 Kolbeinn Josepsson 2016-10-13 11:23:50 MDT
Hi Tim, the following is the error from slurmdbd.log:

[2016-10-13T15:19:39.084] error: mysql_query failed: 1054 Unknown column 't1.req_cpufreq_min' in 'field list'
select t1.id_step, t1.time_start, t1.time_end, t1.time_suspended, t1.step_name, t1.nodelist, t1.node_inx, t1.state, t1.kill_requid, t1.exit_code, t1.nodes_alloc, t1.task_cnt, t1.task_dist, t1.user_sec, t1.user_usec, t1.sys_sec, t1.sys_usec, t1.max_disk_read, t1.max_disk_read_task, t1.max_disk_read_node, t1.ave_disk_read, t1.max_disk_write, t1.max_disk_write_task, t1.max_disk_write_node, t1.ave_disk_write, t1.max_vsize, t1.max_vsize_task, t1.max_vsize_node, t1.ave_vsize, t1.max_rss, t1.max_rss_task, t1.max_rss_node, t1.ave_rss, t1.max_pages, t1.max_pages_task, t1.max_pages_node, t1.ave_pages, t1.min_cpu, t1.min_cpu_task, t1.min_cpu_node, t1.ave_cpu, t1.act_cpufreq, t1.consumed_energy, t1.req_cpufreq_min, t1.req_cpufreq, t1.req_cpufreq_gov, t1.tres_alloc from "lphc_step_table" as t1 where t1.job_db_inx=1336592
[2016-10-13T15:19:39.084] error: Problem getting jobs for cluster lphc
[2016-10-13T15:19:39.084] error: Processing last message from connection 8(172.17.147.210) uid(0)

Here are the results from mysql queries:

MariaDB [slurm_acc]> show tables;
+------------------------------+
| Tables_in_slurm_acc          |
+------------------------------+
| acct_coord_table             |
| acct_table                   |
| clus_res_table               |
| cluster_table                |
| lhpc_assoc_table             |
| lhpc_assoc_usage_day_table   |
| lhpc_assoc_usage_hour_table  |
| lhpc_assoc_usage_month_table |
| lhpc_event_table             |
| lhpc_job_table               |
| lhpc_last_ran_table          |
| lhpc_resv_table              |
| lhpc_step_table              |
| lhpc_suspend_table           |
| lhpc_usage_day_table         |
| lhpc_usage_hour_table        |
| lhpc_usage_month_table       |
| lhpc_wckey_table             |
| lhpc_wckey_usage_day_table   |
| lhpc_wckey_usage_hour_table  |
| lhpc_wckey_usage_month_table |
| lphc_assoc_table             |
| lphc_assoc_usage_day_table   |
| lphc_assoc_usage_hour_table  |
| lphc_assoc_usage_month_table |
| lphc_event_table             |
| lphc_job_table               |
| lphc_last_ran_table          |
| lphc_resv_table              |
| lphc_step_table              |
| lphc_suspend_table           |
| lphc_usage_day_table         |
| lphc_usage_hour_table        |
| lphc_usage_month_table       |
| lphc_wckey_table             |
| lphc_wckey_usage_day_table   |
| lphc_wckey_usage_hour_table  |
| lphc_wckey_usage_month_table |
| qos_table                    |
| res_table                    |
| table_defs_table             |
| tres_table                   |
| txn_table                    |
| user_table                   |
+------------------------------+

MariaDB [slurm_acc]> select * from cluster_table;
+---------------+------------+---------+------+----------------+--------------+-----------+-------------+----------------+------------+------------------+-------+
| creation_time | mod_time   | deleted | name | control_host   | control_port | last_port | rpc_version | classification | dimensions | plugin_id_select | flags |
+---------------+------------+---------+------+----------------+--------------+-----------+-------------+----------------+------------+------------------+-------+
|    1475531478 | 1475531594 |       0 | lhpc | 172.17.147.210 |         6817 |      6817 |        7680 |              0 |          1 |              101 |     0 |
+---------------+------------+---------+------+----------------+--------------+-----------+-------------+----------------+------------+------------------+-------+

Rgds, Kolbeinn
Comment 14 Kolbeinn Josepsson 2016-10-20 04:31:40 MDT
Seems like this case has lost attention?
Comment 15 Tim Wickberg 2016-10-20 16:47:50 MDT
It looks like the lphc tables haven't been automatically converted to the latest format, which is then causing the MySQL queries to fail. I believe slurmdbd hasn't converted the tables as it isn't included in the cluster list.

Can you try to add the 'lphc' cluster again through sacctmgr? I'm assuming that will fail again, but if you can attach the slurmdbd.log when you do that it'd help isolate the main issue.
Comment 16 Hjalti Sveinsson 2016-10-21 03:18:36 MDT
[root@ru-lhpc-head ~]# sacctmgr add cluster lphc
 Adding Cluster(s)
  Name          = lphc
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
 Database is busy or waiting for lock from other user.
Comment 17 Hjalti Sveinsson 2016-10-21 04:08:14 MDT
There was some output missing:

[root@ru-lhpc-head ~]# sacctmgr add cluster lphc
 Adding Cluster(s)
  Name          = lphc
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
 Database is busy or waiting for lock from other user.
sacctmgr: slurmdbd: reopening connection
sacctmgr: error: slurmdbd: Getting response to message type 1405
 Problem adding clusters: Unspecified error

I have attached the slurmdbd.log as well.

regards,
Hjalti Sveinsson
Comment 18 Hjalti Sveinsson 2016-10-21 04:09:15 MDT
Created attachment 3626 [details]
slurmdbd log file

slurmdbd log file
Comment 19 Hjalti Sveinsson 2016-11-14 09:17:03 MST
Any update on this?

would be great to get this resolved.

regards,
Hjalti Sveinsson
Comment 20 Danny Auble 2016-11-14 10:29:58 MST
Hi Hjalti,

I think the problem here is the upgrade plus the switch from writing to MySQL directly to using the dbd. It looks like this could have been avoided by adding the cluster to the database beforehand, a step that isn't required when writing directly to MySQL.

It looks like things started working after you added the cluster, though:

[2016-10-21T09:54:30.434] adding column req_cpufreq_min after consumed_energy in table "lphc_step_table"

Based on the logs I would expect lphc to now be doing the correct thing.  Is that the case?

If that isn't the case, could you send again the output of 'sacctmgr list clusters', or the output from the direct mysql query:

select * from cluster_table;
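
As a quick check, assuming the table conversion has finished (the date below is only an example):

sacctmgr list clusters
sacct -M lphc -a -S 2016-01-01 | head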

It looks like the reason your 'sacctmgr add cluster lphc' failed is that the query took too long because of the table upgrades. Based on the logs, the conversion took almost an hour:

[2016-10-21T09:15:55.003] dropping key sacct from table "lphc_job_table"
[2016-10-21T10:08:10.786] dropping column consumed_energy from table "lphc_wckey_usage_month_table"

We can update the documentation to recommend adding the cluster before switching plugins. I think the problem would have been avoided by doing that.

Let us know if there is something outstanding on this.
Comment 21 Danny Auble 2016-11-14 10:48:41 MST
Documentation has been updated in commit df00db73d.
Comment 22 Danny Auble 2016-11-21 13:33:06 MST
Hjalti, is there anything else you need on this?
Comment 23 Danny Auble 2016-11-28 15:35:11 MST
Please reopen if you have anything else needed on this.