Hi,

We have a design which has 3 "logical clusters" with one instance of the Slurm server and Bright Cluster Manager to manage the nodes. The logical clusters have different architectures (CPU, GPU, or Phi), different login nodes, and different partitions.

We want to use a submit plugin to set different default partitions depending on which login node the job is submitted from. What's the best way to do this?

Also, we are still learning about running production clusters using Slurm. What's the best way to test this functionality without affecting production use?
(In reply to Steve McMahon from comment #0)
> Hi,
>
> We have a design which has 3 "logical clusters" with one instance of the
> Slurm server and Bright Cluster Manager to manage the nodes. The logical
> clusters have different architectures (CPU, GPU, or Phi), different login
> nodes, and different partitions.
>
> We want to use a submit plugin to set different default partitions
> depending on which login node the job is submitted from.
>
> What's the best way to do this?

The job submit plugin has access to all of the job parameters, so doing what you want should be pretty simple. There are several samples available that you can use as a model.

If you want to do this using a Lua script, take a look at contribs/lua/job_submit.lua packaged with Slurm, or online here:
https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua

If you prefer to use C, then see src/plugins/job_submit/partition/job_submit_partition.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/partition/job_submit_partition.c
or src/plugins/job_submit/all_partitions/job_submit_all_partitions.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/all_partitions/job_submit_all_partitions.c

> Also, we are still learning about running production clusters using Slurm.
> What's the best way to test this functionality without affecting
> production use?

I would recommend building a configuration on your desktop that you can use for emulating your systems. Just don't try to run a bunch of big parallel jobs ;) -- submit jobs that just sleep. I would recommend the "front end" configuration described here:
http://slurm.schedmd.com/faq.html#multi_slurmd
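For illustration, a minimal slurm.conf fragment for that kind of desktop emulation might look roughly like the following. The hostname, port range, node names, and partition names here are all invented; see the FAQ entry linked above for the authoritative procedure:

```
# Hypothetical slurm.conf fragment emulating four nodes on one desktop.
# "desktop", the port range, and the node/partition names are made up.
ControlMachine=desktop
# Several slurmd daemons share the host; each listens on its own port.
NodeName=tux[1-4] NodeHostname=desktop Port=17001-17004 CPUs=4 State=UNKNOWN
# Tiny partitions mirroring the production layout.
PartitionName=cpu Nodes=tux[1-2] Default=YES MaxTime=INFINITE State=UP
PartitionName=gpu Nodes=tux[3-4] Default=NO MaxTime=INFINITE State=UP
```

Each emulated node's slurmd is then started by name, e.g. "slurmd -N tux1", and test jobs that just sleep can be submitted against the partitions.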
Thanks Moe,

We have developed a Lua script, but we don't know the name of the parameter that holds the host name of the node the job was submitted from. Will it be something like job_desc.AllocNode?
(In reply to Steve McMahon from comment #2)
> Thanks Moe,
>
> We have developed a Lua script, but we don't know the name of the
> parameter that holds the host name of the node the job was submitted
> from. Will it be something like job_desc.AllocNode?

It should be job_desc.alloc_node

You will find brief descriptions of all of the names in slurm/slurm.h.in. Look starting around line 1127 here:
https://github.com/SchedMD/slurm/blob/master/slurm/slurm.h.in

Unfortunately, that variable doesn't seem to be getting exported to Lua today. I'll need to send you a patch for that. If you want to have a crack at making the patch yourself, it should be trivial; see the _get_job_req_field() function in src/plugins/job_submit/lua/job_submit_lua.c, around line 476 of the file:
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/lua/job_submit_lua.c

It should just take two lines being added:

	} else if (!strcmp(name, "alloc_node")) {
		lua_pushstring (L, job_desc->alloc_node);
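Once that field is exported, a sketch of the job_submit.lua logic being discussed might look like this. The login-node names (cpu-login, gpu-login, phi-login) and partition names are invented for illustration; the slurm_job_submit/slurm_job_modify entry points follow the sample script in contribs/lua/job_submit.lua:

```
-- job_submit.lua sketch: choose a default partition by submit host.
-- Runs inside slurmctld; the "slurm" table is provided by the plugin.
-- Hypothetical login-node and partition names -- adjust to your site.
function slurm_job_submit(job_desc, part_list, submit_uid)
	-- Only act when the user did not request a partition explicitly.
	if job_desc.partition == nil then
		local host = job_desc.alloc_node
		if host == "cpu-login" then
			job_desc.partition = "cpu"
		elseif host == "gpu-login" then
			job_desc.partition = "gpu"
		elseif host == "phi-login" then
			job_desc.partition = "phi"
		end
	end
	return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
	return slurm.SUCCESS
end
```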
(In reply to Moe Jette from comment #3)
> It should just take two lines being added:
> 	} else if (!strcmp(name, "alloc_node")) {
> 		lua_pushstring (L, job_desc->alloc_node);

This will be in v14.11.4 when released. The commit is here:
https://github.com/SchedMD/slurm/commit/85b3cc2db4a2cffda9b35a6db86b2b7b9f5f5203
Can we close this ticket?
(In reply to Moe Jette from comment #5)
> Can we close this ticket?

Yes, thanks. We have enough to go on now.
Closed based upon the information provided to the customer and the Slurm patch.