| Summary: | use submit plugin to set different default partitions depending on login node | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Steve McMahon <steve.mcmahon> |
| Component: | Other | Assignee: | Moe Jette <jette> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | | |
| Priority: | High | CC: | brian, da, steve.mcmahon |
| Version: | 14.03.0 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | CSIRO | | |
| Version Fixed: | 14.11.4 | Target Release: | --- |
Description
Steve McMahon
2015-01-18 10:41:27 MST
Hi,

We have a design which has 3 "logical clusters" with one instance of the Slurm server and Bright Cluster Manager to manage the nodes. The logical clusters have different architectures (CPU, GPU or PHI), different login nodes and different partitions.

We want to use a submit plugin to set different default partitions depending on what login node the job is submitted from. What's the best way to do this?

Also, we are still learning about running production clusters using Slurm. What's the best way to test this functionality without affecting production use?

Comment 1
Moe Jette

(In reply to Steve McMahon from comment #0)
> We want to use a submit plugin to set different default partitions
> depending on what login node the job is submitted from.
>
> What's the best way to do this?

The job submit plugin has access to all of the job parameters, so doing what you want should be pretty simple. There are several samples available that you can use as a model. If you want to do this using a Lua script, take a look at contribs/lua/job_submit.lua packaged with Slurm, or online here:
https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua

If you prefer to use C, then see src/plugins/job_submit/partition/job_submit_partition.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/partition/job_submit_partition.c
or src/plugins/job_submit/all_partitions/job_submit_all_partitions.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/all_partitions/job_submit_all_partitions.c

> Also, we are still learning about running production clusters using Slurm.
> What's the best way to test this functionality without affecting
> production use?

I would recommend building a configuration on your desktop that you can use for emulating your systems. Just don't try to run a bunch of big parallel jobs ;) submit jobs that just sleep. I would recommend the "front end" configuration described here:
http://slurm.schedmd.com/faq.html#multi_slurmd

Comment 2
Steve McMahon

Thanks Moe,

We have developed a Lua script, but we don't know the name of the parameter which holds the host name of the node the job was submitted from. Will it be something like job_desc.AllocNode?

Comment 3
Moe Jette

(In reply to Steve McMahon from comment #2)
> Will it be something like job_desc.AllocNode?

It should be job_desc.alloc_node. You will find brief descriptions of all of the names in slurm/slurm.h.in; look starting around line 1127 here:
https://github.com/SchedMD/slurm/blob/master/slurm/slurm.h.in

Unfortunately, that variable doesn't seem to be getting exported to Lua today. I'll need to send you a patch for that. If you want to have a crack at making the patch yourself, it should be trivial: see the _get_job_req_field() function in src/plugins/job_submit/lua/job_submit_lua.c, around line 476 of the file:
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/lua/job_submit_lua.c

It should just take two lines being added:

    } else if (!strcmp(name, "alloc_node")) {
        lua_pushstring (L, job_desc->alloc_node);

Comment 4
Moe Jette

(In reply to Moe Jette from comment #3)
> It should just take two lines being added:
>     } else if (!strcmp(name, "alloc_node")) {
>         lua_pushstring (L, job_desc->alloc_node);

This will be in v14.11.4 when released. The commit is here:
https://github.com/SchedMD/slurm/commit/85b3cc2db4a2cffda9b35a6db86b2b7b9f5f5203

Comment 5
Moe Jette

Can we close this ticket?

Comment 6
Steve McMahon

(In reply to Moe Jette from comment #5)
> Can we close this ticket?

Yes, thanks. We have enough to go on now.

Comment 7

Closed based upon information provided to customer and Slurm patch.
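For reference, the approach discussed in this ticket can be sketched as a job_submit.lua that inspects job_desc.alloc_node. This is a minimal illustration, not the site's actual script: the hostname prefixes (cpu-login, gpu-login, phi-login) and partition names (cpu, gpu, phi) are placeholders to be replaced with real values, and it assumes a Slurm version (14.11.4 or later) where alloc_node is exported to Lua.

    -- Sketch of a job_submit.lua choosing a default partition from the
    -- submitting login node. Hostname prefixes and partition names below
    -- are placeholders for illustration only.
    function slurm_job_submit(job_desc, part_list, submit_uid)
        -- Only act when the user did not request a partition explicitly.
        if job_desc.partition == nil then
            local host = job_desc.alloc_node or ""
            if host:match("^gpu%-login") then
                job_desc.partition = "gpu"
            elseif host:match("^phi%-login") then
                job_desc.partition = "phi"
            else
                job_desc.partition = "cpu"
            end
        end
        return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
        return slurm.SUCCESS
    end

Checking job_desc.partition first means an explicit --partition on the command line always wins; only jobs with no partition request get a login-node-based default.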
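The "front end" / multiple-slurmd emulation recommended for testing can be set up roughly as below. This is a hedged sketch: the install prefix, node names and port range are placeholders, and the authoritative steps are in the multi_slurmd FAQ entry cited in the ticket.

    # Build a throw-away Slurm with multiple-slurmd support
    # (install prefix is a placeholder):
    ./configure --prefix=$HOME/slurm-test --enable-multiple-slurmd
    make && make install

    # slurm.conf fragment: three emulated nodes on one real host,
    # each slurmd listening on its own port (names/ports are examples):
    #   NodeName=tux[1-3] NodeHostname=localhost Port=[7001-7003]

    # Start one slurmd per emulated node:
    slurmd -N tux1
    slurmd -N tux2
    slurmd -N tux3

With this setup, jobs that just sleep can exercise the job submit plugin and scheduling logic without touching the production cluster.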