Hi,

We have a design which has 3 "logical clusters" with one instance of the Slurm server and Bright Cluster Manager to manage the nodes. The logical clusters have different architectures (CPU, GPU, or Phi), different login nodes, and different partitions.

We want to use a submit plugin to set different default partitions depending on which login node the job is submitted from. What's the best way to do this?

Also, we are still learning about running production clusters using Slurm. What's the best way to test this functionality without affecting production use?
(In reply to Steve McMahon from comment #0)
> Hi,
>
> We have a design which has 3 "logical clusters" with one instance of the
> Slurm server and Bright Cluster Manager to manage the nodes. The logical
> clusters have different architectures (CPU, GPU, or Phi), different login
> nodes, and different partitions.
>
> We want to use a submit plugin to set different default partitions
> depending on which login node the job is submitted from.
>
> What's the best way to do this?

The job submit plugin has access to all of the job parameters, so doing what you want should be pretty simple. There are several samples available that you can use as a model.

If you want to do this using a Lua script, take a look at contribs/lua/job_submit.lua packaged with Slurm, or online here:
https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua

If you prefer to use C, then see src/plugins/job_submit/partition/job_submit_partition.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/partition/job_submit_partition.c
or src/plugins/job_submit/all_partitions/job_submit_all_partitions.c
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/all_partitions/job_submit_all_partitions.c

> Also, we are still learning about running production clusters using Slurm.
> What's the best way to test this functionality without affecting
> production use?

I would recommend building a configuration on your desktop that you can use for emulating your systems. Just don't try to run a bunch of big parallel jobs ;) -- submit jobs that just sleep. I would recommend the "front end" configuration described here:
http://slurm.schedmd.com/faq.html#multi_slurmd
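For illustration, a minimal slurm.conf fragment for that kind of desktop emulation might look roughly like the following. The hostname, port range, node names, and partition names here are all invented; see the FAQ entry linked above for the authoritative procedure:

```
# Hypothetical slurm.conf fragment emulating four nodes on one desktop.
# "desktop", the port range, and the node/partition names are made up.
ControlMachine=desktop
# Several slurmd daemons share the host; each listens on its own port.
NodeName=tux[1-4] NodeHostname=desktop Port=17001-17004 CPUs=4 State=UNKNOWN
# Tiny partitions mirroring the production layout.
PartitionName=cpu Nodes=tux[1-2] Default=YES MaxTime=INFINITE State=UP
PartitionName=gpu Nodes=tux[3-4] Default=NO MaxTime=INFINITE State=UP
```

Each emulated node's slurmd is then started by name, e.g. "slurmd -N tux1", and test jobs that just sleep can be submitted against the partitions.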
Thanks Moe,

We have developed a Lua script, but we don't know the name of the parameter that holds the host name of the node the job was submitted from. Will it be something like job_desc.AllocNode?
(In reply to Steve McMahon from comment #2)
> Thanks Moe,
>
> We have developed a Lua script, but we don't know the name of the
> parameter that holds the host name of the node the job was submitted
> from. Will it be something like job_desc.AllocNode?

It should be job_desc.alloc_node

You will find brief descriptions of all of the names in slurm/slurm.h.in. Look starting around line 1127 here:
https://github.com/SchedMD/slurm/blob/master/slurm/slurm.h.in

Unfortunately, that variable doesn't seem to be getting exported to Lua today. I'll need to send you a patch for that. If you want to have a crack at making the patch yourself, it should be trivial; see the _get_job_req_field() function in src/plugins/job_submit/lua/job_submit_lua.c, around line 476 of the file:
https://github.com/SchedMD/slurm/blob/master/src/plugins/job_submit/lua/job_submit_lua.c

It should just take two lines being added:

	} else if (!strcmp(name, "alloc_node")) {
		lua_pushstring (L, job_desc->alloc_node);
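Once that field is exported, a sketch of the job_submit.lua logic being discussed might look like this. The login-node names (cpu-login, gpu-login, phi-login) and partition names are invented for illustration; the slurm_job_submit/slurm_job_modify entry points follow the sample script in contribs/lua/job_submit.lua:

```
-- job_submit.lua sketch: choose a default partition by submit host.
-- Runs inside slurmctld; the "slurm" table is provided by the plugin.
-- Hypothetical login-node and partition names -- adjust to your site.
function slurm_job_submit(job_desc, part_list, submit_uid)
	-- Only act when the user did not request a partition explicitly.
	if job_desc.partition == nil then
		local host = job_desc.alloc_node
		if host == "cpu-login" then
			job_desc.partition = "cpu"
		elseif host == "gpu-login" then
			job_desc.partition = "gpu"
		elseif host == "phi-login" then
			job_desc.partition = "phi"
		end
	end
	return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
	return slurm.SUCCESS
end
```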
(In reply to Moe Jette from comment #3)
> It should just take two lines being added:
> 	} else if (!strcmp(name, "alloc_node")) {
> 		lua_pushstring (L, job_desc->alloc_node);

This will be in v14.11.4 when released. The commit is here:
https://github.com/SchedMD/slurm/commit/85b3cc2db4a2cffda9b35a6db86b2b7b9f5f5203
Can we close this ticket?
(In reply to Moe Jette from comment #5)
> Can we close this ticket?

Yes, thanks. We have enough to go on now.
Closed based upon the information provided to the customer and the Slurm patch.