Ticket 3635

Summary: How to submit jobs for users in a group that has been given exclusive access to two nodes
Product: Slurm
Reporter: NYU HPC Team <hpc-staff>
Component: User Commands
Assignee: Alejandro Sanchez <alex>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 3 - Medium Impact
Priority: ---
CC: alex
Version: 16.05.4
Hardware: Linux
OS: Linux
Site: NYU

Description NYU HPC Team 2017-03-29 12:41:37 MDT
Hi Slurm Experts:

We have made a reservation of two GPU nodes for a research group, so that the group members have exclusive access to these two nodes. The arrangement is:
- when there are free slots on the two nodes, the group members can run jobs there;
- when the two nodes are full, the group members will use other nodes with the same priority as the general public.

What job submission command should be used? To use a reservation, a group member would run:
$ sbatch --reservation=<ResvName> ......
When the reserved nodes are full, it seems they would have to run sbatch again without the --reservation= option?

We plan to upgrade to 17.02. Is there any feature in the new version that would be helpful in this case?

The goal is to simplify the user command interface while still giving exclusive access to the two nodes with a specific type of GPU card.

Thanks A Lot!
Comment 1 Alejandro Sanchez 2017-03-30 03:18:25 MDT
Your reservation approach is one valid option, but yes, a submission with --reservation=<ResvName> will only try to allocate the job on the resources assigned to that reservation. When those nodes are full, the job remains PD (pending) until it has enough priority and free resources to run. If in the meantime a user from the research group wants a different job to use resources outside the reservation, the user effectively has to make a separate submission.

I'd propose a different alternative which might help you achieve what you want. I'd create a partition containing these two GPU nodes, with no other partition containing them. This partition can be configured with the AllowAccounts or AllowGroups parameter so that you decide who can execute jobs in it. Then I'd configure all the nodes in the system with a Weight, so that the two GPU nodes have a lower Weight than the rest of the nodes. Note that if a job allocation request cannot be satisfied using the nodes with the lowest weight, the set of nodes with the next lowest weight is added to the set of nodes under consideration for use.

The research group users could then make a single job submission such as 'sbatch --partition=twogpunodes,otherpart1,otherpart2', and their jobs will first try to execute in the two-GPU-node partition; if they cannot run there, they will be considered for scheduling in the rest of the requested partitions.

You could also create a job_submit plugin so that the system automatically routes job requests to the right partition list depending on the account/group doing the submission[1]. Roughly, a rule such as: 'if this job is requested by a user belonging to the research group, modify the job's --partition to <twoGPUnodepart>,<part1>,...,<partN>'. This way users don't even need to think about requesting this or that partition themselves (taking into account that your goal is to simplify the user command interface).
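As a minimal slurm.conf sketch of the partition/Weight idea (all node, partition, and group names below are placeholders for illustration, not taken from your site):

```
# slurm.conf excerpt -- names are illustrative only
# The two reserved GPU nodes get the lowest Weight so they are
# considered first when a job can fit on them.
NodeName=gpu[01-02] Gres=gpu:2 Weight=1
NodeName=node[001-010] Weight=10

# Only this partition contains the two GPU nodes; access is
# limited to the research group's Linux group.
PartitionName=twogpunodes Nodes=gpu[01-02] AllowGroups=researchgrp
PartitionName=otherpart1 Nodes=node[001-010] Default=YES
```

Group members would then submit once against both partitions, e.g. 'sbatch --partition=twogpunodes,otherpart1 job.sh'.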

[1] https://slurm.schedmd.com/job_submit_plugins.html

Some more examples can be found under:

https://github.com/SchedMD/slurm/blob/master/contribs/lua
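As a rough, untested sketch of the routing rule described above (the account name "researchacct" and the partition names are hypothetical), a lua job_submit plugin could look something like:

```
-- job_submit.lua sketch: route jobs submitted under the research
-- group's account to the reserved GPU partition first, falling
-- back to the other partitions. Names here are placeholders.
function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.account == "researchacct" and job_desc.partition == nil then
        job_desc.partition = "twogpunodes,otherpart1,otherpart2"
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

The 'job_desc.partition == nil' guard only rewrites jobs that did not request a partition explicitly, so users can still override the routing by hand.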
Comment 2 NYU HPC Team 2017-03-30 07:22:55 MDT
Nice, thanks! I understand that the accounts in AllowAccounts should be those existing in Slurm, similar to bank accounts; but I am not sure whether the groups in AllowGroups are the Linux groups on the nodes.
$ groups --help
Usage: groups [OPTION]... [USERNAME]...
Print group memberships for each USERNAME or, if no USERNAME is specified, for
the current process (which may differ if the groups database has changed).
      --help     display this help and exit
      --version  output version information and exit

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
For complete documentation, run: info coreutils 'groups invocation'


A partition can also be configured with the AllowQos parameter. Can a QoS be restricted so that it is usable only by a few selected users?
Comment 3 Alejandro Sanchez 2017-03-30 07:44:08 MDT
(In reply to NYU HPC Team from comment #2)
> Nice, thanks! I understand the the accounts in AllowAccounts should be these
> existing in Slurm, and similar to bank accounts; but am not sure if the
> groups in AllowGroups are these linux groups on the nodes. 
> $ groups --help
> Usage: groups [OPTION]... [USERNAME]...
> Print group memberships for each USERNAME or, if no USERNAME is specified, for
> the current process (which may differ if the groups database has changed).
>       --help     display this help and exit
>       --version  output version information and exit
> 
> GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
> For complete documentation, run: info coreutils 'groups invocation'
> 

Yes, the list of groups in AllowGroups refers to the Linux groups on the nodes (e.g. the local /etc/group file, NIS, or LDAP) matching those group names.
 
> Also a partition can be configured with the AllowQos parameter. Can a QoS be
> restricted to be usable by a few selected users only?

You set the Qos parameter on an association. From the sacctmgr man page:

Qos    Valid QOS' for this association.
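For example (the QOS, account, and user names below are hypothetical):

```
$ sacctmgr add qos gpuqos
$ sacctmgr modify user where name=alice account=research set qos=gpuqos
```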
Comment 4 NYU HPC Team 2017-03-30 07:53:24 MDT
Yes, but it seems that setting the Qos parameter on an association does not prevent users outside the association from using the QoS.
Comment 5 Alejandro Sanchez 2017-03-30 08:16:07 MDT
(In reply to NYU HPC Team from comment #4)
> Yes, but it seems that setting the QoS paramter to an association does not
> prevent users not in the association from using the QoS.

If you append 'qos' to the AccountingStorageEnforce parameter in slurm.conf, it will prevent users not in the association from using that QOS. The client submission command should then receive an error like this:

error: Unable to allocate resources: Invalid qos specification

and you'll note this message in the slurmctld.log:

slurmctld: error: This association 6(account='a1', user='test2', partition='(null)') does not have access to qos test
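In slurm.conf that could look like the following (the other flags shown are just common companions of 'qos'; adjust to your site, and note that changing this parameter may require restarting slurmctld rather than just 'scontrol reconfigure'):

```
# slurm.conf excerpt
AccountingStorageEnforce=associations,limits,qos
```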
Comment 6 Alejandro Sanchez 2017-04-03 07:18:59 MDT
Hi. Is there anything else we can assist you with on this ticket? Thanks.
Comment 7 NYU HPC Team 2017-04-04 10:43:37 MDT
Not now, thank you Alejandro!