Ticket 7015

Summary: Add partition to an account or user
Product: Slurm
Reporter: Bruno Mundim <bmundim>
Component: Accounting
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
Priority: ---
CC: albert.gil, jbooth
Version: 17.11.12
Hardware: Linux
OS: Linux
Site: SciNet

Description Bruno Mundim 2019-05-14 10:31:40 MDT
I am trying to add a new account to the database such that it is restricted to submit jobs only to our archival partitions: vfsshort,archiveshort,archivelong. I tried the following command line (and a few extra combinations with set/where as well) without success:

$ sacctmgr add account name=hpss_test partition=vfsshort,archiveshort,archivelong
 Unknown option: partition=vfsshort,archiveshort,archivelong
 Use keyword 'where' to modify condition

I have also tried to add an account/user with the partition restriction using the file format and the load option of sacctmgr:

sacctmgr load cluster=niagara file=fairshare_hpss_rpp

where fairshare_hpss_rpp reads:

Cluster - 'niagara':FairShare=1:QOS='normal'
Parent - 'root'
User - 'root':AdminLevel='Administrator':DefaultAccount='root':FairShare=0
Account - 'rpp-test':FairShare=0:Partition=vfsshort,archiveshort,archivelong:Description='test':Organization='rpp'

Parent - 'rpp-test'
User - 'test_user'

Neither approach seems to add the Partition attribute to the account or to the user. I am wondering if this is a bug or if I am doing something wrong in the two approaches I tried above. Please advise.

Thanks,
Bruno.
Comment 3 Marcin Stolarek 2019-05-15 07:04:59 MDT
Bruno, 

I took a look at your issue. We'll check further whether this behavior can be improved; it is aligned with the manual. However, the more common approach to achieve your goal is to create associations on the "user" entity, so you would run commands like:
> sacctmgr add account name=hpss_test
> sacctmgr add user test3  account=hpss_test partition=vfsshort,archiveshort,archivelong cluster=niagara


After that you can check what the dump looks like:
># sacctmgr dump cluster=niagara
> No filename given, using ./niagara.cfg.
>sacctmgr: Cluster - 'niagara':Fairshare=1:QOS='normal'
>sacctmgr: Parent - 'root'
>sacctmgr: User - 'root':DefaultAccount='root':AdminLevel='Administrator':Fairshare=1
>sacctmgr: Account - 'hpss_test':Description='hpss_test':Organization='hpss_test':Fairshare=1
>sacctmgr: Parent - 'hpss_test'
>sacctmgr: User - 'test3':Partition='vfsshort':DefaultAccount='hpss_test':Fairshare=1
>sacctmgr: User - 'test3':Partition='archiveshort':DefaultAccount='hpss_test':Fairshare=1
>sacctmgr: User - 'test3':Partition='archivelong':DefaultAccount='hpss_test':Fairshare=1

As you can see, a correct dump contains a separate line per association. It doesn't matter whether you add them with one command using comma-separated partitions (or clusters) or with multiple commands; either way they end up as separate associations.
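
To illustrate the point about separate associations, here is a minimal dry-run sketch. The helper name print_partition_assoc_cmds is hypothetical; it only prints one "sacctmgr add user" command per partition, using the same user, account, and cluster values as above, and it does not contact slurmdbd.

```shell
# Hypothetical helper (not part of Slurm): print one "sacctmgr add user"
# command per partition in a comma-separated list.
# Dry run only: the commands are printed, never executed.
print_partition_assoc_cmds() {
    user=$1
    account=$2
    cluster=$3
    partitions=$4

    old_ifs=$IFS
    IFS=,                      # split the partition list on commas
    for part in $partitions; do
        printf 'sacctmgr add user %s account=%s partition=%s cluster=%s\n' \
            "$user" "$account" "$part" "$cluster"
    done
    IFS=$old_ifs               # restore the original field separator
}

print_partition_assoc_cmds test3 hpss_test niagara vfsshort,archiveshort,archivelong
```

Each printed command corresponds to one "User - 'test3':Partition=..." line in the dump. On a real system, something like "sacctmgr show associations user=test3 format=cluster,account,user,partition" should list one row per partition.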

Let me know if that works for you.

cheers,
Marcin
Comment 4 Bruno Mundim 2019-05-15 09:32:50 MDT
Thank you very much, Marcin! It worked gracefully. I tried both command line and loading the file. Both worked fine. I have two suggestions for improvement though:

1) Allow setting the partition for the account instead of user only. This way the user could inherit that attribute from her parent account.

2) Apparently trying to modify a user already added to the database doesn't work. I could not set the partition attribute once the user was already in. I had to delete that user and add her again with the command line you suggested.

Thanks,
Bruno.
Comment 8 Marcin Stolarek 2019-05-17 03:18:07 MDT
Bruno, 

Please keep in mind that
>Slurm account information is recorded  based  upon  four  parameters that 
>form what is referred to as an association.  These parameters are user,
>cluster, partition, and account. user is the login name.

Following from this, the most important entity to think about in Slurm accounting is the association of the four parameters mentioned above. Every time you create, delete, or modify an account, user, or cluster, you are really changing associations. For instance:

># sacctmgr add user test accounts=hpss_test,testa
> This account 'testa' doesn't exist on cluster niagara
>        Contact your admin to add this account.
> Adding User(s)
>  test
> Associations =
>  U = test      A = hpss_test  C = niagara   
>  U = test      A = hpss_test  C = test      
>  U = test      A = testa      C = test      
> Non Default Settings
>Would you like to commit changes? (You have 30 seconds to decide)
>(N/y): y

This command attempts to create associations on all configured clusters. It fails to add the association for the "testa" account on niagara because that account is not configured there. As you can see, it creates three associations. When you decide to withdraw the user's access to the test cluster, you can issue:
># sacctmgr delete user test where cluster=test
> Deleting user associations...
>  C = test       A = hpss_test  U = test     
>  C = test       A = testa      U = test     
>Would you like to commit changes? (You have 30 seconds to decide)
>(N/y): Y
The above command removes two of the three associations created by the previous command.
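
As a rough sketch of how a "where cluster=..." condition selects associations, the hypothetical helper below filters a pipe-delimited association list by cluster. The input layout is an assumption, modeled on what "sacctmgr -nP show associations format=cluster,account,user,partition" would be expected to print; the sample data mirrors the three associations from the example above.

```shell
# Hypothetical helper (not part of Slurm): list the associations that
# belong to one cluster.
# Input on stdin: lines of the assumed form "cluster|account|user|partition".
assocs_on_cluster() {
    awk -F'|' -v cluster="$1" \
        '$1 == cluster { print "C=" $1 " A=" $2 " U=" $3 " P=" $4 }'
}

# Sample data: the three associations created by "sacctmgr add user test ..."
# (the partition field is empty because none was given).
sample='niagara|hpss_test|test|
test|hpss_test|test|
test|testa|test|'

# Show which associations a delete limited to cluster "test" would match.
printf '%s\n' "$sample" | assocs_on_cluster test
```

Only the two rows on cluster "test" are selected; the niagara association is left untouched, which matches the delete output shown above.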

If you're looking for an easier way to limit access to certain partitions for specific accounts, you may also take a look at the AllowAccounts/DenyAccounts options in slurm.conf[1]. With them you can create just one association covering all partitions (simply omit the partition in the sacctmgr create command).
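
A slurm.conf fragment along these lines might look as follows. This is only a sketch: the partition and account names come from this ticket, while the node lists and the "compute" partition are placeholders that would need to match the real configuration.

```
# Archival partitions: only the hpss_test account may submit here.
# Node lists are placeholders.
PartitionName=vfsshort     Nodes=archive[01-02] AllowAccounts=hpss_test State=UP
PartitionName=archiveshort Nodes=archive[01-02] AllowAccounts=hpss_test State=UP
PartitionName=archivelong  Nodes=archive[01-02] AllowAccounts=hpss_test State=UP

# Hypothetical compute partition: every account except hpss_test.
PartitionName=compute      Nodes=node[0001-0100] DenyAccounts=hpss_test State=UP
```

With this approach the account needs only a single association without a partition, but any change to the list of accounts means editing slurm.conf and reconfiguring.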

Please let me know if I can close the ticket. 

cheers,
Marcin 

[1] https://slurm.schedmd.com/slurm.conf.html
Comment 9 Bruno Mundim 2019-05-21 08:09:19 MDT
Hi Marcin,

(In reply to Marcin Stolarek from comment #8)

Your explanations are very interesting. Thanks! Is it possible to configure
two clusters in the same slurm.conf file? It seems clear that the database
server will hold data from several clusters, but I was wondering
if I really need to split the slurm.conf and add extra hardware for running
slurmctld and slurmdbd for possibly another cluster. For example, if I were to
split the Niagara configuration into compute and archive clusters, would it be
possible to run only one slurmctld and slurmdbd, as we run just one database
server?

> If you're looking for an easier way to limit access to certain partitions
> for specific accounts you may also take a look at AllowAccounts/DenyAccounts
> options in slurm.conf[1]. Using it you can create only one association for
> all partitions (just omitting it in sacctmgr create command). 
> 

I looked into this option, but the problem is that the number of accounts
that will eventually need to be restricted will change over time, and that
would force us to modify slurm.conf and reconfigure, which I find a bit
disruptive. So I would rather deal with associations and QOSs directly.

> Please let me know if I can close the ticket. 
> 

Yes, please go ahead and close the ticket.

Thanks,
Bruno.

Comment 11 Marcin Stolarek 2019-05-22 04:02:35 MDT
>Is it possible to configure two clusters on the same slurm.conf file?

The real question here is what you mean by a separate cluster. One important aspect is the ability to split administrative domains, i.e. to use different munge keys for different clusters. This is important when you have different administrators but still want to share one accounting database.

In general, you can manage heterogeneous environments with non-uniform networks as one Slurm cluster. Obviously, the configuration details will differ depending on your needs.

If you find new issues, feel free to open another ticket. 


cheers,
Marcin