Ticket 12748

Summary: T102812072 & T102812005 "Fairtree implementation" asking SchedMD for guidance on enabling this on our cluster
Product: Slurm
Reporter: Louis Ekpenyong <lekpenyong>
Component: Scheduling
Assignee: Ben Roberts <ben>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
CC: calebh, kwhetham
Version: 20.11.8
Hardware: Linux
OS: Linux
Site: FB (PSLA)

Description Louis Ekpenyong 2021-10-26 12:29:44 MDT
The goal of this RFC is to evaluate the technical feasibility of using fairtree, which would let users pool their karma (fairshare): instead of deducting karma from individual users, they could run jobs from a shared account that spreads karma usage across all team members.

Proposed implementation for FAIR Accel:
- Enable both user-level and project-level accounts for each partition that exists for FAIR Accel.
- Karma (fairshare) values are assigned at both the user and project level; the project-level karma value is the aggregate of the karma values of the individual team members.
- Users can specify which account to use when running jobs:
    - user account: karma is deducted from the user's own quota
    - project/team account: the karma deduction is spread evenly across all the users on the team, so that individual users are not penalized

Proposed implementation for FAIR Labs:
- FAIR Labs have shorter life cycles than FAIR Accel.
- To support this working model, users must be able to spin up account (and consequently karma) sharing on demand.
- Individuals in Labs have the following:
    - their own account
    - coordinator/operator privileges
Comment 1 Ben Roberts 2021-10-26 16:08:11 MDT
Hi Louis,

What you are describing sounds like it should be feasible.  There are some subtleties with how accounts interact with other accounts that might affect how you want to define the tree in your environment.  

I've put together an example of how you might do things based on your description.  I created two main accounts (project1 and project2) and put 3 users in each of these primary accounts.  I gave each of the users in these accounts a 'Share' value of 'parent', which means that there is no competition among users for compute time within that account.

You can assign different numbers of shares to the accounts (as I did in my example) and the number of shares doesn't have to align with the number of shares of the parent.  You do need to keep in mind that the number of shares an account has is affected by the total number of shares of all accounts at the same level.  Accounts at the same level can be identified as being indented the same number of spaces and having the same parent.  In my example the following accounts are at the same level with the specified number of shares:
root (user)      0
project1       100
project2       250
user_accounts   50

This means that project1 has shares equal to 1/4 of the cluster compute time, project2 has 5/8, and user_accounts has 1/8.
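
The fractions above come from dividing each account's shares by the sum over its siblings (0 + 100 + 250 + 50 = 400). A minimal sketch of that arithmetic follows; `share_fraction` is a hypothetical helper written for this illustration, not a Slurm command.

```shell
#!/bin/sh
# share_fraction SHARES TOTAL: prints SHARES/TOTAL to three decimals.
# Hypothetical helper for illustration; it is not part of Slurm.
share_fraction() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.3f\n", s / t }'
}

# Sibling shares from the example: root (user), project1, project2, user_accounts.
total=$((0 + 100 + 250 + 50))

echo "project1      $(share_fraction 100 "$total")"   # 1/4
echo "project2      $(share_fraction 250 "$total")"   # 5/8
echo "user_accounts $(share_fraction 50 "$total")"    # 1/8
```

Changing any one sibling's shares changes the total, and therefore every sibling's fraction.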

$ sacctmgr show association tree format=cluster,account,user,share
   Cluster Account                    User     Share 
---------- -------------------- ---------- --------- 
   oceania root                                    1 
   oceania  root                      root         0 
   oceania  project1                             100 
   oceania   project1                user1    parent 
   oceania   project1                user2    parent 
   oceania   project1                user3    parent 
   oceania  project2                             250 
   oceania   project2                user4    parent 
   oceania   project2                user5    parent 
   oceania   project2                user6    parent 
   oceania  user_accounts                         50 
   oceania   a_user1                               1 
   oceania    a_user1                user1         1 
   oceania   a_user2                               1 
   oceania    a_user2                user2         1 
   oceania   a_user3                               1 
   oceania    a_user3                user3         1 
   oceania   a_user4                               1 
   oceania    a_user4                user4         1 
   oceania   a_user5                               1 
   oceania    a_user5                user5         1 
   oceania   a_user6                               1 
   oceania    a_user6                user6         1 


The next thing I wanted to discuss is that the accounts belonging to the users are children of the 'user_accounts' account.  This means that there will be competition among users with their personal accounts.  If one user submits a lot of jobs to their personal account then their fairshare will go down relative to other users who haven't been using their personal accounts.  It also means that the 'user_accounts' account as a whole competes with the other projects.  In my example, if the usage by user_accounts exceeds its 1/8 of the system time then the fairshare values of all users in that account will be lower than the fairshare values of project1 and project2.  There will still be varying fairshare levels among the users based on usage, but the parent account will drop in priority level as a whole.  You can obviously adjust how much usage you want/expect to see in the personal accounts relative to the primary accounts.

An alternative approach would be to have all personal accounts defined at the parent level so that their number of shares directly relates to the total number of shares for the other defined accounts.
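
To make the difference between the two layouts concrete, the sketch below computes the nominal entitlement of one personal account in each. It multiplies sibling fractions down the tree, which illustrates the share arithmetic only and is not the Fair Tree ranking itself. The helper name `nested_share` and the choice of 1 share per personal account in the flat layout are assumptions made for this example.

```shell
#!/bin/sh
# nested_share PARENT_SHARES PARENT_LEVEL_TOTAL CHILD_SHARES CHILD_LEVEL_TOTAL
# Prints (parent fraction) * (child fraction) to four decimals.
# Hypothetical helper for illustration; it is not part of Slurm.
nested_share() {
    awk -v p="$1" -v pt="$2" -v c="$3" -v ct="$4" \
        'BEGIN { printf "%.4f\n", (p / pt) * (c / ct) }'
}

# Layout from the example: user_accounts holds 50 of 400 shares,
# split over six personal accounts with 1 share each.
nested_share 50 400 1 6      # (1/8) * (1/6) = 1/48

# Assumed flat layout: six personal accounts with 1 share each directly
# under root, next to project1 (100) and project2 (250): 1 of 356.
nested_share 1 1 1 356
```

In the flat layout the share value given to each personal account is weighed directly against project1 and project2, so it would need to be chosen with those numbers in mind.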

I hope this helps give you an idea of how you can expect things to behave.  Please let me know if you have questions or want clarification about anything I mentioned.

Thanks,
Ben
Comment 2 Louis Ekpenyong 2021-10-28 12:29:56 MDT
Hi Ben,
A couple of questions:
- What do we need to do to implement these?
- How do we implement them, and what effect would it have on the cluster?
Comment 3 Ben Roberts 2021-10-28 14:23:56 MDT
Hi Louis,

Making a change to the account structure could have a significant impact on how the Fairshare values are calculated for users.  If Fairshare is a major part of the overall priority assigned to jobs then users would see a change in behavior as far as how soon their jobs are scheduled compared to jobs from users in other accounts.  There would be other implications from moving accounts around as well.  Each cluster/account/user association is a unique entity and prior usage information is saved for each unique association.  If you move users and accounts around then it will be creating new associations and the usage information would not be carried over from prior accounts that may have the same name.  The process of removing and re-creating accounts and users would also be disruptive if done when the cluster is active.  I'm not trying to convince you not to make these changes, but just to make sure any changes are thoroughly planned and executed in a maintenance window to minimize the disruption to users.
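
Because new associations start without the prior usage, it can help to record the existing hierarchy and fairshare state before the maintenance window. The sketch below is one possible way to do that, assuming the cluster name `oceania` from the earlier example and a made-up output file name. The `run` wrapper is a hypothetical dry-run helper: it only prints the commands unless `DRY_RUN=0` is set. Note that `sacctmgr dump` records association definitions, not accumulated usage.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against a real cluster.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot CLUSTER: save the association definitions and show current state.
snapshot() {
    cluster="$1"
    # Write the cluster's association definitions to a flat file.
    run sacctmgr dump "$cluster" file="${cluster}-associations.cfg"
    # Show the current hierarchy and share values.
    run sacctmgr show association tree format=cluster,account,user,share
    # Show current usage and fairshare values for all users.
    run sshare -a -l
}

snapshot oceania
```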

To make these changes you would use sacctmgr.  When you have the hierarchy planned out the way you want it you would add the accounts first and then add any users to the appropriate account.  You would add accounts like this:
sacctmgr add account <account name> parent=<parent account name>

To add a user it would look like this:
sacctmgr add user <user name> account=<account name>

If you wanted to modify an existing user, changing the number of shares for example:
sacctmgr modify user <user name> account=<account name> set shares=<number of shares>


You can add things to the commands to create accounts and users, defining things like the number of shares at that time.  There is more information about creating and modifying associations in the documentation that you can find here:
https://slurm.schedmd.com/sacctmgr.html#SECTION_GENERAL-SPECIFICATIONS-FOR-ASSOCIATION-BASED-ENTITIES
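
Putting the commands above together, the example tree from comment 1 could be created roughly as follows. This is a sketch under assumptions, not a tested procedure: the account and user names are the ones from the example, `-i` is used to commit without the interactive prompt, and the `run` wrapper is a hypothetical dry-run helper that only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against a real slurmdbd.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_tree() {
    # Top-level accounts with the share values from the example.
    run sacctmgr -i add account project1 parent=root fairshare=100
    run sacctmgr -i add account project2 parent=root fairshare=250
    run sacctmgr -i add account user_accounts parent=root fairshare=50

    # Project members use fairshare=parent, so they do not compete
    # with each other inside the project account.
    for u in user1 user2 user3; do
        run sacctmgr -i add user "$u" account=project1 fairshare=parent
    done
    for u in user4 user5 user6; do
        run sacctmgr -i add user "$u" account=project2 fairshare=parent
    done

    # One personal account per user under user_accounts, 1 share each.
    for n in 1 2 3 4 5 6; do
        run sacctmgr -i add account "a_user$n" parent=user_accounts fairshare=1
        run sacctmgr -i add user "user$n" account="a_user$n" fairshare=1
    done
}

build_tree
```

Reviewing the printed commands before setting `DRY_RUN=0` fits the advice to plan the change and apply it in a maintenance window.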

I hope this helps.

Thanks,
Ben
Comment 4 Ben Roberts 2021-11-04 13:52:21 MDT
Hi Louis,

Do you have any follow up questions about the changes you are considering making related to Fairshare?  Let me know if there is anything else I can do to help or if this ticket is ok to close.

Thanks,
Ben
Comment 5 Louis Ekpenyong 2021-11-05 13:24:32 MDT
Hi Ben,
I am looking at the example you provided. Let me see if we can test and get back to you. Let's keep this open for now.
Comment 6 Ben Roberts 2021-11-05 13:42:49 MDT
Ok, I'll lower the severity of the ticket while you do some testing.

Thanks,
Ben
Comment 7 Louis Ekpenyong 2021-11-30 13:43:28 MST
Hello Ben,
I have some follow-up questions:
- Would this change affect running jobs on the cluster?
- What would be the effect on submitted jobs?
Comment 8 Ben Roberts 2021-11-30 14:17:15 MST
Hi Louis,

If you changed any settings related to Fairshare it wouldn't affect the jobs that are currently running.  This is because those jobs have already had their priority evaluated and will be allowed to continue running regardless of what their priority would be after any changes.  The exception to this would be if a job was requeued for some reason, then its Fairshare value would contribute to its overall priority and the scheduler would have to take that into account when scheduling the job again.

For any newly submitted jobs or jobs currently in the queue, changes that would affect their Fairshare value would also affect their overall job priority and in turn would affect when they are evaluated by the scheduler compared to other queued jobs.
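
As a rough illustration of why a Fairshare change moves pending jobs: with the multifactor priority plugin, the fairshare factor (between 0.0 and 1.0) is multiplied by PriorityWeightFairshare and added to the other weighted factors. The sketch below keeps only the fairshare and age terms, and the weights and factor values are made up for the example; `job_priority` is a hypothetical helper, not a Slurm command.

```shell
#!/bin/sh
# job_priority W_FAIRSHARE F_FAIRSHARE W_AGE F_AGE
# Simplified two-term priority: fairshare and age only.
# Hypothetical helper for illustration; it is not part of Slurm.
job_priority() {
    awk -v wf="$1" -v ff="$2" -v wa="$3" -v fa="$4" \
        'BEGIN { printf "%.0f\n", wf * ff + wa * fa }'
}

# Same pending job before and after a change that lowers its fairshare factor.
job_priority 10000 0.80 1000 0.5   # 8000 + 500
job_priority 10000 0.30 1000 0.5   # 3000 + 500
```

On a real cluster, `sprio -l` shows the weighted factors of pending jobs, which is one way to observe the effect after a change.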

Thanks,
Ben
Comment 9 Louis Ekpenyong 2021-12-01 12:11:22 MST
Thanks Ben,
It's good to know this won't have any effect on running jobs.
We will continue to test this.
Comment 10 Ben Roberts 2021-12-01 13:22:39 MST
That sounds good.  I'll check in periodically, but let me know if any more questions come up around Fairshare.

Thanks,
Ben
Comment 11 Ben Roberts 2021-12-27 11:37:39 MST
Hi Louis,

How has the testing of fairshare been going?  Let me know if you have any additional questions about this or if the ticket is ok to close.

Thanks,
Ben
Comment 12 Louis Ekpenyong 2021-12-27 11:56:11 MST
Hello Ben,
The plan to implement this change has been dropped.
Please go ahead and close it.
Thank you.
Comment 13 Ben Roberts 2021-12-27 12:16:16 MST
Ok, feel free to let us know if your plans ever change.

Thanks,
Ben