| Summary: | "T102812072 & T102812005 Fairtree implementation" asking SchedMD for guidance on enabling this on our cluster | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Louis Ekpenyong <lekpenyong> |
| Component: | Scheduling | Assignee: | Ben Roberts <ben> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | calebh, kwhetham |
| Version: | 20.11.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | FB (PSLA) | Alineos Sites: | --- |

Hi Louis,

What you are describing sounds like it should be feasible. There are some subtleties with how accounts interact with other accounts that might affect how you want to define the tree in your environment. I've put together an example of how you might do things based on your description.

I created two main accounts (project1 and project2) and put 3 users in each of these primary accounts. I gave each of the users in these accounts a 'Share' value of 'parent', which means that there is no competition among users for compute time within that account.

You can assign different numbers of shares to the accounts (as I did in my example) and the number of shares doesn't have to align with the number of shares of the parent. You do need to keep in mind that the number of shares an account has is affected by the total number of shares of all accounts at the same level. Accounts at the same level can be identified as being indented the same number of spaces and having the same parent. In my example the following accounts are at the same level with the specified number of shares:

    root (user)       0
    project1        100
    project2        250
    user_accounts    50

This means that project1 has shares equal to 1/4 of the cluster compute time, project2 has 5/8, and user_accounts has 1/8.
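The fractions quoted above follow from dividing each account's shares by the total shares of its siblings (100 + 250 + 50 = 400). A minimal sketch of that arithmetic, assuming a hypothetical helper named `sibling_fractions` that reads "name shares" pairs; it only reproduces the division and is not how Slurm itself computes fairshare factors from usage:

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print each sibling account's
# fraction of the shares at its level. Input is one "name shares" pair per line.
sibling_fractions() {
    awk '
        { name[NR] = $1; shares[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s %d/%d %.3f\n", name[i], shares[i], total, shares[i] / total
        }'
}

# Share values taken from the example tree in this ticket.
printf '%s\n' 'project1 100' 'project2 250' 'user_accounts 50' | sibling_fractions
```

The root user association has 0 shares, so it is left out of the input; including it would not change the total.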

    $ sacctmgr show association tree format=cluster,account,user,share
       Cluster              Account       User     Share
    ---------- -------------------- ---------- ---------
       oceania root                                    1
       oceania  root                      root         0
       oceania  project1                             100
       oceania   project1                user1    parent
       oceania   project1                user2    parent
       oceania   project1                user3    parent
       oceania  project2                             250
       oceania   project2                user4    parent
       oceania   project2                user5    parent
       oceania   project2                user6    parent
       oceania  user_accounts                         50
       oceania   a_user1                               1
       oceania    a_user1                user1         1
       oceania   a_user2                               1
       oceania    a_user2                user2         1
       oceania   a_user3                               1
       oceania    a_user3                user3         1
       oceania   a_user4                               1
       oceania    a_user4                user4         1
       oceania   a_user5                               1
       oceania    a_user5                user5         1
       oceania   a_user6                               1
       oceania    a_user6                user6         1

The next thing I did that I wanted to discuss is the fact that the accounts belonging to the users are children of the 'user_accounts' account. This means that there will be competition among users with their personal accounts. If one user submits a lot of jobs to their personal account then their fairshare will go down relative to other users who haven't been using their personal accounts.

It also means that the usage of the 'user_accounts' account as a whole will be affected by usage in the other projects. In my example, if the usage by user_accounts exceeds the 1/8 of the system time then the fairshare values of all users in that account will be lower than the fairshare values of project1 and project2. There will still be varying fairshare levels among the users based on usage, but the parent account will drop in priority level as a whole. You can obviously adjust how much usage you want/expect to see in the personal accounts relative to the primary accounts.

An alternative approach would be to have all personal accounts defined at the parent level so that their number of shares directly relates to the total number of shares for the other defined accounts.

I hope this helps give you an idea of how you can expect things to behave.
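To compare the nested layout with the alternative described above, the nominal entitlement of one personal account can be estimated by multiplying the per-level fractions. This is a rough sketch: `effective_share` is a hypothetical helper, the top-level case assumes each personal account keeps 1 share and that user_accounts is removed (neither is stated in the ticket), and the result is only the fraction implied by the share numbers, not the usage-based fairshare factor Slurm computes.

```shell
#!/bin/sh
# Hypothetical helper: multiply per-level share fractions given as NUM/DEN.
effective_share() {
    printf '%s\n' "$@" | awk -F/ '
        BEGIN { p = 1 }
        { p *= $1 / $2 }
        END { printf "%.4f\n", p }'
}

# Nested layout: user_accounts holds 50 of 400 shares, and each of the six
# personal accounts holds 1 of the 6 shares beneath it (1/8 * 1/6 = 1/48).
effective_share 50/400 1/6

# Assumed alternative: six personal accounts at the top level with 1 share
# each, so one account holds 1 of 100 + 250 + 6 = 356 shares.
effective_share 1/356
```

Under these assumptions a personal account's nominal fraction drops from about 2% to well under 1%, which is why the share values would need to be revisited if the personal accounts were moved to the top level.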
Please let me know if you have questions or want clarification about anything I mentioned.

Thanks,
Ben

Hi Ben,

A couple of questions:
- What do we need to do to implement these?
- How do we implement, and what effect would it have on the cluster?

Hi Louis,

Making a change to the account structure could have a significant impact on how the Fairshare values are calculated for users. If Fairshare is a major part of the overall priority assigned to jobs then users would see a change in behavior as far as how soon their jobs are scheduled compared to jobs from users in other accounts.

There would be other implications from moving accounts around as well. Each cluster/account/user association is a unique entity and prior usage information is saved for each unique association. If you move users and accounts around then it will be creating new associations and the usage information would not be carried over from prior accounts that may have the same name. The process of removing and re-creating accounts and users would also be disruptive if done when the cluster is active. I'm not trying to convince you not to make these changes, but just to make sure any changes are thoroughly planned and executed in a maintenance window to minimize the disruption to users.

To make these changes you would use sacctmgr. When you have the hierarchy planned out the way you want it you would add the accounts first and then add any users to the appropriate account. You would add accounts like this:

    sacctmgr add account <account name> parent=<parent account name>

To add a user it would look like this:

    sacctmgr add user <user name> account=<account name>

If you wanted to modify an existing user, changing the number of shares for example:

    sacctmgr modify user <user name> account=<account name> set shares=<number of shares>

You can add things to the commands to create accounts and users, defining things like the number of shares at that time.
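As an illustration of setting shares at creation time, the example hierarchy from earlier in this ticket could be built with a script along these lines. It is a dry-run sketch: the helper names are hypothetical, the script only prints commands rather than running them, and the `-i`, `cluster=`, `parent=` and `fairshare=` options are written as I understand them from the sacctmgr man page, so they should be checked against the installed version before use.

```shell
#!/bin/sh
# Dry run: print the sacctmgr commands for the example tree instead of
# executing them. Review the output before piping it to a shell.
CLUSTER=oceania   # cluster name taken from the example output

add_account() {   # usage: add_account NAME PARENT SHARES
    echo "sacctmgr -i add account $1 cluster=$CLUSTER parent=$2 fairshare=$3"
}

add_user() {      # usage: add_user USER ACCOUNT SHARES
    echo "sacctmgr -i add user $1 cluster=$CLUSTER account=$2 fairshare=$3"
}

build_tree() {
    # Accounts are added before the users that belong to them.
    add_account project1 root 100
    add_account project2 root 250
    add_account user_accounts root 50
    for n in 1 2 3; do add_user "user$n" project1 parent; done
    for n in 4 5 6; do add_user "user$n" project2 parent; done
    for n in 1 2 3 4 5 6; do
        add_account "a_user$n" user_accounts 1
        add_user "user$n" "a_user$n" 1
    done
}

build_tree
```

Printing the commands first makes it possible to review the planned hierarchy and apply it during a maintenance window, in line with the caution above about disruptive changes.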
There is more information about creating and modifying associations in the documentation that you can find here:
https://slurm.schedmd.com/sacctmgr.html#SECTION_GENERAL-SPECIFICATIONS-FOR-ASSOCIATION-BASED-ENTITIES

I hope this helps.

Thanks,
Ben

Hi Louis,

Do you have any follow up questions about the changes you are considering making related to Fairshare? Let me know if there is anything else I can do to help or if this ticket is ok to close.

Thanks,
Ben

Hi Ben,

I am looking at the example you provided. Let me see if we can test and get back to you. Let's keep this open for now.

Ok, I'll lower the severity of the ticket while you do some testing.

Thanks,
Ben

Hello Ben,

I have some follow up questions:
- Would this change affect running jobs on the cluster?
- What would be the effect on submitted jobs?

Hi Louis,

If you changed any settings related to Fairshare it wouldn't affect the jobs that are currently running. This is because those jobs have already had their priority evaluated and will be allowed to continue running regardless of what their priority would be after any changes. The exception to this would be if a job was requeued for some reason, then its Fairshare value would contribute to its overall priority and the scheduler would have to take that into account when scheduling the job again.

For any newly submitted jobs or jobs currently in the queue, changes that would affect their Fairshare value would also affect their overall job priority and in turn would affect when they are evaluated by the scheduler compared to other queued jobs.

Thanks,
Ben

Thanks Ben,

It's good to know this won't have any effect on running jobs. We will continue to test this.

That sounds good. I'll check in periodically, but let me know if any more questions come up around Fairshare.

Thanks,
Ben

Hi Louis,

How has the testing of fairshare been going? Let me know if you have any additional questions about this or if the ticket is ok to close.
Thanks,
Ben

Hello Ben,

The plan to implement this change has been dropped. Please go ahead and close it. Thank you.

Ok, feel free to let us know if your plans ever change.

Thanks,
Ben

The goal of this RFC is to evaluate the technical feasibility of using fairtree, which allows users to pool their karma (fairshare). Instead of deducting karma from individual users, it lets them run jobs from a shared account that spreads karma usage across all team members.

Proposed implementation for FAIR Accel:
- Enable both user-level and project-level accounts for each partition that exists for FAIR Accel.
- Karma (fairshare) values are assigned at both the user and project level, where the project-level karma value is the aggregate of the karma values of the individual team members.
- Users can specify which account to use when running jobs:
  - user account: karma is deducted from the user's own quota
  - project/team account: the karma deduction is spread evenly across all users on the team, which avoids the scenario where individual users are penalized

Proposed implementation for FAIR Labs:
- FAIR Labs have shorter life cycles than FAIR Accel.
- To support this working model, users must be able to spin up account (and consequently karma) sharing on demand.
- Proposed implementation: individuals in Labs have the following: their own account and coordinator/operator privileges.