| Summary: | Root user start job from queue bypassing accounts and queue limits | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Jay McGlothlin <mcglow2> |
| Component: | Limits | Assignee: | Director of Support <support> |
| Status: | RESOLVED TIMEDOUT | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | - Unsupported Older Versions | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | RPI/CCNI - Rensselaer Polytechnic Institute | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
|
Description
Jay McGlothlin
2023-01-05 08:40:19 MST
Jay,

Specifically for what you want, you need to override the limit at the user level. Once the limit is changed, the scheduler will re-evaluate the jobs against the new limit and allow more jobs to run. This takes several seconds to happen after the limit is changed.

Example:

    sacctmgr modify user where name=<username> set maxjobs=<more jobs>

Then, once more jobs are scheduled, change it back:

    sacctmgr modify user where name=<username> set maxjobs=<old amount>

There may be other limits that you could mix and match instead, to avoid needing a manual override like this while still preventing total cluster usage:

https://slurm.schedmd.com/resource_limits.html

But the manual override I suggested will allow more jobs to run for that user until you change it back, like you requested. When I raised the limit and saw in squeue that more jobs got scheduled, I immediately lowered the limit again (so I didn't have to wait for all of the jobs to finish before lowering it). I saw no issues handling it like this.

Does this answer your question?

Caden

Do you have an update for me on this?

Feel free to open this back up if you have further questions.

Caden
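For anyone repeating the temporary override described in this ticket, the raise-then-restore sequence could be wrapped in a small shell helper along these lines. This is only a sketch under stated assumptions: `set_maxjobs` and `run_cmd` are hypothetical helper names, the user name and limit values are placeholders, and `DRY_RUN` is a convention of this sketch (not a Slurm setting) that prints the commands instead of running them. The `-i` (`--immediate`) option makes `sacctmgr` commit without the interactive confirmation prompt.

```shell
#!/bin/sh
# Sketch: temporarily raise a user's MaxJobs limit, then restore it.
# DRY_RUN=1 (the default in this sketch) only prints each command.
DRY_RUN="${DRY_RUN:-1}"

# run_cmd: hypothetical wrapper that prints or executes a command.
run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# set_maxjobs USER LIMIT: hypothetical helper around the sacctmgr
# command quoted in the ticket.
set_maxjobs() {
    user="$1"
    limit="$2"
    run_cmd sacctmgr -i modify user where name="$user" set maxjobs="$limit"
}

# Intended sequence (placeholder user and values):
#   set_maxjobs alice 20          # raise the limit
#   squeue -u alice -t RUNNING    # wait until more jobs have started
#   set_maxjobs alice 10          # restore the previous limit
```

Note the user's current limit before raising it, so the restore step puts back the right value. Run with `DRY_RUN=0` only after checking the printed commands.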