| Summary: | Dynamic bf_interval feature | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | CSC sysadmins <csc-slurm-tickets> |
| Component: | Scheduling | Assignee: | Felip Moll <felip.moll> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | dmjacobsen, felip.moll |
| Version: | 16.05.6 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CSC - IT Center for Science | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |

Description
CSC sysadmins
2016-11-03 02:28:42 MDT

(In reply to Tommi Tervo from comment #0)
> Hi,
>
> I really hate that every now and then I need to manually tune backfill
> parameters depending on the queue depth. Slurm knows internally (sdiag) how
> long the backfill loop generally takes, and it could adjust the interval
> shorter or longer depending on the load.
> Something like bf_dynamic_interval_window=20-1200

If I understand you correctly, you'd like the backfill scheduler to always run a complete pass, and then immediately start again once that pass has completed?

Yes, backfill could run continuously.

I'm marking this as a possible enhancement for 17.11: add a flag to keep the backfill scheduler running continually.

See bug 3808 for a possible (user-contributed) implementation that is similar to the functionality discussed here.

Hi Tommi,

This enhancement has been sitting here for a long time. What Doug suggested in bug 3808 would solve the problem described here. The idea is to set:

bf_interval = Time between backfill cycles
bf_max_time = Max time spent in the backfill cycle, including sleeps

Setting a large bf_max_time will help the backfill scheduler get through the queue. To avoid responsiveness problems, setting max_rpc_cnt is advised.

Once you confirm that you have no objection, I will close this bug.

Hi,

I'll close this bug.
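
For readers who want to try the suggestion above, here is a minimal sketch of how the three parameters could be combined in `slurm.conf`. The numeric values are illustrative assumptions and do not come from this ticket. They would need tuning against the backfill cycle times that `sdiag` reports on the cluster in question.

```
# slurm.conf fragment. All values below are assumed for illustration only.
SchedulerType=sched/backfill
# bf_interval: seconds between backfill cycles
# bf_max_time: maximum seconds spent in one backfill cycle, including sleeps
# max_rpc_cnt: scheduling is deferred while this many slurmctld threads are
#              active, which helps keep the controller responsive
SchedulerParameters=bf_interval=30,bf_max_time=600,max_rpc_cnt=150
```

After changing these settings, comparing the backfill statistics in `sdiag` output before and after is one way to check whether the scheduler now gets through the queue within a cycle.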