| Summary: | Slurm 20.11.4 support for Forge 19 | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Tony Racho <antonio-ii.racho> |
| Component: | Other | Assignee: | Marcin Stolarek <cinek> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | antonio-ii.racho |
| Version: | 20.11.4 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CRAY | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | NIWA/WELLINGTON |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Tony Racho
2021-05-16 17:58:35 MDT
Tony,
Before we jump into more detailed debugging, could you please check whether adding
>export SLURM_OVERLAP=1
before running the other commands changes the behavior?
cheers,
Marcin
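
The suggestion above works through environment inheritance: `srun` reads `SLURM_OVERLAP` as the environment equivalent of its `--overlap` option, so exporting it once in the job script covers any `srun` launched further down, including one started indirectly by another tool. A minimal sketch, assuming a bash job script; the commented `ddt` line is a hypothetical placeholder, since the original commands are not shown in this report:

```shell
#!/bin/bash
# Export before anything that may start a job step, so every child
# process (including an srun started by another tool) inherits it.
export SLURM_OVERLAP=1

# Show that a child process sees the variable.
sh -c 'echo "SLURM_OVERLAP=${SLURM_OVERLAP:-unset}"'

# Hypothetical placeholder for the real commands from the job script:
# ddt srun ./my_app
```

Setting the variable without `export` would not be enough, because it would stay local to the script's own shell.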

Marcin:
That worked.
Thanks,
Tony

Tony,
I'm guessing that the issue was that `ddt` (probably) calls srun behind the scenes, and one of the major changes in Slurm 20.11 was that we don't overlap step resources by default. If you can verify how ddt works (one option is to use strace and check whether srun is executed), this will help us understand the case.
Another thing to check is whether you need srun in the `ddt` arguments at all; maybe just `ddt ls` will work if srun is already called by ddt.
You may want to check Bug 11341 for more details. From the 20.11 change perspective this may be called a duplicate, but I'd like to make sure that we fully understand what's happening in your case so that we have the best long-term solution for you. Exporting SLURM_OVERLAP is something I'd call a workaround for now.
Let me know your thoughts.
cheers,
Marcin

Will check this out.
Thanks,
Tony

Tony,
Were you able to check the details?
cheers,
Marcin

Hi Marcin:
Apologies. I was distracted by other things lately. I will find that out and update.
Cheers,
Tony

Tony,
Were you able to get back to the case?
cheers,
Marcin

Tony,
Let me know if you want to continue working on this. If there is no reply, I'll close the bug as info given.
cheers,
Marcin

Hi Marcin:
Apologies. We're a bit busy at the site at the moment due to hardware expansions. Please close the ticket.
Much appreciated.
Cheers,
Tony
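
The strace check suggested in this thread was never reported back, but it could be sketched roughly as follows. This is only an illustration: `trace_shows_srun` and the file name `ddt.trace` are hypothetical names, and the pattern assumes strace's usual `execve("/path/to/binary", ...)` log format.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) if the strace log given
# as $1 records an execve of a binary named srun.
trace_shows_srun() {
    grep -Eq 'execve\("([^"]*/)?srun"' "$1"
}

# Intended use on a node where strace and ddt are available:
#   strace -f -e trace=execve -o ddt.trace ddt ls
#   trace_shows_srun ddt.trace && echo "ddt launched srun"
```

Here `-f` follows child processes, `-e trace=execve` limits the log to program launches, and `-o` writes the log to a file. A match would support the guess that `ddt` starts its own `srun` step.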