| Summary: | #SBATCH --hint=nomultithread appears to break "#SBATCH --ntasks-per-node" in Slurm 23.02.1 | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Chris Samuel (NERSC) <csamuel> |
| Component: | User Commands | Assignee: | Marshall Garey <marshall> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | dmjacobsen |
| Version: | 23.02.1 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=10620 | ||
| Site: | NERSC | Slinky Site: | --- |
| CLE Version: | | Version Fixed: | 23.02.3, 23.11.0rc1 |

Hi Chris,

I can reproduce this. It is only a problem with sbatch.

For a workaround, you can set `--ntasks` in the cli_filter or job_submit plugins:

    ntasks = ntasks-per-node * nnodes

This is fixed in commit c84e7dc2f1 ahead of 23.02.3. I'm closing this as fixed. Let me know if you have any more questions.
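
The arithmetic behind the workaround can be sketched in shell, using the values from the reported test case (32 tasks per node, 6 nodes). This is only an illustration: `compute_ntasks` and `job.sh` are hypothetical names, and passing `--ntasks` on the sbatch command line is an assumption, since the reply only mentions setting it from the cli_filter or job_submit plugins.

```shell
#!/bin/bash
# Sketch only: compute the total task count the workaround would set.
# compute_ntasks is a hypothetical helper, not part of Slurm.
compute_ntasks() {
    local ntasks_per_node=$1
    local nnodes=$2
    echo $(( ntasks_per_node * nnodes ))
}

ntasks=$(compute_ntasks 32 6)

# Print the submission command instead of running it, so the sketch
# works on a machine without Slurm. job.sh is a placeholder name.
echo "sbatch --ntasks=${ntasks} job.sh"
```

For the test case this yields a total of 192 tasks across the 6 nodes.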

Hi there,

A user found that using `--hint=nomultithread` as an #SBATCH directive only gives them 1 task per node with Slurm 23.02.1 (I reduced it to a test case, as the original was more complicated):

    #!/bin/bash
    #SBATCH --ntasks-per-node=32
    #SBATCH -c 1
    #SBATCH -t 30
    #SBATCH -C cpu
    #SBATCH -N 6
    #SBATCH --hint=nomultithread
    srun hostname | sort | uniq -c

On our Shasta 23.02.1 systems it gives:

    1 nid001012
    1 nid001013
    1 nid001014
    1 nid001015
    1 nid001017
    1 nid001018

But that same script run on our XC test system with 22.05.8 gives the expected:

    32 nid00056
    32 nid00057
    32 nid00058
    32 nid00059
    32 nid00060
    32 nid00061

I tested, and using `--hint=compute_bound` or `--hint=memory_bound` has the same outcome, but `--hint=multithread` works as desired with Slurm 23.02.1:

    32 nid001012
    32 nid001013
    32 nid001014
    32 nid001015
    32 nid001017
    32 nid001018

All the best,
Chris
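
The comparison in the report is done by eye from the `uniq -c` counts. A minimal sketch of automating that check might look like the following, assuming the input is in the `<count> <hostname>` form shown above; `check_tasks_per_node` is a hypothetical helper, not a Slurm command.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every node ran the expected
# number of tasks. Reads `uniq -c` style lines ("<count> <hostname>")
# on standard input and reports each node whose count differs.
check_tasks_per_node() {
    local expected=$1
    awk -v want="$expected" '
        NF == 0 { next }                      # skip blank lines
        $1 != want {
            printf "%s: got %s tasks, expected %s\n", $2, $1, want
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Inside a job script this might be used as:
#   srun hostname | sort | uniq -c | check_tasks_per_node 32
```

With the 23.02.1 output from the report, the function would list each node and return a non-zero status; with the 22.05.8 output it would print nothing and return zero.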