| Summary: | Tasks on SMT Cores | | |
|---|---|---|---|
| Product: | Slurm | Reporter: | Ulf Markwardt <Ulf.markwardt> |
| Component: | slurmctld | Assignee: | Marcin Stolarek <cinek> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | | |
| Priority: | --- | CC: | cinek, jacob, jbooth |
| Version: | 19.05.2 | | |
| Hardware: | Linux | | |
| OS: | Linux | | |
| Site: | -Other- | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | | Version Fixed: | 20.02.2, 20.11pre0 |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | | |
| Attachments: | fix assignment of INFINITE16 to ntasks_per_node (v1), fix assignment of INFINITE16 to ntasks_per_node (v2) | | |
Description
Ulf Markwardt 2019-11-26 05:49:57 MST

Ulf, is this for a system that is supported by Atos? If so, could you have Atos submit this issue? Most of this might be fixable through configuration options. Due to contract limitations, this is the route we have to take.

Jacob

Created attachment 12433 [details]
fix assignment of INFINITE16 to ntasks_per_node (v1)

Ulf,

I can reproduce it. The issue comes from an inconsistent size used by sbatch/srun to internally handle --ntasks-per-core, which in the case of --hint=multithread defaults to "infinite".

Could you please apply the attached patch and verify that it eliminates the issue for you?

Alternatively, you can explicitly specify --ntasks-per-core (--ntasks-per-core=2 will be enough in this case) or override it in a job_submit plugin.

cheers,
Marcin
I am out of office until December 1, 2019. /* For support questions please contact hpcsupport@zih.tu-dresden.de . */ Kind regards, Ulf Markwardt

Comment on attachment 12439 [details]
fix assignment of INFINITE16 to ntasks_per_node (v2)

Ulf,
Were you able to apply and verify the patch from comment 4?
cheers,
Marcin

We have tested it today. The behavior is still the same :-(
Best,
Ulf

We have tested the patch: the situation is unchanged. With --ntasks-per-core=2, jobs are accepted.

Could you please double-check that Slurm was fully rebuilt and installed from a new build with the patch applied, and that you're using the new sbatch? If yes, please execute the following commands and share the full output with us:

```
ls -l $(which sbatch)
# gdb $(which sbatch)
(gdb) break proc_args.c:872
(gdb) run --hint='multithread' --wrap='sleep 100'
(gdb) n
(gdb) print *ntasks_per_core
```

cheers,
Marcin

Dear Slurm developers,

sorry for the long delay. We checked that Slurm is rebuilt with the patch. The patch fixes the issue only partly. We see the following behavior:

1. #SBATCH directive: if "#SBATCH --hint=multithread" is specified within a job file, the job is rejected with "sbatch: error: Batch job submission failed: Requested node configuration is not available".
2. Command-line argument: submitting the job via "sbatch --hint=multithread jobfile.sh" works and gives all cores (incl. SMT).
3. Env. variable SLURM_HINT: last, we experimented with the environment variable SLURM_HINT. While a submission using the combination "unset SLURM_HINT" and "#SBATCH --hint=multithread" is rejected, it works when the value is set explicitly via "export SLURM_HINT=multithread".

Best
Ulf

Hi Ulf,

> 1. #SBATCH Directive [...]
I'm trying to reproduce it with a script like the one from comment 0:

```
# cat /tmp/testHT
#!/bin/bash
#SBATCH --hint=multithread
#SBATCH -N 1
#SBATCH --tasks-per-node=128
srun echo hi
```

Using the unpatched sbatch:

```
# /mnt/slurm/bin/sbatch /tmp/testHT
sbatch: error: Batch job submission failed: Requested node configuration is not available
```

Using the patched sbatch:

```
# sbatch /tmp/testHT
Submitted batch job 122
# grep hi slurm-122.out | wc -l
128
```

Important slurm.conf parameters on my side:

```
# grep SelectTy /mnt/slurm/etc/slurm.conf | grep -v ^#
SelectType=select/cons_res
SelectTypeParameters=CR_ONE_TASK_PER_CORE,CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK
# grep NodeName= /mnt/slurm/etc/slurm.conf | grep -v ^#
NodeName=test02 NodeHostName=slurmctl CPUs=128 CoreSpecCount=0 Sockets=2 CoresPerSocket=32 ThreadsPerCore=2 State=UNKNOWN
```

Are our configuration and job script aligned (parameters and their order)? Do you have any job_submit or cli_filter plugins potentially affecting the job description?

> 2. Commandline Argument [...]

Just to be sure: this looks fine for you?

> 3. Env. Variable SLURM_HINT

It's probably something I didn't fully explain after your initial message. SLURM_* variables are output variables from the point of view of sbatch and salloc, so when a job is submitted with sbatch --hint=X, SLURM_HINT will be set in the job's environment. For srun they are input variables, so an srun inside a batch script will "inherit" --hint by default (unless the variable is explicitly unset before srun is executed). At the same time, all SLURM_* variables are exported to the job environment, so if you export SLURM_HINT, srun will pick it up even if sbatch was called without --hint; that is what is happening in this case.

When srun is called inside a job allocation, it creates a step in that allocation, so no option at this point can affect the selection of cores (i.e. slurmctld select plugin activity), but it can change TaskPlugin behavior, i.e. task affinity. In these terms --hint is a little special, since depending on the context it affects both, or only task affinity. For sbatch/salloc it works as follows: if neither --ntasks-per-core nor --threads-per-core is specified but --hint=multithread is used, it sets --ntasks-per-core=infinite and sets the SLURM_HINT output variable. This variable will then be interpreted by srun, resulting in --cpu-bind=threads and the removal of CR_ONE_TASK_PER_CORE.

Is it possible that you're mixing --hint with --threads-per-core/--ntasks-per-core in your job script from point 1?

cheers,
Marcin

Ulf,

The patch for this issue was merged [1] into the slurm-20.02 branch and will be part of the 20.02.2 release. I'm closing this now. Should you have any questions, please reopen.

cheers,
Marcin

[1] https://github.com/SchedMD/slurm/commit/e5d9b71bebbeea956997cebd01bf693a1b294b62