Created attachment 27173 [details]
slurm.conf

Hi SchedMD,

We have a cluster that is a mix of cpu-only nodes and gpu nodes. We run with pack_serial_at_end and it seems to work fine for the cpu-only nodes, but cpu-only jobs submitted to the gpu nodes spread out instead of packing. I tested this by submitting 8 jobs in quick succession to a gpu partition; the jobs only sleep and print the date:

[renata@sdf-login03 mpi]$ squeue -u renata
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
5875326  neutrino run.n.sh   renata PD       0:00      1 (Resources)
5875319  neutrino run.n.sh   renata  R       0:11      1 tur026
5875320  neutrino run.n.sh   renata  R       0:11      1 tur026
5875321  neutrino run.n.sh   renata  R       0:11      1 tur026
5875322  neutrino run.n.sh   renata  R       0:11      1 tur025
5875323  neutrino run.n.sh   renata  R       0:11      1 tur022
5875324  neutrino run.n.sh   renata  R       0:11      1 tur022
5875325  neutrino run.n.sh   renata  R       0:11      1 tur022

[renata@sdf-login03 mpi]$ squeue -w tur022 -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R %c %m"
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON) MIN_CPUS MIN_MEMORY
5875323  neutrino run.n.sh   renata  R       0:32      1 tur022 1 4000M
5875324  neutrino run.n.sh   renata  R       0:32      1 tur022 1 4000M
5875325  neutrino run.n.sh   renata  R       0:32      1 tur022 1 4000M
5874771  neutrino reviewED zhulcher  R    1:29:04      1 tur022 1 20G
5874461  neutrino reviewED zhulcher  R    2:25:24      1 tur022 1 20G
5874291  neutrino reviewED zhulcher  R    2:54:34      1 tur022 1 20G
5873579  neutrino reviewED zhulcher  R    5:06:26      1 tur022 1 20G

[renata@sdf-login03 mpi]$ squeue -w tur025 -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R %c %m"
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON) MIN_CPUS MIN_MEMORY
5861593    cryoem sys/dash rkretsch  R 1-05:43:57      1 tur025 44 125G
5875322  neutrino run.n.sh   renata  R       0:40      1 tur025 1 4000M
5873866  neutrino reviewED zhulcher  R    4:11:10      1 tur025 1 20G

[renata@sdf-login03 mpi]$ squeue -w tur026 -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R %c %m"
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON) MIN_CPUS MIN_MEMORY
5875319  neutrino run.n.sh   renata  R       0:48      1 tur026 1 4000M
5875320  neutrino run.n.sh   renata  R       0:48      1 tur026 1 4000M
5875321  neutrino run.n.sh   renata  R       0:48      1 tur026 1 4000M
5874912  neutrino reviewED zhulcher  R    1:04:11      1 tur026 1 20G
5874737  neutrino reviewED zhulcher  R    1:37:23      1 tur026 1 20G
5874681  neutrino reviewED zhulcher  R    1:45:25      1 tur026 1 20G
5875256  neutrino reviewED zhulcher  R       9:51      1 tur026 1 20G

The tur nodes look like this:

NodeName=tur[000-026] CPUs=48 RealMemory=191552 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=gpu:geforce_rtx_2080_ti:10 Features=CPU_GEN:SKX,CPU_SKU:5118,CPU_FRQ:2.30GHz,GPU_GEN:TUR,GPU_SKU:RTX2080TI,GPU_MEM:11GB,GPU_CC:7.5 Weight=56117 State=UNKNOWN

I have included our slurm.conf.

Thanks,
Renata
Renata, Can I get your gres.conf for the node in question as well? -Scott
Hi Scott,

This is the gres.conf:

################################################################################
## slurm gres conf
################################################################################
#AutoDetect=nvml

###
# hep nodes
###
#NodeName=hep-gpu01 Name=gpu Type=geforce_gtx_1080_ti Count=8 File=/dev/nvidia[0,2-4,6-9]
#NodeName=hep-gpu01 Name=gpu Type=titan_xp Count=2 File=/dev/nvidia[1,5]

###
# pascal 1080ti nodes
###
NodeName=psc[000-009] Name=gpu Type=geforce_gtx_1080_ti Count=10 File=/dev/nvidia[0-9]

###
# turing 2080 nodes
###
NodeName=tur[000-026] Name=gpu Type=geforce_rtx_2080_ti Count=10 File=/dev/nvidia[0-9]

###
# volta v100 nodes
###
NodeName=volt[000-005] Name=gpu Type=v100 Count=4 File=/dev/nvidia[0-3]

###
# ampere a100 nodes
###
NodeName=ampt[000-020] Name=gpu Type=a100 Count=4 File=/dev/nvidia[0-3]

Renata
Hi Scott, just wondering if there is any update on this issue? Thanks, Renata
Renata,

Sorry for the delay, I still need more information to understand what is going on.

Please send the command and script you used to launch gpu jobs.

Also send the output of this command:
>sacct -po jobid,nodelist,reqtres,alloctres -j 5875319,5875320,5875321,5875322,5875323,5875324,5875325,5875326,5874912,5874737,5874681,5875256,5861593,5873866

-Scott
Hi Scott,

Here is the output of the command requested:

[renata@sdf-login03 bin]$ sacct -po jobid,nodelist,reqtres,alloctres -j 5875319,5875320,5875321,5875322,5875323,5875324,5875325,5875326,5874912,5874737,5874681,5875256,5861593,5873866
JobID|NodeList|ReqTRES|AllocTRES|
5861593|tur025|billing=44,cpu=44,gres/gpu:geforce_rtx_2080_ti=8,gres/gpu=8,mem=125G,node=1|billing=44,cpu=44,gres/gpu:geforce_rtx_2080_ti=8,gres/gpu=8,mem=125G,node=1|
5861593.batch|tur025||cpu=44,gres/gpu:geforce_rtx_2080_ti=8,gres/gpu=8,mem=125G,node=1|
5861593.extern|tur025||billing=44,cpu=44,gres/gpu:geforce_rtx_2080_ti=8,gres/gpu=8,mem=125G,node=1|
5873866|tur025|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|
5873866.batch|tur025||cpu=2,mem=40G,node=1|
5873866.extern|tur025||billing=2,cpu=2,mem=40G,node=1|
5874681|tur026|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|
5874681.batch|tur026||cpu=2,mem=40G,node=1|
5874681.extern|tur026||billing=2,cpu=2,mem=40G,node=1|
5874737|tur026|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|
5874737.batch|tur026||cpu=2,mem=40G,node=1|
5874737.extern|tur026||billing=2,cpu=2,mem=40G,node=1|
5874912|tur026|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|
5874912.batch|tur026||cpu=2,mem=40G,node=1|
5874912.extern|tur026||billing=2,cpu=2,mem=40G,node=1|
5875256|tur026|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|
5875256.batch|tur026||cpu=2,mem=40G,node=1|
5875256.extern|tur026||billing=2,cpu=2,mem=40G,node=1|
5875319|tur026|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875319.batch|tur026||cpu=2,mem=8000M,node=1|
5875319.extern|tur026||billing=2,cpu=2,mem=8000M,node=1|
5875320|tur026|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875320.batch|tur026||cpu=2,mem=8000M,node=1|
5875320.extern|tur026||billing=2,cpu=2,mem=8000M,node=1|
5875321|tur026|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875321.batch|tur026||cpu=2,mem=8000M,node=1|
5875321.extern|tur026||billing=2,cpu=2,mem=8000M,node=1|
5875322|tur025|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875322.batch|tur025||cpu=2,mem=8000M,node=1|
5875322.extern|tur025||billing=2,cpu=2,mem=8000M,node=1|
5875323|tur022|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875323.batch|tur022||cpu=2,mem=8000M,node=1|
5875323.extern|tur022||billing=2,cpu=2,mem=8000M,node=1|
5875324|tur022|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875324.batch|tur022||cpu=2,mem=8000M,node=1|
5875324.extern|tur022||billing=2,cpu=2,mem=8000M,node=1|
5875325|tur022|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875325.batch|tur022||cpu=2,mem=8000M,node=1|
5875325.extern|tur022||billing=2,cpu=2,mem=8000M,node=1|
5875326|tur026|billing=1,cpu=1,mem=4000M,node=1|billing=2,cpu=2,mem=8000M,node=1|
5875326.batch|tur026||cpu=2,mem=8000M,node=1|
5875326.extern|tur026||billing=2,cpu=2,mem=8000M,node=1|

I'll have to investigate what command/script may have been used.

Renata
Hi Scott,

Actually I was submitting non-gpu jobs to the neutrino partition as a test, because zhulcher was doing the same and it was noticed that his jobs were spreading rather than packing.

My job is simple:

#!/bin/sh
#SBATCH --partition=neutrino
#SBATCH --ntasks-per-node=1
#
sleep 60
date

Thanks,
Renata
Hi again,

Just to be clear, the neutrino partition only has gpu hosts, but those hosts also have cpus available, so some users like zhulcher send cpu-only jobs there.

Renata
Renata,

It looks like tur026 ran out of memory according to AllocTRES (4 jobs using 40GB and 3 jobs using 8GB).

So the algorithm started scheduling on tur025 until it ran out of cpus. Because 5861593 was using 44/48 cpus, there was only room for two single-core jobs.

I would presume tur024 and tur023 were also busy, because the rest were scheduled on tur022.

This seems to be behaving properly to me. pack_serial_at_end means that slurm will start at the end of the node list and work its way down. That is tur026 in this case.

Let me know if you have any questions or if I am misunderstanding the issue.

-Scott
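(For reference, the packing behavior discussed above is controlled by the SchedulerParameters line in slurm.conf. A minimal sketch of the relevant option, assuming the attached slurm.conf may carry other parameters alongside it, plus a quick way to confirm what the running controller is using:

SchedulerParameters=pack_serial_at_end

>scontrol show config | grep SchedulerParameters
)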
Hi Scott,

Thanks for the pointer to check both ReqTRES and AllocTRES. I didn't realize that hyperthreading appears to be turned on on the gpu nodes. It looks like packing is working just fine once I check AllocTRES against what I thought I submitted. I'll pass this back to the admin who manages the gpu nodes.

Thanks,
Renata
Hi again Scott,

What is the way to request one core and 20GB memory in this situation?

Renata
Hi Scott,

In case it is needed:

[renata@sdf-login03 mpi]$ sinfo -p neutrino -o %z
S:C:T
2:8+:2

Renata
Renata,

This should do it:
>srun --mem=20G hostname

Because you use CR_Core_Memory you will always get allocations of whole cores. This would only give you 1 core as well, because each core has 2 threads or "cpus":
>srun --mem=20G -n2 hostname

If you want 1 core per task you could use -c2:
>srun --mem=20G -n2 -c2 hostname

If you want 20G per core you could do this:
>srun --mem-per-cpu=10G hostname

-Scott
Hi Scott,

It looks like I cannot get just 1 cpu allocated:

[renata@sdf-login03 mpi]$ cat testcpu.sh
#!/bin/sh
#SBATCH --partition=neutrino
#SBATCH --mem=4G
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#
sleep 60
date

[renata@sdf-login03 mpi]$ sbatch testcpu.sh
Submitted batch job 5949412

[renata@sdf-login03 mpi]$ sacct -Xpo jobid,user,nodelist,reqtres,alloctres -j 5949412
JobID|User|NodeList|ReqTRES|AllocTRES|
5949412|renata|tur026|billing=1,cpu=1,mem=4G,node=1|billing=2,cpu=2,mem=4G,node=1|

Since we have DefMemPerCpu=4000, it looks like specifying --mem=4G does restrict the memory. Without specifying --mem=4G, it wants to allocate 8000M, I guess because it cannot give me just 1 cpu?

Renata
Renata,

You cannot allocate a single cpu because you use CR_Core_Memory. You will always get allocations of whole cores, which means 2, 4, 6, etc. cpus. In this case, "cpu" means thread; each core has 2 threads due to hyperthreading.

Yes, since you get 2 cpus by default, you will by default get 8000 MB when DefMemPerCpu=4000.

-Scott
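(To make the arithmetic concrete, a worked sketch using the tur node definition and DefMemPerCpu=4000 from the attached slurm.conf, assuming no other memory options are set on the job:

  request:   --ntasks=1 --cpus-per-task=1
  rounding:  CR_Core_Memory allocates whole cores, so 1 cpu rounds up to 1 core = 2 cpus (threads)
  memory:    2 cpus x DefMemPerCpu (4000M) = 8000M by default

which matches the AllocTRES values of cpu=2,mem=8000M in the sacct output earlier in this ticket.)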
Hi Scott,

If we switched to

SelectTypeParameters=CR_ONE_TASK_PER_CORE

keeping

SelectType=select/cons_tres
DefMemPerCpu=4000

would that then provide 1 cpu for jobs submitted to the hyperthreaded hosts that request 1 cpu? And if so, would gpu jobs see a change in their default cpu allocation, or jobs submitted to cpu-only hosts?

Thanks,
Renata
Renata,

No, CR_ONE_TASK_PER_CORE means that each task will request a core. With hyperthreading that means that a job with 4 tasks will get 4 cores, which is 8 cpus/threads.

Here is an example from my test cluster:

>$ srun -n4 hostname
>$ sacct --start=now-20minutes -o jobid,nodelist,reqtres%40,alloctres%40
>JobID        NodeList        ReqTRES                                  AllocTRES
>------------ --------------- ---------------------------------------- ----------------------------------------
>1957         node0           billing=4,cpu=4,mem=400M,node=1          billing=8,cpu=8,mem=800M,node=1
>1957.0       node0                                                    cpu=8,mem=800M,node=1

Also, the proper config for CR_ONE_TASK_PER_CORE is this:
>SelectTypeParameters=CR_Core_Memory,CR_ONE_TASK_PER_CORE

-Scott
Hi Scott,

Thanks for that clarification. Is there any change I could make so that a user could be allocated just one core on the hyperthreaded systems? I guess I could change the entry for the NodeName in slurm.conf from ThreadsPerCore=2 to ThreadsPerCore=1, but is there any other way? Just want to make sure I understand all of our options.

Thanks,
Renata
Hi Scott,

I am a bit confused about this. Here is the user's sbatch script:

#SBATCH --job-name=reviewEDEP
#SBATCH --output=output/output-%j.txt
#SBATCH --error=error/error-%j.txt
#SBATCH --partition=neutrino
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=20g
#SBATCH --time=3:00:00
#SBATCH --exclude=ampt010

He has a few cpu jobs running now on different types of gpu hosts, all of which are hyperthreaded. This shows that hyperthreading is turned on for an ampt and a tur node:

[renata@sdf-login03 test]$ ssh ampt020 cat /sys/devices/system/cpu/smt/active
1
[renata@sdf-login03 test]$ ssh tur024 cat /sys/devices/system/cpu/smt/active
1

These are the entries in slurm.conf for them:

NodeName=ampt[000-020] CPUs=128 RealMemory=1029344 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2 Gres=gpu:a100:4
NodeName=tur[000-026] CPUs=48 RealMemory=191552 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=gpu:geforce_rtx_2080_ti:10

But the job running on the ampt node registers an allocation of just 1 cpu and 20G of memory:

[renata@sdf-login03 test]$ sacct -Xpo jobid,user,nodelist,reqtres,alloctres -j 5956961
JobID|User|NodeList|ReqTRES|AllocTRES|
5956961|zhulcher|ampt020|billing=1,cpu=1,mem=20G,node=1|billing=1,cpu=1,mem=20G,node=1|

while the one on the tur node shows the double allocation:

[renata@sdf-login03 test]$ sacct -Xpo jobid,user,nodelist,reqtres,alloctres -j 5956762
JobID|User|NodeList|ReqTRES|AllocTRES|
5956762|zhulcher|tur024|billing=1,cpu=1,mem=20G,node=1|billing=2,cpu=2,mem=40G,node=1|

Renata
(In reply to Renata Dart from comment #18)
> But the job running on the ampt node registers an allocation of just
> 1 cpu and 20G of memory:
>
> ...
>
> while the one on the tur node shows the double allocation:

The ampt nodes have a "CPU" count equal to the core count: 2*64=128. (Sockets * CoresPerSocket = CPUs)
>NodeName=ampt[000-020] CPUs=128 RealMemory=1029344 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2 ...

The tur nodes have a "CPU" count equal to the thread count: 2*12*2=48. (Sockets * CoresPerSocket * ThreadsPerCore = CPUs)
>NodeName=tur[000-026] CPUs=48 RealMemory=191552 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 ...

Slurm supports both situations. In the first a "CPU" is a core; in the second a "CPU" is a thread. See the documentation:
https://slurm.schedmd.com/slurm.conf.html#OPT_CPUs

-Scott
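(For illustration only, a sketch of what the tur definition would look like if a "CPU" on those nodes were meant to count cores, mirroring the ampt pattern above; this is an assumption about intent, not a recommendation, and the hardware values are copied unchanged from the existing slurm.conf:

NodeName=tur[000-026] CPUs=24 RealMemory=191552 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=gpu:geforce_rtx_2080_ti:10 ...

With CPUs set to Sockets * CoresPerSocket (2*12=24) instead of the thread count (48), Slurm would treat each "CPU" as a core on the tur nodes, as it already does on the ampt nodes.)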
Renata,

What is your use case for this setup?

Are you saying you want jobs to be allowed to be allocated just 1 thread on a core instead of both, i.e. allocating half a core at a time?

Do your users want to run multithreaded jobs, non-multithreaded jobs, or choose with each job?

Do you want the "CPU" count to match the core count for accounting purposes?

-Scott
Hi Scott,

Aha, I see. Let's say that the admin who defined the gpu host entries in slurm.conf made a mistake and really wanted the tur nodes to be set up like the ampt nodes (I don't know if that is the case but want to be prepared in case that is what happened). Would changing the tur nodes to be CoresPerSocket=24 have an impact on the gpu job submissions? That is, would the gpu users see any difference in the way their jobs were allocated resources?

And in order to change the tur node entry in slurm.conf, could I just make that change and then scontrol reconfig? Or would I need to restart all of the slurmds?

Renata
Hi Scott,

All fair questions, to which I don't know the answers. I'm giving feedback to the gpu admin now and will see if he needs any further help in understanding how to configure the entries. If the decision is to change the turs to be like the ampt hosts, will that require more than a restart of slurmctld and an scontrol reconfig?

Renata
(In reply to Renata Dart from comment #22)
> If the decision is to change the turs to be like the ampt hosts, will that
> require more than a restart of slurmctld and an scontrol reconfig?

I think that should work.
Renata, I am closing this ticket as info given. If you have questions about this specific issue feel free to reopen this ticket. -Scott