Do you recommend we have HyperThreading enabled or disabled on the SlurmCtld (and SlurmDBD) hosts?

We purchased some new hardware. Tim gave us some helpful guidance on the type of hardware to purchase with regard to the CPU (below), but we are curious whether it's best to have HyperThreading enabled or disabled on servers dedicated to running slurmctld and slurmdbd (~one host per service).

Tim's insight:

CPUs tend to govern responsiveness... but I don't recommend the most expensive ones. Fewer cores but a higher clock rate is vastly preferable - Slurm will generally not use more than ~4 cores in most cases, as while the internals of slurmctld are highly threaded, internal data structure locking limits the concurrency to only a few active threads at once. Thus, a high clock rate / high IPS processor is greatly preferred. And those are usually less expensive than the lower-clocked parts with tons of cores.

---

We did buy some new boxes for Jet and are putting them into production next week with the following processor. HyperThreading is currently enabled:

lscpu | grep -i -E "^CPU\(s\):|core|socket|Model name"
CPU(s): 16
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 2
Model name: Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz

Thanks,
Shawn
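For readers who want to check the same thing on their own hosts, the lscpu figures above can be read as logical CPUs = sockets x cores per socket x threads per core, with more than one thread per core indicating that HyperThreading (SMT) is enabled. The following is a minimal sketch of that arithmetic, not something from the original report: the function names `parse_lscpu` and `summarize` are hypothetical, and the sample text is simply the output quoted in this comment.

```python
# Sample taken from the lscpu output quoted in this ticket.
SAMPLE = """\
CPU(s): 16
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 2
Model name: Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz
"""


def parse_lscpu(text):
    """Turn 'Key: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Derive physical core count, logical CPU count, and SMT status."""
    threads_per_core = int(fields["Thread(s) per core"])
    cores_per_socket = int(fields["Core(s) per socket"])
    sockets = int(fields["Socket(s)"])
    physical_cores = sockets * cores_per_socket
    return {
        "physical_cores": physical_cores,
        "logical_cpus": physical_cores * threads_per_core,
        "smt_enabled": threads_per_core > 1,
    }


if __name__ == "__main__":
    print(summarize(parse_lscpu(SAMPLE)))
```

For the host described here, this gives 8 physical cores presented as 16 logical CPUs, which matches the reported `CPU(s): 16`.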
We haven't done any specific profiling for hyperthreaded vs non-hyperthreaded. Perhaps non-hyperthreaded will be a little faster or they will be the same depending on how the OS decides to schedule threads. But that's just guessing. I have a box with hyperthreading and I've done testing with and without hyperthreading enabled. I don't have a real production workload to test out, but I haven't noticed a difference myself.
For what it's worth, my colleague Broderick did a presentation at SLUG 2019 [1] where he demonstrated high-throughput computing. His slurmctld had hyperthreading turned on, and he was able to demonstrate very impressive performance. He didn't test anything about hyperthreading specifically, and I don't know if hyperthreading was helping, hindering, or neutral in those tests. But this shows you can get very good performance with hyperthreading enabled. So I don't think you'll need to worry about it.

[1] https://slurm.schedmd.com/SLUG19/High_Throughput_Computing.pdf
Closing as infogiven. If you have other questions, feel free to re-open the bug.
Hi Marshall,

Appreciate the info. We did go into production with these servers with Hyper-Threading enabled after a Downtime on 12.10.

Thanks,
Shawn