Hi SchedMD,

We need to replace the hardware our scheduler and DB run on. Is there any inherent reason it could not be installed on VMs? We can scale VMs with lots of CPU, memory, and fast disks these days and a VM is just a lot nicer to work with.

Thanks,
EWG
(In reply to Elijah Gagne from comment #0)
> We need to replace the hardware our scheduler and DB run on. Is there any
> inherent reason it could not be installed on VMs? We can scale VMs with lots
> of CPU, memory, and fast disks these days and a VM is just a lot nicer to
> work with.

slurmd, slurmctld, slurmdbd, and MySQL/MariaDB can run on VMs. It is up to the site to ensure that the VMs have sufficient CPUs/Memory/Network for their cluster's load and/or jobs.

The most common issue we see with VMs is a clock source that is not precise enough. It comes up often enough that `sdiag` now warns when it detects this. Depending on the VM vendor, most now provide guest kernel drivers with a high-precision clock source, but these usually have to be enabled manually.

Please also note that slurmctld, slurmdbd, and MySQL/MariaDB can be run inside containers if a site doesn't want to pay the performance penalty for virtualization.

Do you have any more questions?
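For anyone wanting to check the clock source on a Linux guest before moving slurmctld or slurmdbd onto it, here is a minimal sketch that reads what the kernel reports under sysfs. The `PREFERRED` set and the `describe` helper are assumptions made for illustration, not a SchedMD recommendation, and this is not how `sdiag` performs its own check (the thread does not say how it does). Consult the hypervisor vendor's documentation for the clock source they actually recommend.

```python
from pathlib import Path

# Standard sysfs location where Linux exposes clock source information.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")

# Assumption: an illustrative list of clock sources that are commonly
# regarded as high precision. Adjust it to your hypervisor vendor's guidance.
PREFERRED = {"tsc", "kvm-clock", "hyperv_clocksource_tsc_page"}


def read_clocksource(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clock sources as reported by the kernel."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def describe(current, available):
    """Summarize whether the active clock source is on the PREFERRED list."""
    if current in PREFERRED:
        return f"clock source '{current}' is on the preferred list"
    alternatives = sorted(PREFERRED.intersection(available))
    if alternatives:
        return (f"clock source '{current}' is not on the preferred list; "
                f"available alternatives: {', '.join(alternatives)}")
    return (f"clock source '{current}' is not on the preferred list "
            f"and no preferred alternative is available")


if __name__ == "__main__":
    if SYSFS_CLOCKSOURCE.is_dir():
        print(describe(*read_clocksource()))
    else:
        print("no clocksource information found under sysfs")
```

If the report lists a preferred alternative that is not active, the guest driver for it is probably present but not enabled, which matches the "usually have to be enabled manually" caveat above.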
I wanted to emphasize this part:
> It is up to the site to ensure that the VMs have sufficient CPUs/Memory/Network for their cluster's load and/or jobs.

Many sites have ended up pulling Slurm off their VMs because the VMs were too slow or had other performance issues. While these issues are not specific to Slurm, they have an unfortunate tendency to make Slurm look slow. Most VMs have a performance penalty of 10%, so the hardware needs to be that much faster to meet our suggestions here (slides 18-20):
> https://slurm.schedmd.com/SLUG21/Field_Notes_5.pdf

If a VM is used, the CPUs and memory must be pinned for Slurm's VMs.
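To put a number on "that much faster", here is a rough sizing sketch. It assumes the 10% penalty applies uniformly to whatever performance figure you are sizing against; the function name and the score of 100 are hypothetical, and real overhead varies by workload and hypervisor.

```python
def required_host_performance(target, vm_penalty=0.10):
    """Performance the underlying hardware needs so that a VM losing
    `vm_penalty` of it still delivers `target`.

    Assumption: the penalty is a uniform fraction of host performance.
    """
    if not 0 <= vm_penalty < 1:
        raise ValueError("vm_penalty must be in [0, 1)")
    return target / (1.0 - vm_penalty)


if __name__ == "__main__":
    # Hypothetical target score of 100 on some benchmark of your choosing.
    needed = required_host_performance(100.0)
    print(f"host must deliver about {needed:.1f} to net 100.0 inside the VM")
```

Under that assumption, a 10% penalty means the host has to be about 11% faster than the target rather than exactly 10%, since the penalty is taken from the host's performance.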
Thanks. I think we're good to close this out. -EWG