The GPU nodes (Tesla M2090) have two GPUs each. However, after one GPU is requested, a second job requesting the other GPU will not run on that node, even though the node still has enough processors and one free GPU.

Example:

# Request a GPU from gpu012t:
srun --gres=gpu:1 -c 2 -N 1 -p gpufermi --nodelist=gpu012t --pty /bin/bash
[sxg125@gpu012t ~]$

# Now, request another GPU from the same node:
srun --gres=gpu:1 -c 2 -N 1 -p gpufermi --nodelist=gpu012t --pty /bin/bash
srun: job 122118 queued and waiting for resources

The job waits for resources even though the node has them available.

I would appreciate your help in resolving this issue.

Thank you,
-Sanjaya (email: sxg125@case.edu)
Can you please attach your slurm.conf and gres.conf?
Created attachment 2607 [details]
SLURM Config File

cat /usr/local/slurm/gres.conf
NodeName=quad06t Name=gpu File=/dev/nvidia[0-1]
NodeName=quad07t Name=gpu File=/dev/nvidia[0-1]
NodeName=gpu008t Name=gpu File=/dev/nvidia[0-1]
NodeName=gpu009t,gpu010t,gpu011t,gpu013t,gpu014t,gpu016t Name=gpu File=/dev/nvidia[0-1]
NodeName=gpu015t Name=gpu File=/dev/nvidia0
NodeName=gpu025t,gpu026t,gpu027t,gpu028t,gpu029t,gpu030t Name=gpu File=/dev/nvidia[0-1]
NodeName=gpu012t Name=gpu File=/dev/nvidia0 CPUs=0,1
NodeName=gpu012t Name=gpu File=/dev/nvidia1 CPUs=2,3
You're going to need to either set DefMemPerCPU, or provide --mem limits during job submission. Without either of those, Slurm allocates all of the memory in the node to the first job, and any subsequent job (regardless of GRES requests) will be stuck waiting for resources. If you'd rather not track and allocate node memory, you can change SelectTypeParameters to CR_Core instead to disable it entirely.

A few notes from reading through the config:

- If you're able to drop the trailing 't' from the node names, you could collapse most of the configuration substantially using ranges. Your batch partition could be simplified to:

  PartitionName=batch Nodes=comp[001,002,009-016,125-196] Priority=3 Default=YES

  Output of sinfo, squeue, and other commands would also be much more readable.

- Partition Priority may not be doing what you expect. It causes the system to schedule jobs separately on different tiers: if any jobs are in a higher-priority partition, they will schedule ahead of jobs in a lower-priority partition on nodes common to both, regardless of the multifactor priority values. We expect to change that setting in 16.05 to make this clearer. I'm assuming you're using it here just to change the multifactor weights; in 15.08 you can achieve that by changing the PriorityWeightTRES value on that partition instead - http://slurm.schedmd.com/tres.html . You can also use PriorityWeightTRES to favor jobs based on memory or GPU requests. GPU requests in particular seem to be a useful metric for GPU nodes, as you'd likely favor a job that wants just 1 CPU but 2 GPUs over a job asking for 8 CPUs but no GPUs on the same hardware.
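To illustrate the fix, here is a minimal sketch of the two options described above (the 2048 MB value and the 4G request are illustrative assumptions; pick values appropriate for your hardware):

```
# Option 1: in slurm.conf, give every job a bounded per-CPU default
# so the first GPU job no longer consumes all node memory:
DefMemPerCPU=2048        # MB per allocated CPU (illustrative value)

# Option 2: leave slurm.conf alone and request memory explicitly
# at submission time instead:
#   srun --gres=gpu:1 -c 2 -N 1 -p gpufermi --nodelist=gpu012t --mem=4G --pty /bin/bash
```

With either option, a second --gres=gpu:1 job should then be able to start on the node's remaining CPUs, memory, and GPU rather than pending on the whole node's memory.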
Thank you for your help. It is working now.

Could you please add me to this service request along with Hadrian Djohari? My details are below:

Sanjaya Gajurel
Email: sxg125@case.edu
Computational Scientist
Case Western Reserve University
Adding Sanjaya Gajurel to the bug. His account in Bugzilla has just been created (he'll see an email about it); he can go through password recovery if he wants to log in and change anything.

Have you had a chance to look at setting DefMemPerCPU yet? I'd like to verify that's the cause of the problem you're having.

cheers,
- Tim
Yes, the DefMemPerCPU line was commented out; after setting it, everything is working now.

Thank you,
-Sanjaya
(In reply to Hadrian from comment #6)
> Yes, DefMemPerCPU line was commented. It is working now.
>
> Thank you,
>
> -Sanjaya

Glad to hear that fixed it. Marking as resolved now.