This problem is similar to Bug 4244, which was noted as being fixed. We are running 17.11.3-2.

A gres.conf file in the form:

    NodeName=hpc-test06 Name=gpu Type=k20 File=/dev/nvidia0 Cores=0-15
    NodeName=hpc-test06 Name=gpu Type=k20 File=/dev/nvidia1 Cores=0-15

works, but the desired configuration:

    NodeName=hpc-test06 Name=gpu Type=k20 File=/dev/nvidia0 Cores=0-7
    NodeName=hpc-test06 Name=gpu Type=k20 File=/dev/nvidia1 Cores=8-15

does not.

We are using the linear select type:

    SelectType=select/linear
    SelectTypeParameters=CR_ONE_TASK_PER_CORE,CR_Memory

The request for a GPU:

    srun --gres=gpu:k20:1 --ntasks=2 --cpus-per-task=8 /bin/bash -c 'echo $CUDA_VISIBLE_DEVICES'

results in the error message:

    srun: error: Unable to allocate resources: Requested node configuration is not available

This appears to be related to the following test in the function _job_count_bitmap in the file src/plugins/select/linear/select_linear.c:

    if (gres_cpus != NO_VAL) {
        gres_cpus *= cpus_per_core;
        if ((gres_cpus < cpu_cnt) ||
            (gres_cpus < job_ptr->details->ntasks_per_node) ||
            ((job_ptr->details->cpus_per_task > 1) &&
             (gres_cpus < job_ptr->details->cpus_per_task))) {
            bit_clear(jobmap, i);
            continue;
        }
    }

Specifically the test:

    if ((gres_cpus < cpu_cnt)

where gres_cpus is set by:

    gres_cores = gres_plugin_job_test(job_ptr->gres_list, gres_list,
                                      use_total_gres, NULL,
                                      core_start_bit, core_end_bit,
                                      job_ptr->job_id, node_ptr->name);
    gres_cpus = gres_cores;

With our configs, gres_cpus is set to 16 only if the config parameter is Cores=0-15.

Is there some other configuration parameter that I should have set that would have changed this behavior?
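To make the suspected failure concrete, here is a minimal standalone sketch of the quoted test. It is not Slurm code: the function name node_passes_gres_test, the MODEL_NO_VAL stand-in for NO_VAL, and all input values are assumptions for illustration. In particular it assumes cpu_cnt is 16 on this node, cpus_per_core is 1, ntasks_per_node is unset (0), and that gres_plugin_job_test returns 8 when a GPU is bound with Cores=0-7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for Slurm's NO_VAL sentinel; the value here is illustrative. */
#define MODEL_NO_VAL 0xfffffffeU

/*
 * Hypothetical model of the test quoted from _job_count_bitmap.
 * Returns true if the node would stay in the job map, false if the
 * original code would call bit_clear(jobmap, i) and skip the node.
 */
static bool node_passes_gres_test(uint32_t gres_cores, uint32_t cpus_per_core,
                                  uint32_t cpu_cnt, uint32_t ntasks_per_node,
                                  uint32_t cpus_per_task)
{
    assert(cpus_per_core > 0);

    if (gres_cores == MODEL_NO_VAL)
        return true;            /* no GRES restriction on cores */

    uint32_t gres_cpus = gres_cores * cpus_per_core;

    if ((gres_cpus < cpu_cnt) ||
        (gres_cpus < ntasks_per_node) ||
        ((cpus_per_task > 1) && (gres_cpus < cpus_per_task)))
        return false;           /* node removed from consideration */

    return true;
}
```

Under those assumptions, node_passes_gres_test(16, 1, 16, 0, 8) keeps the node (the Cores=0-15 case), while node_passes_gres_test(8, 1, 16, 0, 8) rejects it (the Cores=0-7 case), because 8 < 16 trips the first comparison even though 8 CPUs would satisfy --cpus-per-task=8.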
AJ,

Do you know if USC has a Slurm support contact? Our system was not able to associate your email address with a Slurm support contract.

If your site has an existing Slurm support contract, please email jacob@schedmd.com to figure out why your email address is not associated with the contract. If your site does not have a current Slurm support contract, please email sales@schedmd.com to request a quote. Once a Slurm support contract is in place, this ticket will be routed to the support team for quick resolution.

Jacob
Avalon Johnson at USC is the primary contact for Slurm support :)