Ticket 352

Summary: Requested nodes are busy when job --mem-per-cpu option > MaxMemPerCPU config
Product: Slurm
Reporter: Chris Read <cread>
Component: Scheduling
Assignee: Moe Jette <jette>
Status: RESOLVED FIXED
Severity: 3 - Medium Impact
Priority: ---
CC: da
Version: 2.5.x
Hardware: Linux
OS: Linux
Site: DRW Trading
Attachments:
- Slurm.conf with MaxMemPerCPU commented out
- Disable setting implicit value of a job's cpus_per_task value

Description Chris Read 2013-06-27 08:18:57 MDT
When we set the following in slurm.conf:

MaxMemPerCPU=2048

And run the following command:

srun --mem-per-cpu=4G hostname

We get the following on the command line:

srun: job 27801 queued and waiting for resources
srun: job 27801 has been allocated resources
srun: Job step creation temporarily disabled, retrying

With slurmctld and slurmd both running with '-vvv' we see the following in the log files:

slurmctld.log:

[2013-06-14T14:42:53-05:00] debug2: select_p_job_test for job 27801
[2013-06-14T14:42:53-05:00] debug2: got 1 threads to send out
[2013-06-14T14:42:53-05:00] debug2: _adjust_limit_usage: job 27801: MPC: job_memory set to 4096
[2013-06-14T14:42:53-05:00] debug2: Tree head got back 0 looking for 3
[2013-06-14T14:42:53-05:00] sched: Allocate JobId=27801 NodeList=n32 #CPUs=2
[2013-06-14T14:42:53-05:00] debug2: Spawning RPC agent for msg_type 4002
[2013-06-14T14:42:53-05:00] debug2: Performing full system state save
[2013-06-14T14:42:53-05:00] debug2: got 1 threads to send out
[2013-06-14T14:42:53-05:00] debug2: Tree head got back 1
[2013-06-14T14:42:53-05:00] debug2: Tree head got back 2
[2013-06-14T14:42:53-05:00] debug2: Tree head got back 3
[2013-06-14T14:42:53-05:00] debug2: Tree head got them all
[2013-06-14T14:42:53-05:00] debug2: _slurm_rpc_job_ready(27801)=3 usec=6
[2013-06-14T14:42:53-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:53-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:53-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:42:53-05:00] debug2: node_did_resp n31
[2013-06-14T14:42:53-05:00] debug2: node_did_resp n33
[2013-06-14T14:42:53-05:00] debug2: node_did_resp n32
[2013-06-14T14:42:53-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:53-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:53-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:42:54-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:54-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:54-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:42:54-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:54-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:54-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:42:55-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:55-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:55-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:42:57-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:42:57-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:42:57-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy
[2013-06-14T14:43:02-05:00] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=0
[2013-06-14T14:43:02-05:00] debug:  Configuration for job 27801 complete
[2013-06-14T14:43:02-05:00] _slurm_rpc_job_step_create for job 27801: Requested nodes are busy


slurmd.log:

[2013-06-14T14:42:53-05:00] debug2: got this type of message 1011
[2013-06-14T14:42:53-05:00] debug2: Processing RPC: REQUEST_HEALTH_CHECK
[2013-06-14T14:42:53-05:00] debug:  attempting to run health_check [/srv/slurm/sbin/healthcheck.sh]


It looks as though the problem lies solely with slurmctld, as slurmd never seems to receive any request for the job!

I would expect the job submission to simply be rejected outright.
Comment 1 Moe Jette 2013-06-27 08:51:34 MDT
Could you attach your slurm.conf configuration file?

It can also be helpful to identify the specific version of Slurm in the trouble ticket, which I believe is v2.5.7 in your case.
Comment 2 Chris Read 2013-06-27 08:55:13 MDT
Created attachment 312 [details]
Slurm.conf with MaxMemPerCPU commented out

Here is the config with the MaxMemPerCPU commented out. 

I get the same behaviour with 2.5.6 and 2.5.7.

Chris
Comment 3 Moe Jette 2013-06-27 09:41:20 MDT
Created attachment 315 [details]
Disable setting implicit value of a job's cpus_per_task value

This removes logic added three years ago that would automatically increase a job's cpus_per_task value in order to reduce its mem_per_cpu value, scaling the two by the same factor. Equivalent logic did not exist in the step allocation logic, which is why step creation kept failing. The new behaviour is to just return an error instead. This change will be made in Slurm version 2.6, but this patch is made against version 2.5.

The original patch introducing the problem is commit cc00cc70b9c90816afc511e0261e449857176332. The fix is commit e3b7c2be4393d921679f3e0cddcb9ca7943fb1f6.
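As a rough illustration of the behavioural change (this is not Slurm's actual code; the function names and error code below are made up), the old logic silently grew cpus_per_task until mem_per_cpu fit under MaxMemPerCPU, which is why the log above shows `#CPUs=2` for a 4096 MB request against a 2048 MB limit, while the patched logic simply rejects the request:

```c
#include <stdint.h>

/* Hypothetical placeholder for a Slurm error code. */
#define ESLURM_INVALID_TASK_MEMORY (-1)

/* Old behaviour (removed by the patch): when mem_per_cpu exceeds the
 * limit, scale cpus_per_task up by the same factor and clamp
 * mem_per_cpu, so the allocation still succeeds. */
static int old_check(uint32_t *mem_per_cpu, uint16_t *cpus_per_task,
                     uint32_t max_mem_per_cpu)
{
    if (*mem_per_cpu > max_mem_per_cpu) {
        /* Round-up division: e.g. 4096 MB / 2048 MB limit -> factor 2 */
        uint16_t factor =
            (*mem_per_cpu + max_mem_per_cpu - 1) / max_mem_per_cpu;
        *cpus_per_task *= factor;   /* implicit adjustment */
        *mem_per_cpu = max_mem_per_cpu;
    }
    return 0;                       /* never fails */
}

/* New behaviour: return an error so the submission is rejected,
 * rather than adjusting values the step-creation path never sees. */
static int new_check(uint32_t mem_per_cpu, uint32_t max_mem_per_cpu)
{
    if (mem_per_cpu > max_mem_per_cpu)
        return ESLURM_INVALID_TASK_MEMORY;
    return 0;
}
```

The mismatch in the original report comes from the old path: the job allocation was adjusted (1 CPU became 2), but step creation still evaluated the unadjusted 4096 MB request and failed repeatedly with "Requested nodes are busy".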
Comment 4 Moe Jette 2013-06-27 09:42:16 MDT
See attached patch
Comment 5 Chris Read 2013-06-27 10:21:42 MDT
Thanks. Tested in our dev environment; confirmed fixed.