Created attachment 9716 [details]
slurm.conf

I'm getting errors like this occasionally:

[2019-03-27T16:18:32.769] error: select/cons_res: node v4 memory is under-allocated (0-2048) for JobId=61149

I reproduced this with a bunch of job submissions like this:

marshall@voyager:~/slurm/18.08/voyager$ for i in {1..20}; do sbatch --mem=2G -Dtmp -N1 --wrap="srun whereami 1"; done

marshall@voyager:~/slurm/18.08/voyager$ sacct -j 61149 --format=jobid,alloctres%30,reqtres%30
       JobID                      AllocTRES                        ReqTRES
------------ ------------------------------ ------------------------------
61149         billing=2,cpu=2,mem=2G,node=1  billing=1,cpu=1,mem=2G,node=1
61149.batch             cpu=2,mem=2G,node=1
61149.extern  billing=2,cpu=2,mem=2G,node=1
61149.0                 cpu=1,mem=2G,node=1

I'm guessing it has something to do with requesting 1 CPU, but having CR_Core_Memory and hyperthreading, so that I get two CPUs. I'm attaching my slurm.conf.

Bug 6639 had these errors, though it appears the customer didn't file a separate ticket for them.
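For readers decoding the message: the two numbers appear to be the node's currently tracked allocation and the amount being released (0 MB allocated, 2048 MB to release above). Below is a minimal, purely illustrative sketch of the kind of underflow guard that produces such a message; the names are hypothetical and this is not the actual cons_res source.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative underflow guard when releasing a job's memory from a node's
 * usage counter; hypothetical, not the actual Slurm cons_res code. */
static void release_node_memory(uint64_t *alloc_mem, uint64_t job_mem,
                                const char *node, uint32_t job_id)
{
	if (*alloc_mem < job_mem) {
		fprintf(stderr,
			"error: node %s memory is under-allocated (%"PRIu64"-%"PRIu64") for JobId=%u\n",
			node, *alloc_mem, job_mem, job_id);
		*alloc_mem = 0;	/* clamp instead of wrapping around */
	} else {
		*alloc_mem -= job_mem;
	}
}

int main(void)
{
	uint64_t alloc_mem = 0;	/* memory was already released */
	release_node_memory(&alloc_mem, 2048, "v4", 61149);
	return 0;
}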
With this config:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
NodeName=compute[1-2] SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=12000 NodeHostname=polaris State=UNKNOWN Port=61201-61202 GRES=gpu:tesla:2
PartitionName=p1 Nodes=ALL Default=YES State=UP DefMemPerCPU=800

running the regression suite on 18.08 HEAD, I got the under-allocated error:

[2019-04-11T15:30:39.666] error: select/cons_res: node compute1 memory is under-allocated (0-1600) for JobId=20186

so I went to the regression log and found that job 20186 was submitted from within test 1.42:

TEST: 1.42
spawn /home/alex/slurm/18.08/install/bin/sbatch --output=/dev/null --error=/dev/null -t1 test1.42.input1
Submitted batch job 20185
spawn /home/alex/slurm/18.08/install/bin/srun -t1 --dependency=afterany:20185 /home/alex/slurm/18.08/install/bin/scontrol show job 20185
srun: job 20186 queued and waiting for resources
srun: job 20186 has been allocated resources
JobId=20185 JobName=test1.42.input1
   UserId=alex(1000) GroupId=docker(130) MCS_label=N/A
   Priority=50000 Nice=0 Account=acct1 QOS=normal
   JobState=COMPLETED Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:11 TimeLimit=00:01:00 TimeMin=N/A
   SubmitTime=2019-04-11T15:30:27 EligibleTime=2019-04-11T15:30:27
   AccrueTime=2019-04-11T15:30:27
   StartTime=2019-04-11T15:30:27 EndTime=2019-04-11T15:30:38 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2019-04-11T15:30:27
   Partition=p1 AllocNode:Sid=polaris:10058
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=compute1
   BatchHost=compute1
   NumNodes=1 NumCPUs=2 NumTasks=0 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=2,mem=1600M,node=1,billing=2
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=800M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/alex/slurm/source/testsuite/expect/test1.42.input1
   WorkDir=/home/alex/slurm/source/testsuite/expect
   StdErr=/dev/null
   StdIn=/dev/null
   StdOut=/dev/null
   Power=
SUCCESS

if that helps for reproducing.
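As a side note on the numbers above, a hedged guess at the arithmetic behind TRES=cpu=2,mem=1600M: with CR_Core_Memory the 1-CPU request is charged as a whole core (ThreadsPerCore=2), and MinMemoryCPU=800M is then charged per allocated CPU. The rounding behavior shown here is an assumption for illustration, not Slurm code.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative arithmetic only (assumed whole-core rounding, not Slurm code). */
int main(void)
{
	const uint32_t threads_per_core = 2;   /* ThreadsPerCore=2         */
	const uint32_t requested_cpus   = 1;   /* default for the test job */
	const uint64_t mem_per_cpu_mb   = 800; /* DefMemPerCPU=800         */

	/* Round the CPU count up to a whole number of cores. */
	uint32_t alloc_cpus = ((requested_cpus + threads_per_core - 1) /
			       threads_per_core) * threads_per_core;
	uint64_t alloc_mem_mb = (uint64_t)alloc_cpus * mem_per_cpu_mb;

	printf("cpu=%u mem=%"PRIu64"M\n", alloc_cpus, alloc_mem_mb);
	/* Prints: cpu=2 mem=1600M, matching the scontrol output above. */
	return 0;
}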
Marshall, I already worked on the related "memory is over-allocated" bug, and I'm also working on another bug in the will_run_test area. I'm annoyed enough by this error at this point... do you mind if I steal it from you? :)

After adding some debug logs:

slurmctld: sched: _slurm_rpc_allocate_resources JobId=22882 NodeList=compute1 usec=2108
slurmctld: _job_complete: JobId=22882 WEXITSTATUS 0
slurmctld: select_p_job_fini: calling rm_job_res for JobId=22882
slurmctld: select/cons_tres: rm_job_res: node compute1 removing memory (1600-1600) for JobId=22882
slurmctld: _job_complete: JobId=22882 done
slurmctld: will_run_test: future_usage = _dup_node_usage(select_node_usage);
slurmctld: will_run_test: future_usage[0].alloc_memory=0
slurmctld: will_run_test: p2 calling rm_job_res to remove JobId=22882 to see if JobId=22883 will run when the former ends
slurmctld: error: select/cons_tres: rm_job_res: node compute1 memory is under-allocated (0-1600) for JobId=22882

It looks like the problem is that we double-deallocate resources for completing jobs that are removed in will_run_test when emulating a future scenario.

The first deallocation happens when the first job finishes:

slurmctld: select_p_job_fini: calling rm_job_res for JobId=22882
slurmctld: select/cons_tres: rm_job_res: node compute1 removing memory (1600-1600) for JobId=22882

At this point we have already removed the resources from the node usage. The next job then triggers will_run_test, which builds a list of candidate jobs and removes their resources iteratively to estimate when and where the new job can start. Since completing jobs are included in that candidate list, we call rm_job_res again for jobs that have already deallocated their resources, and that triggers the error.

Working on a fix now.
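To make the sequence concrete, here is a condensed, hypothetical timeline of the double deallocation using a plain counter rather than the Slurm structs; the numbers match the log above.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical condensation of the log above: the completing job's memory is
 * removed once at job completion and a second time when will_run_test walks
 * its candidate-job list. */
int main(void)
{
	int64_t alloc_memory = 1600;	/* node usage while JobId=22882 runs */

	/* 1. Job completes: select_p_job_fini -> rm_job_res */
	alloc_memory -= 1600;		/* 1600 -> 0, correct */

	/* 2. will_run_test's candidate list still contains the completing
	 *    job, so its memory is removed again. */
	alloc_memory -= 1600;		/* 0 -> -1600: the underflow that the
					 * "under-allocated" guard reports */

	printf("alloc_memory = %lld\n", (long long)alloc_memory);
	return 0;
}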
I want to note that the second rm_job_res is performed on duped structs:

future_part = _dup_part_data(select_part_record);
future_usage = _dup_node_usage(select_node_usage);

so the "double deallocation" doesn't actually happen on the original resources but on the duplicated ones. This makes the error less concerning than it could be, but we still need to fix it, since it can skew the start-time predictions or otherwise mess with the future_* structs.
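For illustration, the "simulate on a copy" pattern described above looks roughly like this; the types and names are hypothetical stand-ins, not the Slurm source.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the duplicated usage structs: the what-if removal
 * of candidate jobs happens on the copy only, so the live counters stay
 * intact even if the copy is decremented too many times. */
struct node_usage {
	uint64_t alloc_memory;
};

int main(void)
{
	struct node_usage real   = { .alloc_memory = 1600 };
	struct node_usage future = real;	/* the real code does a deep copy */

	/* What-if removal of a candidate job is applied to the copy... */
	future.alloc_memory = (future.alloc_memory >= 1600) ?
			      future.alloc_memory - 1600 : 0;

	/* ...so the live accounting is unchanged. */
	printf("real=%llu future=%llu\n",
	       (unsigned long long)real.alloc_memory,
	       (unsigned long long)future.alloc_memory);
	return 0;
}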
*** Ticket 7221 has been marked as a duplicate of this ticket. ***
(In reply to Regine Gaudin from comment #0)
> Hello
> As suggested in bug 6879, I'm opening this bug for the annoying messages
> that are filling slurmctld.log too fast:
> "Hi
>
> I'm updating this bug as CEA is also encountering the memory under-allocated
> errors you have mentioned (bug 6769), filling slurmctld.log:
> error: select/cons_res: node machine1234 memory is under-allocated
> (0-188800) for JobID=XXXXXX
> ...the same message, repeated.
>
> As you wrote, "there are proposed fixes for both issues I mentioned
> (accrue_cnt underflow and memory under-allocated errors)", so please know
> that CEA would also be interested in the proposed fixes. The slurm
> controller is 18.08.06 and the clients are 17.11.6, but they will be
> upgraded to 18.08.06 soon.
>
> Thanks
>
> Regine"
>
> Comment 11, Marshall Garey, 2019-06-10 10:22:12 MDT:
>
> Regine - the patches for both bugs are pending internal QA/review. They'll
> both definitely be in 19.05, and probably will both be in 18.08. Although I
> hope they'll both be in the next tag, I can't promise that. If you'd like
> patches provided before they're in the public repo, can you create a new
> ticket for that?
>
> Thanks for providing patches for 18.08.
>
> Regine

I'll let Alex or others comment further. However, I have to retract my statement that they'll both "probably" be in 18.08. I found out that the fix for this one is being written for 19.05 and not 18.08, so further discussion will be needed to determine whether it will be backported to 18.08.
Hi, this has been fixed in the following wall of commits in 19.05:

2dd1f448ca
0666db61ca
61269349c3

The select/cons_res and select/cons_tres plugins share much of the same logic. In the master branch (the future 20.02 tag), that logic has been refactored into a common select/cons_common layer, which is why merging the commits above up to master required some extra work. While working on this bug we also noticed unneeded Cray NHC logic around the same fix area, and it has been removed as well. These are the 20.02 commits reflecting all of these changes:

d4913ae9a1a3
889615a6f4f8
fdb9474e9aa0
6b4d41d037ac

I'm closing this bug. Please re-open if there's anything else. Thanks.
*** Ticket 7866 has been marked as a duplicate of this ticket. ***