Ticket 3847 - Nodes can easily get overallocated by exclusive jobs
Summary: Nodes can easily get overallocated by exclusive jobs
Status: RESOLVED DUPLICATE of ticket 3879
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 17.02.3
Hardware: Linux
Severity: 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2017-05-30 12:43 MDT by Thomas Opfer
Modified: 2017-07-20 02:30 MDT

See Also:
Site: -Other-


Description Thomas Opfer 2017-05-30 12:43:10 MDT
When I run a job in non-exclusive mode, everything is fine:

to86cola@hla0002:~$ /opt/slurm/current/bin/srun -n 1 --mem-per-cpu=25000 -t 30 -C mpi --pty bash
srun: job 3439844 queued and waiting for resources
srun: job 3439844 has been allocated resources
to86cola@hpa0001:~$ scontrol show node hpa0001|grep TRES
   CfgTRES=cpu=16,mem=28000M
   AllocTRES=cpu=1,mem=25000M
to86cola@hpa0001:~$


When I instead run the same job in exclusive mode, the node gets overallocated:

to86cola@hla0002:~$ /opt/slurm/current/bin/srun -n 1 --exclusive --mem-per-cpu=25000 -t 30 -C mpi --pty bash
srun: job 3439868 queued and waiting for resources
srun: job 3439868 has been allocated resources
to86cola@hpa0001:~$ scontrol show node hpa0001|grep TRES
   CfgTRES=cpu=16,mem=28000M
   AllocTRES=cpu=16,mem=400000M
to86cola@hpa0001:~$
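The AllocTRES figure above is consistent with Slurm charging every CPU on the node at the full --mem-per-cpu rate. A quick check of the arithmetic (the variable names are illustrative, not Slurm internals):

```python
# With --exclusive, the job is charged all 16 CPUs on the node,
# and memory appears to be accounted as cpus * --mem-per-cpu.
cpus_on_node = 16        # CfgTRES cpu count from scontrol
mem_per_cpu = 25000      # MB, from --mem-per-cpu=25000
requested_cpus = 1       # from -n 1

alloc_mem = cpus_on_node * mem_per_cpu
print(alloc_mem)                     # 400000 MB, matching AllocTRES mem=400000M
print(requested_cpus * mem_per_cpu)  # 25000 MB actually requested
```

Note that 400000M far exceeds the node's configured 28000M, which is why slurmctld later logs overallocation errors.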


In my opinion, the memory to allocate per CPU should instead be calculated as something like (requested_mem_per_cpu_on_this_node * requested_cpus_on_this_node) / allocated_cpus_on_this_node, so the job's total memory charge stays equal to what was actually requested.
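The proposed correction could be sketched as follows (a hypothetical helper for illustration, not Slurm code; integer division mirrors how Slurm tracks memory in whole MB):

```python
def scaled_mem_per_cpu(requested_mem_per_cpu, requested_cpus, allocated_cpus):
    """Proposed scaling: keep the job's total memory charge equal to the
    original request even when --exclusive inflates the CPU allocation."""
    return (requested_mem_per_cpu * requested_cpus) // allocated_cpus

# Example from this ticket: -n 1 --mem-per-cpu=25000 on a 16-CPU node
per_cpu = scaled_mem_per_cpu(25000, 1, 16)
print(per_cpu)        # 1562 MB per CPU
print(per_cpu * 16)   # 24992 MB total, approximately the 25000 MB requested
```

The total charge then stays within the node's 28000M instead of ballooning to 400000M.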


Please fix this, as it floods slurmctld.log with messages such as:

[2017-05-30T20:38:21.839] error: cons_res: node hpa0196 memory is overallocated (32000) for job 3439883
[2017-05-30T20:38:21.841] error: cons_res: node hpa0197 memory is overallocated (32000) for job 3439884
[2017-05-30T20:38:52.642] error: cons_res: node hpa0201 memory is overallocated (32000) for job 3439885
[2017-05-30T20:38:52.643] error: cons_res: node hpa0205 memory is overallocated (32000) for job 3439886
[2017-05-30T20:38:52.645] error: cons_res: node hpa0306 memory is overallocated (32000) for job 3439887
[2017-05-30T20:38:52.646] error: cons_res: node hpa0312 memory is overallocated (32000) for job 3439888

And similar messages appear in slurmdbd.log, e.g.:

[2017-05-30T20:24:21.086] error: We have more allocated time than is possible (272381215680 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T06:00:00 - 2017-05-21T07:00:00 tres 2
[2017-05-30T20:24:21.086] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T06:00:00 - 2017-05-21T07:00:00 tres 2
[2017-05-30T20:24:31.373] error: We have more allocated time than is possible (270273934400 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T07:00:00 - 2017-05-21T08:00:00 tres 2
[2017-05-30T20:24:31.373] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T07:00:00 - 2017-05-21T08:00:00 tres 2
[2017-05-30T20:24:41.025] error: We have more allocated time than is possible (270435154400 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T08:00:00 - 2017-05-21T09:00:00 tres 2
[2017-05-30T20:24:41.025] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T08:00:00 - 2017-05-21T09:00:00 tres 2
[2017-05-30T20:24:50.262] error: We have more allocated time than is possible (269253026080 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T09:00:00 - 2017-05-21T10:00:00 tres 2
[2017-05-30T20:24:50.262] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T09:00:00 - 2017-05-21T10:00:00 tres 2
[2017-05-30T20:24:59.087] error: We have more allocated time than is possible (260691142920 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T10:00:00 - 2017-05-21T11:00:00 tres 2
[2017-05-30T20:24:59.087] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T10:00:00 - 2017-05-21T11:00:00 tres 2
[2017-05-30T20:25:09.605] error: We have more allocated time than is possible (255667014200 > 250059600000) for cluster lcluster(69461000) from 2017-05-21T11:00:00 - 2017-05-21T12:00:00 tres 2
[2017-05-30T20:25:09.605] error: We have more time than is possible (250059600000+1267200000+0)(251326800000) > 250059600000 for cluster lcluster(69461000) from 2017-05-21T11:00:00 - 2017-05-21T12:00:00 tres 2
Comment 1 Thomas Opfer 2017-07-20 02:30:56 MDT
This will be resolved when bug 3879 is resolved.

*** This ticket has been marked as a duplicate of ticket 3879 ***