Summary: | OverMemoryKill enforces only step memory limit, not total usage | |
---|---|---|---|
Product: | Slurm | Reporter: | CSC sysadmins <csc-slurm-tickets> |
Component: | Limits | Assignee: | Nate Rini <nate> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | 4 - Minor Issue | Priority: | --- |
Version: | 19.05.6 | Hardware: | Linux |
OS: | Linux | | |
Site: | CSC - IT Center for Science | | |
Version Fixed: | 20.02.4, 20.11 | | |
Description
CSC sysadmins
2020-05-05 06:03:04 MDT

Comment #1 (Nate Rini):

Tommi,

Looking into what Slurm should be doing.

--Nate

Comment #2 (Tommi Tervo, CSC sysadmins):

(In reply to Nate Rini from comment #1)
> Looking into what Slurm should be doing.

Hi,

The only reliable solution that comes to my mind is to combine the extern step and the running job step PSS and verify that the total stays under the limit. Or do you mean the case where --mem-per-cpu is set and an extern step is also consuming memory?

Comment #3 (Nate Rini):

(In reply to Tommi Tervo from comment #2)

After consulting internally about how OverMemoryKill works, we decided that this is a documentation issue (updated here: https://github.com/SchedMD/slurm/commit/b82d7c29f4fabea702dba3b08e9581e450c4f064). OverMemoryKill is not recommended due to its inherent limitations; instead we suggest using cgroups with 'ConstrainRAMSpace=yes', which limits memory on a per-job/step basis.

> Only reliable solution that comes to my mind is to combine the extern step
> and running job step PSS and verify that it's under the limit?

Each step/task (process tree) in a job forks a new slurmstepd instance, which would have to communicate with the lead slurmd instance in order to enforce a limit for the whole job. None of the required RPCs or functionality currently exist to implement this with OverMemoryKill. Extern steps and MPI jobs also fork secondary task instances, which likewise enforce limits only against a single process tree and slurmstepd instance, further complicating matters.

> Or do you mean the case where --mem-per-cpu is set and an extern step is
> also consuming memory?

Memory limits are set per job, and can also be enforced per step/task when using cgroups with 'ConstrainRAMSpace=yes', thanks to the built-in hierarchy of cgroups in the Linux kernel. There is currently no plan to implement this for OverMemoryKill, since we no longer suggest that sites use it.

I'm closing this ticket; please reply to this ticket if you have any questions and we can continue from there.

Thanks,
--Nate
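For reference, the cgroup-based enforcement recommended in this ticket is configured roughly as follows. This is a minimal sketch using stock Slurm option names; exact file layout, cgroup plugin version, and defaults vary by Slurm release and site policy:

```
# slurm.conf: use the cgroup task plugin so each job/step
# is placed in its own cgroup hierarchy
TaskPlugin=task/cgroup

# cgroup.conf: constrain each job/step to its allocated memory
ConstrainRAMSpace=yes
```

By contrast, the discouraged polling-based approach is enabled with `JobAcctGatherParams=OverMemoryKill` in slurm.conf; as described above, it only checks each step's process tree independently and cannot enforce a whole-job total.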