Ticket 1671

Summary: tmp size limit per user
Product: Slurm
Reporter: Hjalti Sveinsson <hjalti.sveinsson>
Component: Accounting
Assignee: David Bigagli <david>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
Priority: ---
CC: brian, da
Version: 14.03.5
Hardware: Linux
OS: Linux
Site: deCODE

Description Hjalti Sveinsson 2015-05-15 01:43:02 MDT
Hello,

1. Is it possible to limit the use of /tmp for all jobs so that it is not larger than a certain size, for example 1GB, when users run jobs without specifying what they need?

2. If they specify more than 1GB, they will be assigned to a node that has that much space available on /tmp.

3. Is it possible that each job can have its own sub-directory under /tmp? For example, the prolog could create the sub-directory named after the JobID number, and the epilog process would then clean up that directory at the end of the job, even if the job failed for some reason or finished abnormally.

regards,
Hjalti Sveinsson
Comment 1 David Bigagli 2015-05-15 05:34:05 MDT
Hi,
   the sbatch and srun commands have the option to specify the amount of
/tmp space. 

  --tmp=<MB>
         Specify a minimum amount of temporary disk space.

You can use the job submission plugin to test whether the user has specified --tmp
at submission and, if not, set it to 1GB.
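That check could be sketched as a small Lua job_submit plugin. This is only an illustrative sketch, not code from this ticket: the pn_min_tmp_disk field name and the slurm.NO_VAL sentinel are assumptions that should be verified against the job_desc fields exposed by your Slurm version.

```lua
-- Hypothetical job_submit.lua sketch: default --tmp to 1GB when unset.
-- (Field names below are assumptions; verify them for your Slurm release.)
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- pn_min_tmp_disk holds the per-node --tmp request in MB;
    -- NO_VAL means the user did not pass --tmp at submission.
    if job_desc.pn_min_tmp_disk == slurm.NO_VAL then
        job_desc.pn_min_tmp_disk = 1024
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

Such a plugin would be enabled with JobSubmitPlugins=lua in slurm.conf and installed as job_submit.lua in the configuration directory.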

What you ask about the prolog generating files under /tmp and the epilog cleaning
them up is also possible; this is actually one of their intended use cases. Both
the prolog and epilog have access to the job ID and the username. As an example,
these are the Slurm environment variables available in the prolog:

SLURMD_NODENAME=prometeo
SLURM_CLUSTER_NAME=canis_major
SLURM_CONF=/home/david/clusters/1411/linux/etc/slurm.conf
SLURM_JOBID=461
SLURM_JOB_ID=461
SLURM_JOB_PARTITION=markab
SLURM_JOB_UID=500
SLURM_JOB_USER=david
SLURM_NODELIST=prometeo
SLURM_STEP_ID=0
SLURM_UID=500

and in epilog:

SLURMD_NODENAME=prometeo
SLURM_CLUSTER_NAME=canis_major
SLURM_CONF=/home/david/clusters/1411/linux/etc/slurm.conf
SLURM_JOBID=461
SLURM_JOB_ID=461
SLURM_JOB_UID=500
SLURM_JOB_USER=david
SLURM_NODELIST=prometeo
SLURM_UID=500
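
Using the variables above, the per-job directory idea can be sketched as shell logic like the following. This is an illustrative sketch, not a script from this ticket: the /tmp/$SLURM_JOB_ID layout is an assumption, and in production the Prolog and Epilog configured in slurm.conf run as root on the compute node.

```shell
#!/bin/sh
# Illustrative Prolog/Epilog logic for a per-job sub-directory under /tmp.
# SLURM_JOB_ID and SLURM_JOB_USER are provided by slurmd, as in the
# environment listings above.

prolog_create_tmp() {
    # Prolog side: make a private scratch directory owned by the job's user.
    mkdir -p "/tmp/${SLURM_JOB_ID}"
    chown "${SLURM_JOB_USER}" "/tmp/${SLURM_JOB_ID}"
    chmod 700 "/tmp/${SLURM_JOB_ID}"
}

epilog_remove_tmp() {
    # Epilog side: remove the directory unconditionally, so cleanup runs
    # even when the job fails or ends abnormally.
    rm -rf "/tmp/${SLURM_JOB_ID}"
}
```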

David
Comment 2 David Bigagli 2015-05-15 06:06:09 MDT
Hi, 
   let me add that currently Slurm does not have the functionality to constrain the use of tmp space the way it does for memory, for example.
It is possible in theory, but it is not implemented.

David
Comment 3 Hjalti Sveinsson 2015-05-17 22:07:18 MDT
Hi again,

thank you for your answer. 

However, I am not sure I understand you when you talk about the job submission plugin. Which plugin are you referring to, and how do I set --tmp to 1GB by default if the user does not specify it?

So, if I am understanding you correctly:

It is not possible to limit/constrain the usage of tmp space?

It is possible to set the default tmp usage to 1GB if not specified?

Do these first two not conflict with each other?

Is it possible to use the prolog and epilog to create a sub-directory with the job ID under /tmp, like Directory = /tmp/$JOB_ID, and have the epilog clean up the directory on finish?

regards,
Hjalti Sveinsson
Comment 4 David Bigagli 2015-05-18 05:39:01 MDT
Hi,
   Slurm implements the job submission plugin, which allows you to intercept and
modify any job submission parameter; check the paragraph JobSubmitPlugins in
http://slurm.schedmd.com/slurm.conf.html.
Examples of submission plugins are in the source tree:
slurm/src/plugins/job_submit

Using the --tmp option, it is possible to specify the amount of tmp space the job needs so that it gets dispatched to a host with enough tmp space, but currently
there is no enforcement to make sure the job does not use more than requested.

Yes, it is definitely possible to create directories indexed by the job ID in the prolog and then remove them in the epilog.

David
Comment 5 Moe Jette 2015-05-18 06:29:04 MDT
There are additional examples of job_submit plugins written using LUA scripts in the Slurm distribution in the "contribs/lua" subdirectory:

$ pwd
/home/jette/Desktop/SLURM/slurm.git/contribs/lua
$ ls
job_submit.license.lua  job_submit.lua  ...
Comment 6 David Bigagli 2015-05-20 09:01:11 MDT
Information provided.

David