Ticket 13085

Summary: job_container/tmpfs + init.sh/prolog.sh
Product: Slurm
Reporter: Manuel Holtgrewe <manuel.holtgrewe>
Component: Configuration
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
CC: bas.vandervlies, cinek, felip.moll, lyeager
Version: 21.08.5
Hardware: Linux
OS: Linux
See Also: https://bugs.schedmd.com/show_bug.cgi?id=7477
https://bugs.schedmd.com/show_bug.cgi?id=13242
https://bugs.schedmd.com/show_bug.cgi?id=13546
Site: Berlin Institute of Health
Version Fixed: 22.05pre1

Description Manuel Holtgrewe 2021-12-27 06:21:56 MST
I would like to create the following setup:

1. Create a GRES "localtmp" for tracking available local temporary space.
2. Use job_container/tmpfs to manage the /tmp folder with Linux namespaces.
3. Enforce the allocated temporary file space using XFS project quotas.
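For context, steps 1 and 2 might be sketched with configuration along the following lines. The node names, GRES counts, and paths are illustrative assumptions; only the option names come from Slurm's documented configuration files:

```
# slurm.conf (excerpt) -- assumed node and GRES values
GresTypes=localtmp
JobContainerType=job_container/tmpfs
NodeName=node[01-10] Gres=localtmp:100

# gres.conf on each node -- here one "localtmp" unit per GB of local disk
Name=localtmp Count=100

# job_container.conf -- private per-job /tmp created under BasePath
AutoBasePath=true
BasePath=/local/slurm-tmpfs
```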

I could do 1. and 2. easily.

However, it looks like the prolog script does have the SLURM_JOBID environment variable, but at that point the namespace has not yet been created and the /tmp mount does not exist, while the tmpfs InitScript runs later but does not see SLURM_JOBID at all.

Can you recommend a way to achieve my aim here?
Comment 2 Manuel Holtgrewe 2021-12-27 08:44:51 MST
Thanks for the pointers in 'See Also'.

I was already able to implement the GRES.

I also know how to set an XFS quota per project. What I would need is a robust way to know the requested GRES of the job and the path to the /tmp directory managed by job_container/tmpfs.

To my understanding, having SLURM_JOBID available in the job_container/tmpfs InitScript would be sufficient, as I assume this script is called after the namespace and bind mount have been set up.

Maybe passing the SLURM_* environment variables into this script would be sufficient?
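To make step 3 concrete, here is a hypothetical InitScript sketch (not from the ticket): it derives an XFS project id from the job id and caps the job's private /tmp with a project quota. The mount point `/local`, the id-mapping scheme, and the fixed `10g` limit are all assumptions for illustration, and SLURM_JOB_ID is only exported to this script from Slurm 22.05 on, which is the change discussed below.

```shell
#!/bin/bash
# Hypothetical job_container/tmpfs InitScript sketch.
set -euo pipefail

# Derive a per-job XFS project id from the job id (illustrative scheme;
# ids wrap at 100000, so collisions would need handling in production).
localtmp_projid() {
    local job_id="$1"
    echo $((100000 + job_id % 100000))
}

apply_quota() {
    local job_id="$1" job_tmp="$2" fs="$3" limit="$4"
    local projid
    projid="$(localtmp_projid "$job_id")"
    # Assign the job's private /tmp directory to the project, then cap it.
    xfs_quota -x -c "project -s -p ${job_tmp} ${projid}" "${fs}"
    xfs_quota -x -c "limit -p bhard=${limit} ${projid}" "${fs}"
}

# Only act when actually invoked by Slurm on an XFS-equipped node.
if [[ -n "${SLURM_JOB_ID:-}" ]] && command -v xfs_quota >/dev/null 2>&1; then
    apply_quota "${SLURM_JOB_ID}" "/tmp" "/local" "10g"
fi
```

In a real setup the limit would come from the job's requested localtmp GRES rather than a constant, which is exactly the information the rest of this ticket is about obtaining.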
Comment 6 Marcin Stolarek 2021-12-28 03:30:34 MST
Manuel,

I see your point; however, just adding the JobId may not be optimal. I guess you were thinking of internally using tools like `scontrol show job JOBID` to get the required information. Such an approach would generate an additional REQUEST_JOB_INFO RPC for every job start (potentially from every node), which may in fact have a severe impact on scheduler performance (especially in an HTC environment).

We'll have an internal discussion on how to approach that. I'll keep you posted.

cheers,
Marcin
Comment 15 Manuel Holtgrewe 2022-02-08 05:31:04 MST
Hi, is there any news on this?
Comment 16 Marcin Stolarek 2022-02-08 05:35:58 MST
Manuel,

Sorry for the delay. We have a patch under review. We decided to add some basic environment variables (like SLURM_JOB_ID) to the script.

The change is targeted to Slurm 22.05, but should be easy to backport locally.

cheers,
Marcin
Comment 19 Marcin Stolarek 2022-03-02 01:25:04 MST
Manuel,

We've merged a basic environment setup for InitScript of job_container/tmpfs. This is in the master branch[1] and will be part of Slurm 22.05 release.

We're looking into further improvements in this area, since calling `scontrol show job` in the InitScript results in a high load on the slurmctld side, limiting system throughput. Those improvements (providing more information to the InitScript) require a more complicated rewrite, so I can't commit to anything more for Slurm 22.05 at the moment.
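For sites that do backport the change, one conceivable way to recover the requested GRES inside the InitScript is to parse `scontrol show job` output, accepting the per-job-start RPC cost described above. The `localtmp` GRES name and the parsing below are illustrative assumptions, not anything shipped with Slurm:

```shell
#!/bin/bash
# Extract the requested localtmp GRES count from `scontrol show job` text.
# Note: each scontrol call is one REQUEST_JOB_INFO RPC per job start --
# the throughput concern raised in this ticket.

localtmp_request() {
    # Reads `scontrol show job <id>` output on stdin and prints the
    # first localtmp count found (e.g. "50G"), or nothing if absent.
    grep -oE 'localtmp:[0-9]+[KMGT]?' | head -n1 | cut -d: -f2
}

# Real usage (commented out to keep this sketch self-contained):
# LIMIT="$(scontrol show job "${SLURM_JOB_ID}" | localtmp_request)"
```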

Is there anything else I can help you with in this bug report?

cheers,
Marcin

[1]https://github.com/SchedMD/slurm/commit/e25270e53f57be9aae48759ea5fdd57c9f7eb6b6
Comment 20 Manuel Holtgrewe 2022-03-03 09:09:20 MST
Hi Marcin,

thanks a lot for this already! I'll have a look at whether we can bear the additional RPC pressure.

Best wishes,
Manuel