Ticket 13085

Summary: job_container/tmpfs + init.sh/prolog.sh
Product: Slurm
Reporter: Manuel Holtgrewe <manuel.holtgrewe>
Component: Configuration
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED FIXED
Severity: 4 - Minor Issue
CC: bas.vandervlies, cinek, felip.moll, lyeager
Version: 21.08.5
Hardware: Linux
OS: Linux
See Also: https://bugs.schedmd.com/show_bug.cgi?id=7477
https://bugs.schedmd.com/show_bug.cgi?id=13242
https://bugs.schedmd.com/show_bug.cgi?id=13546
Site: Berlin Institute of Health
Version Fixed: 22.05pre1

Description Manuel Holtgrewe 2021-12-27 06:21:56 MST
I would like to create the following setup:

1. Create a GRES "localtmp" for tracking available local temporary space.
2. Use job_container/tmpfs to manage the /tmp folder with Linux namespaces.
3. Enforce the allocated temporary file space using XFS project quotas.
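For context, steps 1 and 2 might be sketched with configuration along the following lines. The node names, GRES counts, and paths are illustrative assumptions; only the option names come from Slurm's documented configuration files:

```
# slurm.conf (excerpt) -- assumed node and GRES values
GresTypes=localtmp
JobContainerType=job_container/tmpfs
NodeName=node[01-10] Gres=localtmp:100

# gres.conf on each node -- here one "localtmp" unit per GB of local disk
Name=localtmp Count=100

# job_container.conf -- private per-job /tmp created under BasePath
AutoBasePath=true
BasePath=/local/slurm-tmpfs
```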

I could do 1. and 2. easily.

However, it looks like the prolog script does have the SLURM_JOBID environment variable, but at that point the namespace has not yet been created and the /tmp mount does not exist, while the tmpfs InitScript runs later but does not see SLURM_JOBID at all.

Can you recommend a way to achieve my aim here?
Comment 2 Manuel Holtgrewe 2021-12-27 08:44:51 MST
Thanks for the pointers in 'See Also'.

I was already able to implement the GRES.

I also know how to set an XFS quota per project. What I would need is a robust way to know the requested GRES of the job and the path to the /tmp directory managed by job_container/tmpfs.

To my understanding, having SLURM_JOBID available in the job_container/tmpfs InitScript would be sufficient, as I assume this script is called after the namespace and bind mount have been set up.

Maybe passing the SLURM_* environment variables into this script would be sufficient?
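To make step 3 concrete, here is a hypothetical InitScript sketch (not from the ticket): it derives an XFS project id from the job id and caps the job's private /tmp with a project quota. The mount point `/local`, the id-mapping scheme, and the fixed `10g` limit are all assumptions for illustration, and SLURM_JOB_ID is only exported to this script from Slurm 22.05 on, which is the change discussed below.

```shell
#!/bin/bash
# Hypothetical job_container/tmpfs InitScript sketch.
set -euo pipefail

# Derive a per-job XFS project id from the job id (illustrative scheme;
# ids wrap at 100000, so collisions would need handling in production).
localtmp_projid() {
    local job_id="$1"
    echo $((100000 + job_id % 100000))
}

apply_quota() {
    local job_id="$1" job_tmp="$2" fs="$3" limit="$4"
    local projid
    projid="$(localtmp_projid "$job_id")"
    # Assign the job's private /tmp directory to the project, then cap it.
    xfs_quota -x -c "project -s -p ${job_tmp} ${projid}" "${fs}"
    xfs_quota -x -c "limit -p bhard=${limit} ${projid}" "${fs}"
}

# Only act when actually invoked by Slurm on an XFS-equipped node.
if [[ -n "${SLURM_JOB_ID:-}" ]] && command -v xfs_quota >/dev/null 2>&1; then
    apply_quota "${SLURM_JOB_ID}" "/tmp" "/local" "10g"
fi
```

In a real setup the limit would come from the job's requested localtmp GRES rather than a constant, which is exactly the information the rest of this ticket is about obtaining.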
Comment 6 Marcin Stolarek 2021-12-28 03:30:34 MST
Manuel,

I see your point; however, just adding the JobId may not be optimal. I guess you were thinking of internally using tools like `scontrol show job JOBID` to get the required information. Such an approach would generate an additional REQUEST_JOB_INFO RPC for every job start (potentially from every node), which may in fact have a severe impact on scheduler performance (especially in an HTC environment).

We'll have an internal discussion on how to approach that. I'll keep you posted.

cheers,
Marcin
Comment 15 Manuel Holtgrewe 2022-02-08 05:31:04 MST
Hi, is there any news on this?
Comment 16 Marcin Stolarek 2022-02-08 05:35:58 MST
Manuel,

Sorry for the delay. We have a patch under review. We decided to add some basic environment variables (like SLURM_JOB_ID) to the script.

The change is targeted to Slurm 22.05, but should be easy to backport locally.

cheers,
Marcin
Comment 19 Marcin Stolarek 2022-03-02 01:25:04 MST
Manuel,

We've merged a basic environment setup for InitScript of job_container/tmpfs. This is in the master branch[1] and will be part of Slurm 22.05 release.

We're looking into further improvements in this area, since calling `scontrol show job` in the InitScript results in a high load on the slurmctld side, limiting system throughput. Those improvements (providing more information to the InitScript) require a more complicated rewrite, so I can't commit to anything more for Slurm 22.05 at the moment.
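For sites that do backport the change, one conceivable way to recover the requested GRES inside the InitScript is to parse `scontrol show job` output, accepting the per-job-start RPC cost described above. The `localtmp` GRES name and the parsing below are illustrative assumptions, not anything shipped with Slurm:

```shell
#!/bin/bash
# Extract the requested localtmp GRES count from `scontrol show job` text.
# Note: each scontrol call is one REQUEST_JOB_INFO RPC per job start --
# the throughput concern raised in this ticket.

localtmp_request() {
    # Reads `scontrol show job <id>` output on stdin and prints the
    # first localtmp count found (e.g. "50G"), or nothing if absent.
    grep -oE 'localtmp:[0-9]+[KMGT]?' | head -n1 | cut -d: -f2
}

# Real usage (commented out to keep this sketch self-contained):
# LIMIT="$(scontrol show job "${SLURM_JOB_ID}" | localtmp_request)"
```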

Is there anything else I can help you with in this bug report?

cheers,
Marcin

[1]https://github.com/SchedMD/slurm/commit/e25270e53f57be9aae48759ea5fdd57c9f7eb6b6
Comment 20 Manuel Holtgrewe 2022-03-03 09:09:20 MST
Hi Marcin,

thanks a lot for this already! I'll have a look at whether we can bear the additional RPC pressure.

Best wishes,
Manuel