Ticket 15787

Summary: interactive job with same environment as regular job
Product: Slurm
Reporter: hpc-admin
Component: User Commands
Assignee: Oscar Hernández <oscar.hernandez>
Status: RESOLVED TIMEDOUT
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 22.05.7   
Hardware: Linux   
OS: Linux   
Site: Ghent

Description hpc-admin 2023-01-13 04:56:35 MST
hello,

during handling of issue 15614 we were told that we were starting interactive jobs "the wrong way", and it also appears there is no single correct way; and even then it's not really providing us what we are looking for.

we as admins (and also most of our users) would like to use some form of interactive job option combo that gives us the same environment as a regular jobscript, typically to debug jobscripts. for that we e.g. now start tmux via a job and connect to the tmux session, but that is a bit complicated to say the least.

from what we understand now we would have to set overlap, but setting overlap is hardly the same as a regular job.

also, users have to be educated to choose upfront between exclusive and/or overlap, which is another thing we want to avoid. sure advanced users can be explained the difference, but that is not most of our userbase.

we will probably develop a wrapper script to do the tmux trickery, for which it would be helpful if there was an sbatch option to wait till the job has started (afaik, there is now an option to wait till the job has finished; but no way to only wait for its start)


stijn
Comment 1 Oscar Hernández 2023-01-18 04:34:27 MST
Hi Stijn,

>(afaik, there is now an option to wait till the job finished;
>but no way to only wait for its start)
That's right, this feature is not currently available and should be treated as an enhancement. However, as I see it, "salloc" already covers that part, so ideally we should try to get "salloc" to behave the way you need.
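In the meantime, a wrapper can approximate "wait until the job has started" from the outside by polling the job state. The following is only a minimal sketch under stated assumptions, not an existing Slurm feature: the helper names `query_state` and `wait_for_start` are hypothetical, it assumes a single non-array job, and it assumes `squeue -h -j <jobid> -o %T` prints the extended state name (for example PENDING or RUNNING) for that job.

```python
import subprocess
import time


def query_state(job_id):
    """Return the state squeue reports for the job (e.g. 'PENDING', 'RUNNING').

    Returns '' when squeue prints nothing, for example because the job has
    already left the queue. Assumes a single, non-array job.
    """
    result = subprocess.run(
        ["squeue", "-h", "-j", str(job_id), "-o", "%T"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def wait_for_start(job_id, get_state=query_state, poll_interval=5.0, timeout=3600.0):
    """Poll until the job reports RUNNING; return True if that happens.

    Returns False if the job leaves the waiting states without ever being
    seen as RUNNING, or if the timeout (in seconds) expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_state(job_id)
        if state == "RUNNING":
            return True
        # Anything other than a waiting state means the job will not start
        # (cancelled, failed, or already gone from the queue).
        if state not in ("PENDING", "CONFIGURING"):
            return False
        time.sleep(poll_interval)
    return False
```

A wrapper could obtain the job id from `sbatch --parsable` (taking the part before any `;`), call `wait_for_start(job_id)`, and only then attach to the tmux session. The `get_state` parameter exists so the polling logic can be exercised without a running cluster.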

I went through bug 15614, but several different things are discussed there, and I am not sure what your current issue with the way "salloc" works is (that is, which difference from sbatch is causing trouble).

Could it be possible for you to give a practical example? Stating the commands, what you are granted and what would be desired instead.

Also, did you finally set the option LaunchParameters=use_interactive_step?

Looking at your wrappers, it is not clear to me what the purpose of launching an 'srun' within the salloc command is. By default, salloc should already give you a bash terminal.

I think this will help us get on point, to evaluate whether there is any available option to achieve your goals, or rather something is not working as expected.

Thanks!
Oscar
Comment 2 Oscar Hernández 2023-02-01 08:34:14 MST
Hi Stijn,

I am just checking in, in case you can provide any of the feedback requested in my last comment. Are you still having issues with the way salloc grants resources?

Kind regards,
Oscar
Comment 3 Oscar Hernández 2023-02-22 11:59:41 MST
Hi Stijn,

Since we have not heard back in more than a month, I am closing this one as timed out.

Do not hesitate to reopen for any other question or to provide the extra details requested.

Kind regards,
Oscar