Ticket 12111

Summary: Can't get/set environment in job_submit.lua
Product: Slurm Reporter: Felix Abecassis <fabecassis>
Component: slurmctld Assignee: Marshall Garey <marshall>
Status: RESOLVED INFOGIVEN QA Contact: Ben Roberts <ben>
Severity: 3 - Medium Impact    
Priority: --- CC: jbernauer, lyeager, nate
Version: 21.08.x   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=9389
https://bugs.schedmd.com/show_bug.cgi?id=9260
https://bugs.schedmd.com/show_bug.cgi?id=19634
Site: NVIDIA (PSLA)

Description Felix Abecassis 2021-07-22 16:44:11 MDT
The documentation (https://slurm.schedmd.com/job_submit_plugins.html) has the following example:
if (job_desc.environment.LANGUAGE == "en_US") then

But from a quick test, I cannot get/set job_desc.environment in job_submit.lua. Using the following script:
function slurm_job_submit(job_desc, part_list, submit_uid)
    if (job_desc.environment.LANGUAGE == "en_US") then
        slurm.log_user("US")
    end
    slurm.log_user("%s", job_desc.environment.PATH)

    return slurm.SUCCESS
end

Submitting a job with an explicit environment doesn't work:
$ srun -A admin --export ALL,PATH=/usr/bin:/bin true


slurmctld shows the following error messages:
slurmctld: error: _job_env_field: job_desc->environment is NULL
slurmctld: error: _job_env_field: job_desc->environment is NULL
slurmctld: error: job_submit/lua: /etc/slurm-llnl/job_submit.lua: [string "slurm.user_msg (string.format(table.unpack({...."]:1: bad argument #2 to 'format' (no value)
Comment 1 Felix Abecassis 2021-07-22 16:49:41 MDT
Note: the explicit environment was only an example; it doesn't work with a plain invocation either:
$ srun -A admin true
Comment 6 Marshall Garey 2021-07-23 15:58:35 MDT
sbatch sends the environment to slurmctld, but since salloc/srun communicate directly with the compute nodes, they do not send the environment to slurmctld.

So your example job_submit.lua script should work with sbatch but not salloc/srun.

Here's another example I put together:

function slurm_job_submit(job_desc, part_list, submit_uid)
	if (job_desc.environment ~= nil) then
		if (job_desc.environment["FOO"] ~= nil) then
			slurm.log_info("Found env FOO=%s", job_desc.environment["FOO"])
		end
	end

	return slurm.SUCCESS
end


sbatch works:

$ FOO=bar sbatch --wrap='whereami'

slurmctld log:
[2021-07-23T15:51:37.983] lua: Found env FOO=bar


srun and salloc do not:

$ FOO=bar srun whereami

slurmctld log:
[2021-07-23T15:58:09.911] error: _job_env_field: job_desc->environment is NULL
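Because the lookup yields no value for srun/salloc submissions, passing it straight to a format string fails the way the script in the description did. Below is a minimal sketch of a nil-safe lookup. `env_or_default` is a hypothetical helper written for illustration, not part of the Slurm Lua API, and the "(not available)" fallback string is likewise an assumption. Note that, as the log above shows, slurmctld may still print the `_job_env_field` error when the environment is absent; the helper only keeps the script itself from failing.

```lua
-- Hypothetical helper (not provided by Slurm): return the value of the
-- environment variable `name` from job_desc, or `default` when the
-- environment or the variable is unavailable (e.g. for srun/salloc).
function env_or_default(job_desc, name, default)
	local env = job_desc.environment
	if env == nil then
		return default
	end
	local value = env[name]
	if value == nil then
		return default
	end
	return value
end

function slurm_job_submit(job_desc, part_list, submit_uid)
	-- The fallback guarantees that the format call always receives a string.
	slurm.log_user("PATH=%s", env_or_default(job_desc, "PATH", "(not available)"))

	return slurm.SUCCESS
end
```

With this variant, an sbatch submission should report the submitted PATH, while srun/salloc submissions report the fallback string instead of raising the 'format' error.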




I'll work on a patch to document this on our job_submit web page.
Comment 9 Marshall Garey 2021-07-23 16:11:52 MDT
Felix,

I assume that you would like to actually set environment variables for any job, not just sbatch jobs. In that case you could consider a CliFilter plugin. This was discussed in bug 9389 and Trey helpfully shared his cli_filter/lua plugin that sets environment variables:

https://bugs.schedmd.com/show_bug.cgi?id=9389#c22

The advantage of setting environment variables with CliFilter is that it runs on the node where salloc/sbatch/srun were called instead of the slurmctld node, so it frees up computing time for the slurmctld.

Let me know if CliFilter doesn't work for you.
Comment 10 Felix Abecassis 2021-07-23 16:14:35 MDT
Yes, we are currently using cli-filter too.

I started testing a new minor feature with the job_submit plugin, and was surprised it didn't work. I see that the reason is tricky.
Comment 11 Felix Abecassis 2021-07-23 16:19:23 MDT
So yes, not a big deal if it's not going to be fixed. Modifying the documentation will be enough and I will change my solution.
Comment 13 Marshall Garey 2021-07-27 08:03:42 MDT
Felix,

We've updated the documentation in commit a5993ef2e1706a80. This will be live on our website when 21.08 is released in August.

Thanks for the bug report. Let me know if there's anything else I can do.

Closing as infogiven.