Ticket 16820 - slurmstepd: Could not run user task_epilog access denied
Summary: slurmstepd: Could not run user task_epilog access denied
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmstepd
Version: 22.05.7
Hardware: Linux Linux
Severity: 3 - Medium Impact
Assignee: Megan Dahl
QA Contact:
URL:
Duplicates: 16821
Depends on:
Blocks:
 
Reported: 2023-05-25 04:12 MDT by Regine Gaudin
Modified: 2023-06-07 15:44 MDT (History)
Site: CEA
Version Fixed: 23.02.3


Description Regine Gaudin 2023-05-25 04:12:45 MDT
A user reports an error when using --task-epilog. The command is:
srun -n 1  --task-epilog=$HOME/myepilog.sh -p rome hostname

I have replaced the full path of my home directory with $HOME. The output is:

srun: job 2197056 queued and waiting for resources
srun: job 2197056 has been allocated resources
inti6011
slurmstepd-inti6011: error: Could not run user task_epilog [$HOME/myepilog.sh]: access denied
slurmstepd-inti6011: error: TaskEpilog failed status=-1

The reason is that, in _run_script_as_user, the access test is done by root
before becoming the user. Because we use root squash on our filesystems, root
cannot access the user's script file:

_run_script_as_user(const char *name, const char *path, stepd_step_rec_t *job,
                    int max_wait, char **env)
{
        int status, rc, opt;
        pid_t cpid;
        struct exec_wait_info *ei;

        xassert(env);
        if (path == NULL || path[0] == '\0')
                return 0;

        debug("[job %u] attempting to run %s [%s]", job->step_id.job_id, name, path);

        if (!_access(path, 5, job->uid, job->ngids, job->gids)) {
                error("Could not run %s [%s]: access denied", name, path);
                return -1;
        }

and after

        if (_become_user(job, &sprivs) < 0) {
                error("run_script_as_user _become_user failed: %m");
                /* child process, should not return */
                exit(127);
        }

_access() calls stat(); running strace on slurmstepd shows the result:
stat("$HOME/myepilog.sh", 0x7ffecdcda370) = -1 EACCES (Permission denied)

Is the _access() call necessary? execve() will perform the same check, but as the user.
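
To illustrate the failure mode described above, here is a minimal sketch of a stat()-based pre-check. It is a hypothetical re-creation for discussion, not Slurm's actual _access(); the name mode_bits_allow and its parameters are assumptions. The point is that stat() is evaluated with the credentials of the calling process, so the uid and gids passed in only matter if stat() itself succeeds.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical pre-check helper; NOT Slurm's _access().
 * Returns true if the mode bits of `path` grant every bit in `want`
 * (4 = read, 2 = write, 1 = execute) to `uid` with groups `gids`.
 */
static bool mode_bits_allow(const char *path, int want, uid_t uid,
                            const gid_t *gids, int ngids)
{
    struct stat st;

    /*
     * stat() runs with the CALLER's credentials, not `uid`. If the caller
     * is root and the file server squashes root, this can fail with EACCES
     * and the mode bits below are never examined.
     */
    if (stat(path, &st) != 0)
        return false;

    /* Owner class applies when the requested uid owns the file. */
    if (st.st_uid == uid)
        return ((st.st_mode >> 6) & want) == want;

    /* Group class applies when any of the supplied groups matches. */
    for (int i = 0; i < ngids; i++)
        if (st.st_gid == gids[i])
            return ((st.st_mode >> 3) & want) == want;

    /* Otherwise fall back to the "other" class. */
    return (st.st_mode & want) == want;
}
```

With a mask of 5 (read and execute), a script with mode 0750 owned by the job user passes this check only when the process calling stat() is able to reach the file in the first place.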
Comment 1 Jason Booth 2023-05-25 06:07:35 MDT
*** Ticket 16821 has been marked as a duplicate of this ticket. ***
Comment 2 Jason Booth 2023-05-25 10:38:36 MDT
This should run as the user. Would you verify if "myepilog.sh" has the execution bit set on it?

Without the execution bit set:

>:~/slurm/23.02$ chmod -x myepilog.sh 
>:~/slurm/23.02$ srun -n 1  --task-epilog=myepilog.sh  hostname
>srun: Max_Nodes 4294967294
>srun: Min_Cpu 1
>nh-grey
>slurmstepd-n1: error: Could not run user task_epilog [myepilog.sh]: access denied
>slurmstepd-n1: error: TaskEpilog failed status=-1

With the execution bit set:
>:~/slurm/23.02$ chmod +x myepilog.sh 
>:~/slurm/23.02$ srun -n 1  --task-epilog=myepilog.sh  hostname
>srun: Max_Nodes 4294967294
>srun: Min_Cpu 1
>nh-grey
Comment 4 Regine Gaudin 2023-06-01 01:45:57 MDT
The execute (+x) permission was the first thing I verified.
Even with mode 777 I get the same error.

Once again: according to the _run_script_as_user source, the access test is done by root, and root squash does not allow root to access the epilog.


chmod -x myepilog.sh
srun -n 1  --task-epilog=myepilog.sh -p rome hostname
srun: job 2262502 queued and waiting for resources
srun: job 2262502 has been allocated resources
inti6004
slurmstepd-inti6004: error: Could not run user task_epilog [myepilog.sh]: access denied
slurmstepd-inti6004: error: TaskEpilog failed status=-1

[gaudinr@login1 gaudinr] $ chmod +x myepilog.sh
[gaudinr@login1 gaudinr] $ ls -altr myepilog.sh 
-rwxr-x--- 1 gaudinr mygroup 36 May 23 15:46 myepilog.sh
[gaudinr@login1 gaudinr] $ srun -n 1  --task-epilog=myepilog.sh  -p rome hostname
srun: job 2262511 queued and waiting for resources
srun: job 2262511 has been allocated resources
inti6004
slurmstepd-inti6004: error: Could not run user task_epilog [myepilog.sh]: access denied
slurmstepd-inti6004: error: TaskEpilog failed status=-1
Comment 5 Megan Dahl 2023-06-02 15:04:41 MDT
Hi Regine,

Yes, you are correct that the stat() call in _access() is done by root. From what I can tell, the access call is there for easier error handling. However, it can probably be moved after the _become_user() call. I’ll just need to double-check that.
Comment 10 Regine Gaudin 2023-06-06 08:12:03 MDT
> However, it probably can be moved to be after the _become_user() call.

I have tried that. However, I question the usefulness of the stat() call, because execve() will also check access.

I moved the _access() call after _become_user() and chdir(), and it works with this patch:
diff -rau slurm-22.05.7.orig/src/slurmd/slurmstepd/mgr.c slurm-22.05.7.work/src/slurmd/slurmstepd/mgr.c
--- slurm-22.05.7.orig/src/slurmd/slurmstepd/mgr.c      2023-05-25 11:11:37.867828081 +0200
+++ slurm-22.05.7.work/src/slurmd/slurmstepd/mgr.c      2023-06-06 16:01:55.730722847 +0200
@@ -2876,12 +2876,6 @@
        if (path == NULL || path[0] == '\0')
                return 0;

-       debug("[job %u] attempting to run %s [%s]", job->step_id.job_id, name, path);
-
-       if (!_access(path, 5, job->uid, job->ngids, job->gids)) {
-               error("Could not run %s [%s]: access denied", name, path);
-               return -1;
-       }

        if ((ei = _fork_child_with_wait_info(0)) == NULL) {
                error ("executing %s: fork: %m", name);
@@ -2940,9 +2934,18 @@
                        exit(127);
                }

+
                if (chdir(job->cwd) == -1)
                        error("run_script_as_user: couldn't "
                              "change working dir to %s: %m", job->cwd);
+
+               debug("[job %u] attempting to run %s [%s]", job->step_id.job_id, name, path);
+
+               if (!_access(path, 5, job->uid, job->ngids, job->gids)) {
+                  error("Could not run %s [%s]: access denied", name, path);
+                  return -1;
+               }
+
                setpgid(0, 0);


 $ srun --task-epilog=myepilog.sh  -p a100-bxi hostname
inti7802


But if I put the access before after the become_user but after the if (chdir(job->cwd) == -1)
the job fails with the following

 srun --task-epilog=myepilog.sh  -p a100-bxi hostname
inti7802
slurmstepd-inti7802: error: Could not run user task_epilog [myepilog.sh]: access denied
slurmstepd-inti7802: error: TaskEpilog failed status=-1
slurmstepd-inti7802: error: common_file_write_content: unable to open '/dev/cgroup/freezer/slurm_inti7802/uid_3141/job_1042076/step_0/freezer.state' for writing: Permission denied
srun: error: eio_handle_mainloop: Abandoning IO 60 secs after job shutdown initiated
Comment 11 Regine Gaudin 2023-06-06 08:13:41 MDT
Note that the root access problem here is due to the use of squashfs, but it will also be encountered with containerized jobs, which are becoming more and more widespread.
Comment 12 Regine Gaudin 2023-06-06 08:39:09 MDT
I meant: "But if I put the access after the become_user but before the if (chdir(job->cwd) == -1), the job fails".
Comment 14 Megan Dahl 2023-06-06 13:47:50 MDT
It is strange that the chdir() has to be called first. You are correct that the stat() call is not really needed. In the patch that I am working on the _access() call is removed and the error handling is moved to the execve().

There is a slight issue with your current fix. The child process should not call return since that will result in two slurmstepds running.
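
The two points above (letting execve() report the error, and never returning from the child) can be sketched as follows. This is an illustrative example only and does not reproduce the actual patch; run_script is a hypothetical helper, and the 126/127 exit codes follow the common shell convention rather than anything stated in this ticket.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not Slurm code: run `path` in a child process and
 * return its exit status, or -1 if fork()/waitpid() fails or the child
 * did not exit normally.
 */
static int run_script(const char *path, char *const env[])
{
    pid_t cpid = fork();

    if (cpid < 0)
        return -1;

    if (cpid == 0) {
        char *const argv[] = { (char *)path, NULL };

        /*
         * A real implementation would drop privileges here (setgroups(),
         * setgid(), setuid()) so that execve() is evaluated as the user.
         */
        execve(path, argv, env);

        /* Only reached if execve() failed; save errno before printing. */
        int err = errno;
        fprintf(stderr, "Could not run [%s]: %s\n", path, strerror(err));

        /*
         * Leave with _exit(), never `return`: returning would let the
         * child keep executing the caller's code alongside the parent.
         */
        _exit(err == ENOENT ? 127 : 126);
    }

    int status;

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(cpid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the permission check happens inside execve(), it is made with whatever identity the child holds at that moment, which avoids a separate check performed as root.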
Comment 18 Megan Dahl 2023-06-07 15:44:38 MDT
The permission denied error for epilog tasks when root_squash is set has now been resolved. _run_script_as_user() no longer calls _access() which ran stat() as root. The access check now is handled by execve() which runs as the user.

commit 5328f61118b1bae2cac353aa85e7adfba13e52f1
Author:     Megan Dahl <megan@schedmd.com>
AuthorDate: Tue Jun 6 14:50:49 2023 -0600

    Fix permission denied for prolog and epilog tasks with root_squash.
    
    Removed the _access() call in _run_script_as_user() so that the access
    check is done by execve() instead.
    
    This avoids having _access() throw a permission denied error when
    root_squash is set as we have yet to setuid() into the user.
    
    Bug 16820

This change will be available in 23.02.3.