Ticket 16821

Summary: slurmstepd: Could not run user task_epilog access denied
Product: Slurm Reporter: Regine Gaudin <regine.gaudin>
Component: slurmstepd Assignee: Director of Support <support>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 3 - Medium Impact    
Priority: ---    
Version: 22.05.7   
Hardware: Linux   
OS: Linux   
Site: CEA Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Regine Gaudin 2023-05-25 04:18:57 MDT
Hi

A user reports an error when using --task-epilog. The failing command is:
srun -n 1  --task-epilog=$HOME/myepilog.sh -p rome hostname

I have replaced the full path of my home directory with $HOME.

srun: job 2197056 queued and waiting for resources
srun: job 2197056 has been allocated resources
inti6011
slurmstepd-inti6011: error: Could not run user task_epilog [$HOME/myepilog.sh]: access denied
slurmstepd-inti6011: error: TaskEpilog failed status=-1

The reason is that the access test in _run_script_as_user is performed as
root, before the process becomes the user. Because our filesystems use root
squash, root cannot access the user's script file:

_run_script_as_user(const char *name, const char *path, stepd_step_rec_t *job,
                    int max_wait, char **env)
{
        int status, rc, opt;
        pid_t cpid;
        struct exec_wait_info *ei;

        xassert(env);
        if (path == NULL || path[0] == '\0')
                return 0;

        debug("[job %u] attempting to run %s [%s]", job->step_id.job_id, name, path);
        if (!_access(path, 5, job->uid, job->ngids, job->gids)) {
                error("Could not run %s [%s]: access denied", name, path);
                return -1;
        }

and afterwards, in the child process:

if (_become_user(job, &sprivs) < 0) {
        error("run_script_as_user _become_user failed: %m");
        /* child process, should not return */
        exit(127);
}



_access calls stat. Running strace on slurmstepd shows the result:
stat("$HOME/myepilog.sh", 0x7ffecdcda370) = -1 EACCES (Permission denied)


Is the _access call necessary? execve will perform the same check, but as the user.
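
To illustrate the question, here is a minimal sketch of letting execve do the permission check after privileges have been dropped. This is not Slurm code: run_script_as_user_sketch is a hypothetical helper, it uses plain setgid/setuid instead of _become_user, it skips supplementary groups, and the 126/127 exit codes are only a shell-like convention chosen for the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Slurm): fork, switch to the target
 * uid/gid in the child, and only then try to execute the script, so the
 * kernel checks access with the user's credentials rather than root's.
 *
 * Returns the script's exit status, 126 if execve failed with EACCES,
 * 127 for any other child-side failure, or -1 if fork/waitpid failed
 * or the child was killed by a signal.
 */
static int run_script_as_user_sketch(const char *path, uid_t uid, gid_t gid,
                                     char *const envp[])
{
        int status;
        pid_t cpid = fork();

        if (cpid < 0)
                return -1;

        if (cpid == 0) {
                /* Child: drop the group first, then the user.
                 * Supplementary groups are left out of this sketch. */
                if (setgid(gid) < 0 || setuid(uid) < 0)
                        _exit(127);

                char *const argv[] = { (char *)path, NULL };
                execve(path, argv, envp);

                /* Reached only if execve failed; errno says why. */
                _exit(errno == EACCES ? 126 : 127);
        }

        /* Parent: wait for the child, retrying if interrupted. */
        while (waitpid(cpid, &status, 0) < 0) {
                if (errno != EINTR)
                        return -1;
        }

        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape, an "access denied" condition would show up as exit code 126 from the child, which runs as the user, instead of being raised by a stat issued as root.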
Comment 1 Jason Booth 2023-05-25 06:07:35 MDT
Duplicate of bug#16820

*** This ticket has been marked as a duplicate of ticket 16820 ***