| Summary: | slurmstepd: Could not run user task_epilog access denied | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Regine Gaudin <regine.gaudin> |
| Component: | slurmstepd | Assignee: | Director of Support <support> |
| Status: | RESOLVED DUPLICATE | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | ||
| Version: | 22.05.7 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CEA | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
*** This ticket has been marked as a duplicate of ticket 16820 ***
Hi,

A user complains about an error when using --task-epilog. The failing command and its output are below (I have replaced the full path of my home directory with $HOME):

    srun -n 1 --task-epilog=$HOME/myepilog.sh -p rome hostname
    srun: job 2197056 queued and waiting for resources
    srun: job 2197056 has been allocated resources
    inti6011
    slurmstepd-inti6011: error: Could not run user task_epilog [$HOME/myepilog.sh]: access denied
    slurmstepd-inti6011: error: TaskEpilog failed status=-1

The reason is that the access test in _run_script_as_user is done as root, before becoming the user. As we are using root squash on our filesystems, root cannot access the user's script file:

    _run_script_as_user(const char *name, const char *path, stepd_step_rec_t *job,
                        int max_wait, char **env)
    {
        int status, rc, opt;
        pid_t cpid;
        struct exec_wait_info *ei;

        xassert(env);
        if (path == NULL || path[0] == '\0')
            return 0;

        debug("[job %u] attempting to run %s [%s]",
              job->step_id.job_id, name, path);

        if (!_access(path, 5, job->uid, job->ngids, job->gids)) {
            error("Could not run %s [%s]: access denied", name, path);
            return -1;
        }

and only afterwards:

        if (_become_user(job, &sprivs) < 0) {
            error("run_script_as_user _become_user failed: %m");
            /* child process, should not return */
            exit(127);
        }

_access calls stat, and strace on slurmstepd shows the result:

    stat("$HOME/myepilog.sh", 0x7ffecdcda370) = -1 EACCES (Permission denied)

Is the _access call necessary? execve will perform the same check, but as the user ....
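To make the question concrete, here is a minimal standalone sketch of the alternative ordering: the child drops privileges first and then lets execve do the permission check as the target user. This is not Slurm code. The names run_script_as_user and drop_privileges, the single-group setgroups call, and the 126/127 exit codes (borrowed from the shell convention) are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: switch to the target uid/gid unless the process
 * already runs with them. Supplementary groups are reduced to gid only. */
static int drop_privileges(uid_t uid, gid_t gid)
{
    if (geteuid() == uid && getegid() == gid)
        return 0;               /* nothing to do */
    if (setgroups(1, &gid) < 0)
        return -1;
    if (setgid(gid) < 0)        /* group first, while still privileged */
        return -1;
    if (setuid(uid) < 0)
        return -1;
    return 0;
}

/* Hypothetical sketch: run `path` as uid/gid with no prior access test.
 * Returns the script's exit code, 126 if execve reports EACCES, 127 for
 * any other setup or exec failure, and -1 if fork/waitpid fails or the
 * child was killed by a signal. */
int run_script_as_user(const char *path, uid_t uid, gid_t gid, char **env)
{
    int status;
    pid_t cpid;

    assert(env != NULL);
    if (path == NULL || path[0] == '\0')
        return 0;               /* no script configured */

    cpid = fork();
    if (cpid < 0)
        return -1;

    if (cpid == 0) {
        char *argv[] = { (char *) path, NULL };
        int saved_errno;

        if (drop_privileges(uid, gid) < 0) {
            fprintf(stderr, "drop_privileges failed: %s\n", strerror(errno));
            _exit(127);
        }
        /* The kernel checks the permissions here, as the user. */
        execve(path, argv, env);

        /* Only reached when execve failed. */
        saved_errno = errno;
        fprintf(stderr, "Could not run [%s]: %s\n", path,
                strerror(saved_errno));
        _exit(saved_errno == EACCES ? 126 : 127);
    }

    while (waitpid(cpid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this ordering a script on a root-squashed filesystem is only ever touched with the user's credentials, and a genuinely inaccessible script still fails, just inside the child rather than in the parent running as root.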