Ticket 6034 - starting job in directory without write permission
Summary: starting job in directory without write permission
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 18.08.1
Hardware: Linux Linux
Severity: 5 - Enhancement
Assignee: Unassigned Developer
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-11-14 12:21 MST by George Hwa
Modified: 2022-04-13 13:07 MDT

See Also:
Site: KLA-Tencor RAPID
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---


Description George Hwa 2018-11-14 12:21:19 MST
If a user submits a job (sbatch) from a directory in which he/she does NOT have write permission, the job obviously fails quickly and no output is generated. This is clearly a user error. However, nothing tells the user clearly what the problem is. When the administrator is called upon to investigate, there is no simple command that shows the problem is the job directory's permissions. What I have to do to help the user troubleshoot is:
  1. Run scontrol show job xxxx or sacct -j xxxx to find out which node the job ran on.
  2. Log in to that node and search slurmd.log.
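
Step 1 above can be partly scripted. The following is only a minimal sketch, assuming sacct is on the PATH and that the accounting fields JobID, NodeList, State, ExitCode and WorkDir are available at the site; the helper names parse_sacct_line and job_summary are hypothetical and not part of Slurm. It only reports where the job ran and from which directory; finding the actual error still means reading slurmd.log on that node.

```python
import subprocess


def parse_sacct_line(line):
    """Split one line of `sacct -n -P` output into named fields.

    Assumes the field order JobID|NodeList|State|ExitCode|WorkDir,
    matching the --format string used in job_summary() below.
    """
    # maxsplit=4 keeps any '|' inside the working directory path intact.
    job_id, node_list, state, exit_code, work_dir = line.strip().split("|", 4)
    return {
        "job_id": job_id,
        "node_list": node_list,
        "state": state,
        "exit_code": exit_code,
        "work_dir": work_dir,
    }


def job_summary(job_id):
    """Ask sacct where a job ran and from which directory (hypothetical helper)."""
    result = subprocess.run(
        [
            "sacct", "-j", str(job_id),
            "-n",  # no header line
            "-P",  # '|'-delimited output without a trailing delimiter
            "-X",  # job allocations only, skip individual steps
            "--format=JobID,NodeList,State,ExitCode,WorkDir",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_sacct_line(l) for l in result.stdout.splitlines() if l.strip()]
```

Calling job_summary with the job ID returns one dictionary per allocation; the node_list value is the node to log in to for step 2, and work_dir is the directory whose permissions are worth checking first.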

Is there a better way?

Thanks
George
Comment 4 Broderick Gardner 2018-11-19 17:01:16 MST
There is currently not a better way unfortunately. I am looking into what it would take to add an error message at submission time.
Comment 5 George Hwa 2018-11-19 17:19:30 MST
Thanks. That would save admins a lot of time!
Comment 6 Broderick Gardner 2018-11-30 09:08:11 MST
So I have specced this out a bit. It is an enhancement, so I'm changing the severity level accordingly. It isn't feasible to check for write permissions at submission time, so it is likely that my patch will change the reason for job failure in scontrol show jobs to something about being unable to open a file for IO.
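
Until such a change exists, a site could approximate the check on the submit host before calling sbatch. This is only a sketch of a possible workaround, not anything Slurm provides: the function name can_write_job_output is hypothetical, and the result is meaningful only under the assumption that the directory is mounted with the same path and permissions on the compute nodes and that the job writes its output to the submission directory.

```python
import os


def can_write_job_output(directory="."):
    """Return True if the current user could create files in `directory`.

    Hypothetical pre-submit check run on the submit host; it says nothing
    about how the compute node will see the same path.
    """
    # Creating a file in a directory needs write and search (execute) permission.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)
```

A wrapper script could call can_write_job_output(os.getcwd()) and print a warning instead of submitting when it returns False.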
Comment 7 George Hwa 2018-11-30 09:42:32 MST
That would be much better than what I have to do now.