Summary: | Improve job failure reason on permissions error or missing directory | ||
---|---|---|---|
Product: | Slurm | Reporter: | Ali Nikkhah <alin4> |
Component: | Accounting | Assignee: | Dominik Bartkiewicz <bart> |
Status: | OPEN --- | QA Contact: | |
Severity: | 5 - Enhancement | ||
Priority: | --- | CC: | alin4, ihmesa |
Version: | 21.08.6 | ||
Hardware: | Linux | ||
OS: | Linux | ||
See Also: | https://bugs.schedmd.com/show_bug.cgi?id=6034 https://bugs.schedmd.com/show_bug.cgi?id=14956 | ||
Site: | U WA Health Metrics | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA SIte: | --- | NoveTech Sites: | --- |
Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Tzag Elita Sites: | --- |
Linux Distro: | Ubuntu | Machine Name: | |
CLE Version: | Version Fixed: | ||
Target Release: | 23.11 | DevPrio: | --- |
Emory-Cloud Sites: | --- |
Description
Ali Nikkhah
2022-04-13 13:07:17 MDT
Hi,

Sorry that I didn't respond earlier. Unfortunately, this isn't easy to solve, and I am still looking for the best solution. Commit cfad2383bcc slightly changes this behavior: instead of 1:0, the ExitCode is now 0:53. Signal 53 falls in the real-time signal range and should be unique. I will let you know when I find the right solution, but I am afraid that we have no time to include this in 22.05.

Dominik

Hi,

In 23.02 we added the possibility of automatically creating directories for stdout/stderr output files. Unfortunately, 23.02 still does not provide an easy, user-available way to check whether a job failed because its stdout/stderr files could not be opened. Could we drop the severity level of this issue to enhancement?

Dominik

Thanks. The automatic creation of stdout/stderr directories should help significantly. I think dropping this to enhancement level is fine.
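
Until a dedicated failure reason exists, one rough workaround is to look for the 0:53 ExitCode mentioned in this thread. The sketch below is only an illustration. It assumes the ExitCode string has the usual `<exit>:<signal>` form reported by accounting, and that 0:53 indicates a stdout/stderr open failure as described for commit cfad2383bcc. The helper names `parse_exit_code` and `looks_like_output_open_failure` are hypothetical and not part of Slurm, and the value 53 may differ between versions.

```python
from typing import Tuple

# Assumption taken from this thread: after commit cfad2383bcc, a job whose
# stdout/stderr file cannot be opened reports ExitCode 0:53.
OUTPUT_OPEN_FAILURE_SIGNAL = 53


def parse_exit_code(field: str) -> Tuple[int, int]:
    """Split an ExitCode string of the form "<exit>:<signal>" into two ints."""
    exit_part, _, signal_part = field.strip().partition(":")
    # A missing signal part is treated as 0 (no signal).
    return int(exit_part), int(signal_part or 0)


def looks_like_output_open_failure(field: str) -> bool:
    """Hypothetical check: True when the ExitCode matches the 0:53 pattern."""
    exit_code, signal = parse_exit_code(field)
    return exit_code == 0 and signal == OUTPUT_OPEN_FAILURE_SIGNAL
```

For example, `looks_like_output_open_failure("0:53")` returns `True`, while the older `"1:0"` value does not match. Because other causes could in principle produce the same code, a match is a hint to check the output path and its permissions, not proof.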