Ticket 1220 - better error reporting
Summary: better error reporting
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 14.03.9
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Brian Christiansen
 
Reported: 2014-11-02 11:55 MST by Stuart Midgley
Modified: 2014-11-03 10:47 MST
Site: DownUnder GeoSolutions
Version Fixed: 15.08.0pre1


Attachments
Proposed patch. (9.81 KB, patch)
2014-11-03 09:56 MST, Brian Christiansen

Description Stuart Midgley 2014-11-02 11:55:55 MST
20141103095435 bud30:~> salloc -p idle,idleProc,teamProd,teamdev --mem=80000 --nice=0 -c 8 
salloc: error: Job submit/allocate failed: Invalid partition name specified


It would be good if it actually told me which partition was the problem...
Comment 1 Moe Jette 2014-11-03 07:38:40 MST
This is for Brian (or whoever works on this):
The job_allocate() function in job_mgr.c takes an argument of "char **err_msg" and this can include whatever details we want to pass back to the user about why his job was rejected.
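
To illustrate the out-parameter pattern mentioned above, here is a minimal sketch in C. It is not Slurm's job_allocate() and not the attached patch; the names validate_partitions, partition_exists and known_partitions are hypothetical, and the hard-coded partition table is an assumption made only so the example is self-contained. The only thing taken from the ticket is the idea of a "char **err_msg" argument that carries details back to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the partitions configured on a cluster. */
static const char *known_partitions[] = { "debug", "debug1", "debug2" };

/* Return 1 if the first len bytes of name match a known partition. */
int partition_exists(const char *name, size_t len)
{
    size_t n = sizeof(known_partitions) / sizeof(known_partitions[0]);
    for (size_t i = 0; i < n; i++) {
        if (strlen(known_partitions[i]) == len &&
            strncmp(known_partitions[i], name, len) == 0)
            return 1;
    }
    return 0;
}

/*
 * Hypothetical validator (an assumption, not Slurm code). Walks a
 * comma-separated partition list. On the first unknown name it stores
 * a heap-allocated message in *err_msg (the caller frees it) and
 * returns -1. Returns 0 and leaves *err_msg NULL if all names are known.
 */
int validate_partitions(const char *list, char **err_msg)
{
    static const char prefix[] = "invalid partition specified: ";

    assert(list != NULL && err_msg != NULL);
    *err_msg = NULL;

    for (const char *p = list; *p != '\0'; ) {
        size_t len = strcspn(p, ",");   /* length of the current name */
        if (len > 0 && !partition_exists(p, len)) {
            size_t size = sizeof(prefix) + len;  /* sizeof counts the NUL */
            *err_msg = malloc(size);
            if (*err_msg != NULL)
                snprintf(*err_msg, size, "%s%.*s", prefix, (int)len, p);
            return -1;
        }
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

A caller could print *err_msg ahead of its generic failure message and then free it. Stopping at the first unknown name keeps the sketch short; collecting every invalid name would be an equally reasonable design.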
Comment 2 Brian Christiansen 2014-11-03 09:56:49 MST
Created attachment 1391
Proposed patch.
Comment 3 Brian Christiansen 2014-11-03 09:58:53 MST
The proposed patch gives the following behavior:

brian@compy:~/slurm/master2/compy$ sbatch -p debugg ~/jobs/sleep.sh 
sbatch: error: invalid partition specified: debugg
sbatch: error: Batch job submission failed: Invalid partition name specified

brian@compy:~/slurm/master2/compy$ sbatch -p debug,debugg ~/jobs/sleep.sh 
sbatch: error: invalid partition specified: debugg
sbatch: error: Batch job submission failed: Invalid partition name specified

brian@compy:~/slurm/master2/compy$ sbatch -p debug,debug1,debugg ~/jobs/sleep.sh 
sbatch: error: invalid partition specified: debugg
sbatch: error: Batch job submission failed: Invalid partition name specified

brian@compy:~/slurm/master2/compy$ sbatch -p debug,debugg,debug1 ~/jobs/sleep.sh 
sbatch: error: invalid partition specified: debugg
sbatch: error: Batch job submission failed: Invalid partition name specified

brian@compy:~/slurm/master2/compy$ sbatch -p debug,debug2,debug1 ~/jobs/sleep.sh 
Submitted batch job 10
Comment 4 Brian Christiansen 2014-11-03 10:47:38 MST
This is fixed in the following commit in 15.08.0pre1:

https://github.com/SchedMD/slurm/commit/b0772490e243d7bba4801ad8653aa6be2ad178b4

Thanks,
Brian