Ticket 6892

Summary: job_submit.lua script prints error message twice
Product: Slurm
Reporter: Anthony DelSorbo <anthony.delsorbo>
Component: slurmctld
Assignee: Nate Rini <nate>
Status: RESOLVED FIXED
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 18.08.7   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=6142
https://bugs.schedmd.com/show_bug.cgi?id=6513
Site: NOAA
NOAA Site: NESCC
Machine Name: Any
Version Fixed: 18.08.8 19.05
Attachments: job_submit.lua file

Description Anthony DelSorbo 2019-04-19 18:18:44 MDT
We're experiencing an issue where an error message is being printed twice.  Here's an example of a command crafted to purposely trigger the corresponding message.

[Anthony.DelSorbo@sfe01:10 ~]$ sbatch -A nesccmgmt -J dep-main -o %x.o%j --wrap="sleep 10; exit 1"
sbatch: error: Batch submit error:  Must specify either number of nodes or number of tasks!
sbatch: error: Batch submit error:  Must specify either number of nodes or number of tasks!
sbatch: error: Batch job submission failed: Unspecified error

I've restarted both the slurmctld daemon on the server and the slurmd daemon on the client.

Please assist us in understanding what's triggering this double output.

I'll upload our job_submit.lua script shortly.

Thanks,

Tony.
Comment 1 Anthony DelSorbo 2019-04-20 13:42:28 MDT
Created attachment 9972 [details]
job_submit.lua file

I've attached the job_submit.lua file.
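
The attachment itself is not reproduced in this ticket text. For readers without access to it, the following is a minimal, hypothetical sketch of a job_submit.lua check that would emit a message like the one in the description. It is not the site's actual script. The helper name `unset` and the stand-in `slurm` table are assumptions made for illustration; the stand-in only exists so the file can be exercised with a plain Lua interpreter, and inside slurmctld the real `slurm` table provided by the plugin is used instead. According to comment 13 the double output was resolved by a patch to Slurm, so a script of this shape is not assumed to be the cause.

```lua
-- Hypothetical sketch, NOT the attached site script.
-- Outside slurmctld there is no global `slurm` table, so a small stand-in
-- is defined here. Inside slurmctld the existing table is left untouched.
slurm = slurm or {
    SUCCESS = 0,
    ERROR = -1,
    NO_VAL = 0xfffffffe,
    -- Stand-in for the plugin's user-facing message function: print locally.
    log_user = function(fmt, ...) print(string.format(fmt, ...)) end,
}

-- Hypothetical helper: treat a field as unset if it is nil or NO_VAL.
local function unset(v)
    return v == nil or v == slurm.NO_VAL
end

-- Called for each job submission; rejects jobs that give neither a node
-- count nor a task count, and sends the reason back to the submitting user.
function slurm_job_submit(job_desc, part_list, submit_uid)
    if unset(job_desc.min_nodes) and unset(job_desc.num_tasks) then
        slurm.log_user("Must specify either number of nodes or number of tasks!")
        return slurm.ERROR
    end
    return slurm.SUCCESS
end

-- Called for each job modification; this sketch accepts every change.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

With a check like this in place, the sbatch command from the description would be rejected because it passes neither a node count nor a task count, while the same command with an explicit node or task count would be accepted by this sketch.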
Comment 10 Nate Rini 2019-04-24 13:01:41 MDT
Tony

We have replicated the issue and are working on a patch.

--Nate
Comment 11 Anthony DelSorbo 2019-04-24 13:58:31 MDT
Excellent! Thank you Nate!



Comment 13 Nate Rini 2019-04-29 15:38:46 MDT
(In reply to Nate Rini from comment #10)
> We have replicated the issue and are working on a patch.

This patch has been applied to fix the issue:
https://github.com/SchedMD/slurm/commit/297a68806828fd1d6775ed1f4a480c30f3abb702

Please reply to this ticket if you have any issues or questions.
--Nate