Ticket 5764

Summary: nodes drain when SLURM_MPI_TYPE is exported
Product: Slurm
Reporter: krd103
Component: Scheduling
Assignee: Director of Support <support>
Status: RESOLVED WONTFIX
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: - Unsupported Older Versions   
Hardware: Linux   
OS: Linux   
Site: OCF
OCF Sites: Southampton University

Description krd103 2018-09-21 08:37:59 MDT
Whilst trying to debug an MPI issue, I tried to run a job with the following command:

sbatch --export=SLURM_MPI_TYPE=ibmmpi my_job.sh

Rather than failing because of an invalid MPI type, this caused the node to drain with the reason "batch job complete failure". The error appears to be reproducible for any value not listed by srun --mpi=list.
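
As a possible workaround until the job is rejected cleanly, the requested MPI type could be checked against the srun --mpi=list output before submitting. The following is only a sketch: the helper names mpi_type_is_listed and submit_with_mpi_type are hypothetical, and it assumes the listing prints each plugin name as a separate word (possibly prefixed with "srun:"), which may differ between Slurm versions.

```shell
#!/bin/sh
# Hypothetical helper: reads a plugin listing on stdin and succeeds only if
# the name given as $1 appears in it as a whole word.
# Note: this is a plain word match, so header words in the listing
# (for example "srun" or "types") would also match.
mpi_type_is_listed() {
    awk -v want="$1" '
        { gsub(/[:,]/, " ")                 # treat ":" and "," as separators
          for (i = 1; i <= NF; i++)
              if ($i == want) found = 1 }
        END { exit !found }                 # exit 0 only when a match was seen
    '
}

# Hypothetical wrapper: submit the job only when the type is listed.
# Assumption: the listing may be written to stderr, hence the 2>&1.
submit_with_mpi_type() {
    mpi_type=$1
    job_script=$2
    if srun --mpi=list 2>&1 | mpi_type_is_listed "$mpi_type"; then
        sbatch --export=SLURM_MPI_TYPE="$mpi_type" "$job_script"
    else
        echo "MPI type '$mpi_type' is not listed by srun --mpi=list; not submitting" >&2
        return 1
    fi
}
```

With these definitions loaded, submit_with_mpi_type ibmmpi my_job.sh would refuse to submit on a system where ibmmpi is not listed, instead of letting the node drain.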
Comment 2 Jason Booth 2018-09-21 14:03:26 MDT
Greetings, 

 I was not able to find you on our list of supported contacts. Unfortunately, these requests need to be filed through David Baker or Chris Hardacre at OCF.

-Jason
Comment 3 krd103 2018-09-24 01:56:18 MDT
Hi David,

Are you able to re-report the following bug?

Regards,

Keith
