Ticket 9528 - mpiexec does not propagate exit code
Summary: mpiexec does not propagate exit code
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 21.08.x
Hardware: Linux Linux
Severity: C - Contributions
Assignee: Tim Wickberg
Reported: 2020-08-06 11:20 MDT by Simon Byrne
Modified: 2023-06-27 10:51 MDT
Site: -Other-
Version Fixed: 23.11.0rc1


Attachments
Patch to correctly propagate exit code (385 bytes, patch)
2020-08-07 12:14 MDT, Simon Byrne

Description Simon Byrne 2020-08-06 11:20:05 MDT
$ mpiexec bash -c "exit 1" 

$ echo $? 
0

I'm not that familiar with Perl, but it looks like the return value of the system() call here:
https://github.com/SchedMD/slurm/blob/b1656169f48a73c1229560d8d19a902e1956aac7/contribs/torque/mpiexec.pl#L165
should be captured (https://perldoc.perl.org/functions/system.html), so something like:

my $exit_code = system($command);

system("rm -f $new_config") if($new_config);

exit($exit_code >> 8);
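
For readers less familiar with Perl, the same capture-then-clean-up pattern can be sketched in shell. This is only an illustration, not the code in mpiexec.pl: run_then_cleanup and its arguments are hypothetical names. Perl's system() returns the raw wait status, which is why the snippet above shifts it right by 8 bits; in shell, $? already holds the decoded exit code, so no shift is needed.

```shell
#!/bin/sh
# Hypothetical wrapper illustrating the pattern; not taken from mpiexec.pl.
# Usage: run_then_cleanup TMPFILE COMMAND [ARGS...]
run_then_cleanup() {
    cleanup_file=$1
    shift

    "$@"        # launch the real command
    rc=$?       # save its exit status before any other command overwrites $?

    # Remove the temporary file, if one was given. Without the saved rc,
    # the status of rm would be what the caller sees.
    if [ -n "$cleanup_file" ]; then
        rm -f "$cleanup_file"
    fi

    return "$rc"   # hand the launched command's status back to the caller
}
```

Called as run_then_cleanup "$tmp" bash -c "exit 1", this leaves 1 in $? rather than the 0 produced by the successful rm.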
Comment 3 Simon Byrne 2020-08-07 12:14:32 MDT
Created attachment 15353 [details]
Patch to correctly propagate exit code
Comment 4 Richard Berger 2023-06-26 17:41:17 MDT
Are there any plans to fix this issue? We've just applied that workaround locally, but it would be nice if it got into a future release.
Comment 5 Tim Wickberg 2023-06-27 10:51:43 MDT
Simon -

Thanks for the submission. It's finally upstream, and will be included in the 23.11 release later this fall. Commit details follow for reference.

- Tim

commit 5ab128ca079102305be383a1ae23c7a87543150b
Author:     Simon Byrne <simonbyrne@gmail.com>
AuthorDate: Tue Jun 27 10:43:52 2023 -0600

    torque/mpiexec - Propagate exit code from launched process.
    
    Bug 9528.