Ticket 1096 - Need SALLOC_HINT environment variable
Summary: Need SALLOC_HINT environment variable
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 14.03.3
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Danny Auble
 
Reported: 2014-09-10 02:31 MDT by Mark Shry
Modified: 2014-09-12 04:10 MDT

See Also:
Site: CRAY
Version Fixed: 14.03.8, 14.11.0pre5


Description Mark Shry 2014-09-10 02:31:31 MDT
Is it possible to get a SALLOC_HINT environment variable so that we can set up salloc to default to "nomultithread"? I know this issue was addressed for sbatch in bug 1052. We need something similar for salloc.

Current default behavior (without --hint):

> kmshry@clogin73:~> salloc -n256 -c3 srun hostname | sort -n | uniq -c
> salloc: Pending job allocation 6644
> salloc: job 6644 queued and waiting for resources
> salloc: job 6644 has been allocated resources
> salloc: Granted job allocation 6644
> salloc: Relinquishing job allocation 6644
> salloc: Job allocation 6644 has been revoked.
>      16 nid00072
>      16 nid00073
>      16 nid00074
>      16 nid00075
>      16 nid00076
>      16 nid00077
>      16 nid00078
>      16 nid00079
>      16 nid00080
>      16 nid00081
>      16 nid00082
>      16 nid00083
>      16 nid00084
>      16 nid00085
>      16 nid00086
>      16 nid00087

Current behavior with hint:

> kmshry@clogin73:~> salloc -n256 -c3 --hint=nomultithread srun hostname | sort -n | uniq -c
> salloc: Pending job allocation 6646
> salloc: job 6646 queued and waiting for resources
> salloc: job 6646 has been allocated resources
> salloc: Granted job allocation 6646
> salloc: Relinquishing job allocation 6646
> salloc: Job allocation 6646 has been revoked.
>       8 nid00072
>       8 nid00073
>       8 nid00074
>       8 nid00075
>       8 nid00076
>       8 nid00077
>       8 nid00078
>       8 nid00079
>       8 nid00080
>       8 nid00081
>       8 nid00082
>       8 nid00083
>       8 nid00084
>       8 nid00085
>       8 nid00086
>       8 nid00087
>       8 nid00088
>       8 nid00089
>       8 nid00090
>       8 nid00091
>       8 nid00092
>       8 nid00093
>       8 nid00094
>       8 nid00095
>       8 nid00096
>       8 nid00097
>       8 nid00098
>       8 nid00099
>       8 nid00100
>       8 nid00101
>       8 nid00102
>       8 nid00103
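
For reference, the arithmetic behind the two listings can be checked with a short shell sketch. The task and node counts are read off the output above. The interpretation that each node exposes 48 logical CPUs on 24 physical cores is an inference from those counts, not something stated in this ticket.

```shell
#!/bin/sh
# Values taken from the salloc command lines and listings above.
tasks=256          # -n256
cpus_per_task=3    # -c3
nodes_default=16   # nid00072..nid00087, without --hint
nodes_hint=32      # nid00072..nid00103, with --hint=nomultithread

# Tasks placed on each node in the two runs.
tasks_per_node_default=$(( tasks / nodes_default ))
tasks_per_node_hint=$(( tasks / nodes_hint ))

# CPUs consumed per node. Assumption: 48 corresponds to hardware threads
# and 24 to physical cores on these nodes (inferred, not confirmed here).
cpus_per_node_default=$(( tasks_per_node_default * cpus_per_task ))
cpus_per_node_hint=$(( tasks_per_node_hint * cpus_per_task ))

echo "default:       $tasks_per_node_default tasks/node, $cpus_per_node_default CPUs/node"
echo "nomultithread: $tasks_per_node_hint tasks/node, $cpus_per_node_hint CPUs/node"
```

With the hint, the same 256 tasks use half as many CPUs per node and therefore spread over twice as many nodes.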
Comment 1 Danny Auble 2014-09-10 04:57:06 MDT
I'll work on it now.  I thought it was done at the same time, but it appears to be missing from the current code.
Comment 2 Danny Auble 2014-09-10 05:26:40 MDT
This is in commit b1ad21dabd9a692f223030d8e8046619d64467e7

Using SLURM_HINT will work in all cases, as will SBATCH_HINT and SALLOC_HINT for their respective commands.
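
A minimal sketch of how the new variable might be used, assuming a Slurm build that contains the commit above. The salloc command line is the one from the description. The `command -v` guard is only there so the snippet is harmless on a machine without Slurm.

```shell
#!/bin/sh
# Make nomultithread the default hint for salloc in this environment.
# SLURM_HINT could be exported instead to cover all of the commands.
export SALLOC_HINT=nomultithread

# Same request as in the description, but without --hint on the command line.
# Run only where salloc is actually installed.
if command -v salloc >/dev/null 2>&1; then
    salloc -n256 -c3 srun hostname | sort -n | uniq -c
fi
```

If the variable is picked up, the node listing should resemble the "with hint" output in the description rather than the default one. Sites wanting this for every user would typically export the variable from a login profile or module file, which is a deployment choice outside the scope of this ticket.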
Comment 3 Mark Shry 2014-09-12 04:10:51 MDT
This patch was installed on 9/11/2014. This issue is resolved.

Thanks