Ticket 3043 - salloc returns immediately upon allocation when nodes are still configuring
Summary: salloc returns immediately upon allocation when nodes are still configuring
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: KNL
Version: 16.05.4
Hardware: Cray XC Linux
Severity: 4 - Minor Issue
Assignee: Moe Jette
Reported: 2016-09-01 15:58 MDT by Doug Jacobsen
Modified: 2016-09-06 12:10 MDT
CC: 1 user

See Also:
Site: NERSC
Version Fixed: 16.05.5


Description Doug Jacobsen 2016-09-01 15:58:30 MDT
Hello,

I'm seeing salloc unblock long before an allocation is usable when the allocated nodes still need to be configured first:

dmj@login:~> salloc -p debug_knl -C quad,flat -N 10 -t 1:00:00 /bin/bash
salloc: Granted job allocation 15
salloc: Waiting for resource configuration
salloc: Nodes nid00[320-329] are ready for job
dmj@login:~> squeue -j $SLURM_JOB_ID
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                15 debug_knl     bash      dmj CF       0:29     10 nid00[320-329]
dmj@login:~>

I think it should block until the allocation is ready for the job, to reduce confusion about when the user can access the allocation. As an aside, it would be nice if interactive allocations like this informed the user that node reconfiguration was happening, e.g.:

...
salloc: Granted job allocation 15.
salloc: Reconfiguring nodes nid00[320-329] to quad,flat
salloc: Waiting for resource configuration
<pause until configuration complete>
salloc: Nodes nid00[320-329] are ready for job
...



Thanks,
Doug
Comment 1 Moe Jette 2016-09-01 16:38:48 MDT
salloc is designed to continue while the nodes are booting. I will investigate the inconsistency between the salloc log reporting the nodes ready while squeue shows them still configuring, but I think you want to use this salloc option:

--wait-all-nodes=<value>
              Controls when the execution of the command begins. By default
              the job will begin execution as soon as the allocation is made.

              0    Begin execution as soon as allocation can be made. Do not
                   wait for all nodes to be ready for use (i.e. booted).

              1    Do not begin execution until all nodes are ready for use.
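With that option, the session from the original report would block until the nodes finish booting before the shell starts, roughly like this (a sketch; the output lines are taken from the report, only the added flag and the timing of the prompt differ):

```
dmj@login:~> salloc -p debug_knl -C quad,flat -N 10 -t 1:00:00 --wait-all-nodes=1 /bin/bash
salloc: Granted job allocation 15
salloc: Waiting for resource configuration
<blocks here until the nodes are booted>
salloc: Nodes nid00[320-329] are ready for job
```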
Comment 2 Doug Jacobsen 2016-09-02 08:13:21 MDT
Great, that makes sense.  Is there a way to make `--wait-all-nodes=1` a
default behavior from slurm.conf? I suppose we can always set
SALLOC_WAIT_ALL_NODES=1 in the default environment.

Thanks!
Doug

Comment 3 Moe Jette 2016-09-02 08:23:37 MDT
(In reply to Doug Jacobsen from comment #2)
> Great, that makes sense.  Is there a way to make `--wait-all-nodes=1` a
> default behavior from slurm.conf? I suppose we can always set
> SALLOC_WAIT_ALL_NODES=1 in the default environment.

The command-line option or an environment variable is your only option today, but it will be easy to add a configuration parameter to make that the default behaviour. I'll get that to you soon.
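In the meantime, the environment-variable route can be applied site-wide with a small profile fragment. A minimal sketch, assuming a standard /etc/profile.d login setup (the path and filename are hypothetical; adjust for your environment):

```shell
# /etc/profile.d/slurm_salloc.sh (hypothetical path/filename)
# Make salloc block until all allocated nodes have finished booting,
# equivalent to passing --wait-all-nodes=1 on every invocation.
export SALLOC_WAIT_ALL_NODES=1
```

A user can still opt out of the default on a per-job basis with `salloc --wait-all-nodes=0`, since the command-line flag takes precedence over the environment variable.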
Comment 4 Moe Jette 2016-09-06 12:10:26 MDT
I just added an option for this capability in Slurm version 16.05.5. Once you install that (or apply the patch if you are anxious), add the "salloc_wait_nodes" option to the SchedulerParameters parameter in slurm.conf; that will cause salloc to wait for node boot completion by default. The salloc option "--wait-all-nodes=0" would override it. The commit is here:
https://github.com/SchedMD/slurm/commit/2670edc47c9ed715f52fcf3144e301fc9ee6b4b5
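Once on 16.05.5, the slurm.conf change would look something like this (a sketch; SchedulerParameters is a comma-separated list, so merge salloc_wait_nodes with whatever values your config already sets rather than replacing them):

```
# slurm.conf -- make salloc wait for node boot by default (16.05.5+)
SchedulerParameters=salloc_wait_nodes
```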