Ticket 3307

Summary: All nodes which are allocated for this job are already filled
Product: Slurm Reporter: Sanjaya Gajurel <sxg125>
Component: Scheduling Assignee: Tim Wickberg <tim>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 3 - Medium Impact    
Priority: ---    
Version: - Unsupported Older Versions   
Hardware: Linux   
OS: Linux   
Site: Case

Description Sanjaya Gajurel 2016-12-01 11:58:43 MST
Hi,

This is the issue our faculty member is reporting. We would appreciate your help.

-------------
I'm bumping into a new problem under SLURM that did not exist under the previous PBS TORQUE setup. I know that because I've run the exact same code and requested the exact same configuration of nodes under SLURM as before, yet with a different outcome. I can also tell because my CRAN package MVR no longer works on the cluster (I know it works because it passes checks OK on all platforms and was accepted by the CRAN reviewers: see here).

My script contains R functions, each of which is designed to configure a parallel backend, run the parallel commands, and then close the backend when done (as it should). FYI, I use the CRAN package 'parallel' and its functions makeCluster() and stopCluster() to do that. What happens is that SLURM no longer lets the job proceed: once the first function finishes and execution reaches the second function, I get the following message:

--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------

The job is not killed, though; it just stalls (see for instance job #2459505, currently stalled).

This is potentially a serious problem because, as a result, some CRAN packages, like mine (MVR, PRIMsrc), will also stop working. Is there a way to configure the cluster, or to add a SLURM script, to allow re-configuring a parallel backend cluster within the same job?
-------------------------------------------------------

Thanks

-Sanjaya
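
For context, a job like this would typically be submitted with a batch script along the following lines (a minimal sketch; the script path, module names, and node/task counts are assumptions, not details taken from this ticket):

```
#!/bin/bash
#SBATCH --job-name=mvr-parallel   # hypothetical job name
#SBATCH --nodes=2                 # assumed allocation; match the old PBS request
#SBATCH --ntasks=16               # one task per MPI worker makeCluster() will spawn
#SBATCH --output=slurm-%j.out

module load R openmpi             # site-specific; module names are assumptions

# Launch one master R process; Rmpi/Open MPI spawn the workers inside this
# allocation, so the worker count passed here must fit within --ntasks.
mpirun -np 1 Rscript cluster_test.R 15
```

The key constraint is that an MPI-backed makeCluster() call consumes MPI ranks from the job's allocation; a second spawn within the same allocation can fail with the "already filled" message quoted above.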
Comment 1 Tim Wickberg 2016-12-01 12:11:56 MST
Do you have some examples of the scripts, and the underlying calls it's trying to make to Slurm?

There's not much for me to work from based on that description. I have no idea what makeCluster() and stopCluster() would be doing, and it sounds like "All nodes which are allocated for this job are already filled." is an OpenMPI error message. All of that is outside the scope of our support model.
Comment 2 Sanjaya Gajurel 2016-12-02 15:07:20 MST
Hi Tim,

Here is the R script for your reference.

------------------------------

library("parallel")

if (.Platform$OS.type == "unix") {
    if (require("Rmpi")) {
        cat("Rmpi is loaded correctly\n")   # cat() renders the newline; print() would not
    } else {
        stop("Rmpi must be installed first")
    }
}

# =================#
# Setting working directory #
# =================#
setwd(dir=file.path(Sys.getenv("HOME"), "CODES/R/ADMIN/Parallel/Slurm",
                    fsep=.Platform$file.sep))

# =================#
# Retrieving argument passed from the command line #
# =================#
args <- commandArgs(trailingOnly=TRUE)

# =================#
# Cluster configuration #
# =================#
if (.Platform$OS.type == "unix") {
   conf <- list("cpus"=as.integer(args[1]),  # command-line args arrive as strings; coerce to a worker count
                "type"="MPI",
                "homo"=TRUE,
                "verbose"=TRUE,
                "outfile"=paste(getwd(), "/output.txt", sep=""))
}

# =================#
# Data, Procedures #
# =================#
n <- 1e7
tasks <- list(1:n, 1:n, 1:n, 1:n, 1:n, 1:n, 1:n, 1:n)

mymean <- function(x) {
    return(mean(cos(exp(sin(x)))))
}

foo <- function (conf, tasks, fun) {
   # Setting the cluster up
   if (conf$type == "SOCK") {
      clus.rep <- parallel::makeCluster(spec=conf$names,
                              type=conf$type,
                              homogeneous=conf$homo,
                              outfile=conf$outfile,
                              verbose=conf$verbose)
   } else {
      clus.rep <- parallel::makeCluster(spec=conf$cpus,
                              type=conf$type,
                              homogeneous=conf$homo,
                              outfile=conf$outfile,
                              verbose=conf$verbose)
   }
   # Running the tasks in parallel
   parallel::clusterApplyLB(cl=clus.rep, x=tasks, fun=fun)
   # Stopping the cluster
   parallel::stopCluster(cl=clus.rep)
}

# =================#
# Main #
# =================#
for (b in 1:3) {
   cat("Replicate:", b, "\n")
   foo(conf=conf, tasks=tasks, fun=mymean)
}


Comment 3 Sanjaya Gajurel 2016-12-07 09:23:32 MST
Hi Tim,

I replicated his job on both the SLURM and the PBS cluster. On the PBS cluster it runs as expected. However, to make it run on the SLURM cluster, I had to switch from the makeCluster() function to makeForkCluster().

We would appreciate it if you could explain this discrepancy between PBS and SLURM.

Thank you,

-Sanjaya
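
For reference, the workaround amounts to swapping the MPI backend for a fork-based one. A minimal sketch (the worker count and task sizes here are illustrative, not from the ticket; note that forked workers all live on the node running the master R process, so this uses at most one node's cores):

```
library("parallel")

mymean <- function(x) mean(cos(exp(sin(x))))
tasks <- rep(list(1:1e5), 8)

for (b in 1:3) {
    # makeForkCluster() forks workers from the current R process, so the
    # cluster can be created and torn down repeatedly within one allocation --
    # unlike an MPI cluster, which cannot re-spawn ranks once they are consumed.
    clus.rep <- parallel::makeForkCluster(nnodes = 4)
    res <- parallel::clusterApplyLB(cl = clus.rep, x = tasks, fun = mymean)
    parallel::stopCluster(cl = clus.rep)
    cat("Replicate:", b, "done\n")
}
```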

Comment 4 Tim Wickberg 2016-12-07 09:28:06 MST
(In reply to Sanjaya Gajurel from comment #3)
> Hi Tim,
> 
> I replicated his job in both SLURM and PBS cluster. In PBS Cluster, it is
> running as expected. However, to make it run in the SLURM cluster, I had to
> switch from makeCluster() function to makeForkCluster().
> 
> We would appreciate your help if you could explain us about this
> discrepancies between PBS and SLURM.

I've quickly looked at the script, and I have no idea how it's interacting with the Slurm resource request. Unfortunately, none of us has experience with the R parallel package, so we can't help with this. (This does not fall under our L3 support model for Slurm; there's no apparent problem with Slurm here.)

If you're able to translate this into the ways it's interacting with Slurm itself, I can provide some assistance there. But it sounds like you may have found a solution already.
Comment 5 Tim Wickberg 2016-12-09 12:08:12 MST
Marking resolved/infogiven as I believe you'd found a workaround. Please reopen if you have further questions on this issue.

- Tim
Comment 6 Sanjaya Gajurel 2016-12-09 12:14:53 MST
Hi Tim,

Yes, you can close the ticket. We are still investigating it. The good news is that makeForkCluster is working on both the Slurm and the PBS cluster.

I have not yet received a response on bug 3304
<https://bugs.schedmd.com/show_bug.cgi?id=3304> after I sent you the
slurm.conf file. I would appreciate your response.

Thanks,

-Sanjaya
