Ticket 3658

Summary: jobs do not run if shared and oversubscribe is enabled
Product: Slurm    Reporter: Robert Yelle <ryelle>
Component: Configuration    Assignee: Tim Wickberg <tim>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 3 - Medium Impact    
Priority: ---    
Version: 16.05.8   
Hardware: Linux   
OS: Linux   
Site: University of Oregon
Attachments: slurm.conf
ATT00001.htm

Description Robert Yelle 2017-04-03 12:27:27 MDT
Hello,

We need to have the ability to allow more than one job to use a node (e.g. multiple serial jobs).  I understand that the default behavior in Slurm is one job per node.  I tried changing the slurm.conf file and set shared=yes and oversubscribed=yes for each partition, but after doing this and restarting slurmctld, all jobs sit in the queue.  Is there another parameter to set to allow multiple jobs per node?

Thanks,

Rob Yelle
Univ of Oregon
Comment 1 Tim Wickberg 2017-04-03 12:34:50 MDT
Shared/Oversubscribed don't directly affect this, and I'd suggest returning those to the default settings.

What you're looking for is what Slurm calls "Consumable Resources", documented at https://slurm.schedmd.com/cons_res.html .

Briefly, the change is to start allocating individual CPUs (and possibly memory) within each node, rather than whole nodes. This does have some other ramifications, and I'd encourage you to test it out before making this adjustment.
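As a rough sketch of what that change looks like in slurm.conf (the partition and node names here are hypothetical placeholders, not from this site's config):

```
# Allocate individual cores and memory rather than whole nodes
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

# No OverSubscribe/Shared setting is needed for this; partitions can
# stay at their defaults
PartitionName=batch Nodes=n[001-050] Default=YES State=UP
```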

If you can attach your current slurm.conf file it'd help me know what settings to recommend as well.

- Tim
Comment 2 Robert Yelle 2017-04-03 16:46:12 MDT
Created attachment 4285 [details]
slurm.conf

Hi Tim,

Thank you for your response, and for the link to the cons_res page.  I made the following changes to slurm.conf:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

Then restarted slurmctld (and slurmd on compute nodes).  When a job is submitted, the job does not run, but I get this error message in /var/log/slurmctld:

error: cons_res: cr_job_test core_bitmap index error on node n050
sched: _slurm_rpc_allocate_resources JobID-6893 NodeList=(null) usec=547

I am also getting "bad core count" messages from some of the nodes after this change. It seems additional configuration is required to set this up properly when cons_res is used?

Yes, I would be very interested in the settings that you recommend for this - see attached slurm.conf file.

Thanks!

Rob
Comment 3 Robert Yelle 2017-04-03 16:46:13 MDT
Created attachment 4286 [details]
ATT00001.htm
Comment 4 Tim Wickberg 2017-04-03 16:53:54 MDT
You'll need to add definitions to the NodeName line to inform it of the Memory and CPU layout of the nodes.

As an example line from one of my configs:

NodeName=scruffy NodeAddr=scruffy Port=30101 Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=64000

The RealMemory line can be a bit less than the actual value (in MB) on the node - that'll effectively reserve a bit for the OS itself.
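As a quick illustration of that headroom calculation (the 2048 MB figure is just an example, not a recommendation from this thread):

```python
def suggested_real_memory(total_mb, headroom_mb=2048):
    """Suggest a RealMemory (MB) value for a slurm.conf NodeName line,
    leaving headroom_mb behind for the OS and system daemons."""
    return max(total_mb - headroom_mb, 1)

# A node with 64 GiB (65536 MB) of RAM:
print(suggested_real_memory(65536))  # 63488
```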

It looks like you're using Bright Cluster Manager? I'm not sure how to get it to fill that section in automatically; you may need to refer to their documentation on how best to update this.
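One way to gather the right values for that line (assuming slurmd is installed on the compute node) is slurmd's configuration-print mode:

```
# Run on each compute node; prints a NodeName=... line describing the
# CPU and memory layout slurmd detects, as a starting point for slurm.conf
slurmd -C
```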

- Tim
Comment 5 Robert Yelle 2017-04-03 16:57:46 MDT
Hi Tim,

Thanks!  I figured it was something like that.  I will make those changes and let you know what happens.

Yes, we are using Bright CM 7.3.  Per their recommendations I have frozen the slurm.conf file so that Bright won’t change it, so slurm.conf is totally in my control now.

Cheers,

Rob


Comment 6 Tim Wickberg 2017-04-18 20:32:16 MDT
I'm marking this resolved/infogiven; please re-open if there was anything further I could answer here.

- Tim
Comment 7 Robert Yelle 2017-04-19 08:49:45 MDT
Hi Tim,

Sorry, I meant to get back to you sooner on this.  Thank you for your assistance on this matter, your proposed solution solved our problem.  Go ahead and close the ticket.

Cheers,

Rob

