5650 – [INFORMATION] Looking for Slurm parallels

Summary: [INFORMATION] Looking for Slurm parallels

Status:	RESOLVED INFOGIVEN

Alias:	None

Product:	Slurm
Classification:	Unclassified
Component:	Configuration
Version:	17.11.7
Hardware:	Linux Linux

Severity:	4 - Minor Issue
Assignee:	Tim Wickberg

Reported:	2018-08-30 13:55 MDT by tyler.boswick
Modified:	2018-09-06 14:17 MDT
CC List:	0 users

Site:	NOAA
NOAA Site:	GFDL

Description tyler.boswick 2018-08-30 13:55:38 MDT

Hello, I am looking for information on things we currently do with our scheduler that have a Slurm parallel and, where there is none, what we can do to ease our transition.

1) We would like to limit the number of CPUs on nodes.  Within slurm.conf, are we able to specify CPUs = any number less than the actual CPU count (e.g., actual is 8, but we would like to specify 4), or are we forced to specify the actual CPU count?  If we must keep the actual number of CPUs in this parameter, what is the accepted way to limit the number of 'slots' on a node?

2) Is it possible within Slurm to hold all of a single user's jobs at once, or does this require a loop that goes through each of the user's jobs and holds them one by one, running constantly to ensure they do not run any work?

3) Is it possible within Slurm to requeue all of a single user's jobs at once, or does this require a loop that goes through each of the user's jobs and requeues them one by one?

4) What is the proper way within Slurm to 'pause' scheduling?  The expected behavior is to still accept new jobs but not allow any work to start.

5) Does Slurm have a concept of 'purgetime'?  That is, a way to list only completed jobs, limited to a recent window.
  Example: running command X shows all jobs completed in the past 12 hours, but anything older than 12 hours requires further digging with sacct.

Comment 1 Tim Wickberg 2018-09-06 14:17:19 MDT

Updating and closing the ticket out based on responses given during training.

Comment 2 Tim Wickberg 2018-09-06 14:17:36 MDT

Switching to resolved/infogiven.
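
The responses given during training are not recorded in this ticket. As a rough sketch only, and not a record of what was actually advised, questions 2 through 5 are commonly approached with the stock scontrol, squeue and sacct tools along the following lines. The helper function names are hypothetical, partition and user names are placeholders, and the commands are printed rather than executed so they can be reviewed first. Behavior should be checked against the man pages for the installed version (17.11.7 here).

```shell
#!/usr/bin/env bash
# Sketch under stated assumptions: helper names are hypothetical, and each
# *_cmd function only prints a Slurm command instead of running it.

# Join job IDs (one per line on stdin) into a comma-separated list,
# the form accepted by "scontrol hold" and "scontrol requeue".
join_ids() { paste -sd, -; }

# List one user's job IDs. squeue -h drops the header, -o %i prints only IDs.
# Requires a working Slurm installation; the result is empty if the user
# has no jobs, in which case the commands below should not be run.
user_job_ids() { squeue -h -u "$1" -o %i | join_ids; }

# Q2: hold a list of pending jobs in one call (no per-job loop).
hold_cmd() { echo "scontrol hold $1"; }

# Q3: requeue a list of batch jobs in one call.
requeue_cmd() { echo "scontrol requeue $1"; }

# Q4: a partition in state DOWN still queues new jobs but starts none;
# jobs that are already running continue. State=UP resumes scheduling.
pause_cmd()  { echo "scontrol update PartitionName=$1 State=DOWN"; }
resume_cmd() { echo "scontrol update PartitionName=$1 State=UP"; }

# Q5: completed jobs from the last N hours, read from accounting.
recent_completed_cmd() {
  echo "sacct --allusers --state=COMPLETED --starttime=now-${1}hours"
}
```

On a live cluster, the printed command for a given user could be produced with something like hold_cmd "$(user_job_ids someuser)", where someuser is a placeholder. A hold only affects jobs that already exist, so jobs submitted afterwards would need a separate limit or a repeated hold. For question 5, completed jobs also remain visible to squeue -t CD for MinJobAge seconds (a slurm.conf setting) before they are purged from the controller's memory.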