Ticket 725

Summary: scontrol function on job name
Product: Slurm
Reporter: Stuart Midgley <stuartm>
Component: Other
Assignee: Nathan Yee <nyee32>
Status: RESOLVED FIXED
QA Contact:
Severity: 5 - Enhancement
Priority: ---
CC: da, dylanj
Version: 14.03.0
Hardware: Linux
OS: Linux
Site: DownUnder GeoSolutions
Version Fixed: 14.11.0-pre5

Description Stuart Midgley 2014-04-16 14:40:54 MDT
Morning

Users on our system submit clusters of jobs (i.e. 200+ job arrays), all with the same queue parameters etc.  They typically refer to them by job name rather than job ID, so their queries span multiple job arrays.

scontrol doesn't allow querying based on name.

Users are now doing (as an example)

    squeue -u peterg -n dp_Scale -h -o "%F" | xargs -i -- scontrol update jobid={} queue=teambm,teamswanIdle

which, while not too arduous, is very slow and requires more knowledge of Unix than most of our users have.  Being able to do

   scontrol update name=dp_Scale queue=teambm,teamswanIdle

would be more natural.
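
Until something like that exists, the pipeline above could be hidden behind a small shell function so users only supply a user name, a job name and the settings to change. This is a minimal sketch, not part of Slurm: `update_jobs_by_name` is a hypothetical helper name, and the `key=value` settings are passed to `scontrol update` untouched. It still issues one `scontrol` call per array, so it hides the complexity without fixing the slowness.

```shell
# Hypothetical helper (not a Slurm command): apply one "scontrol update"
# to every job of a given user that carries a given job name.
# Usage: update_jobs_by_name <user> <jobname> <key=value>...
update_jobs_by_name() {
    user=$1
    name=$2
    shift 2
    # %F prints the base job ID of a job array, so several lines may repeat
    # the same ID; sort -u collapses them before updating.
    squeue -u "$user" -n "$name" -h -o "%F" | sort -u |
    while read -r jobid; do
        # Redirect stdin so the command cannot consume the list of job IDs.
        scontrol update jobid="$jobid" "$@" </dev/null
    done
}
```

For the example above this would be called as `update_jobs_by_name peterg dp_Scale queue=teambm,teamswanIdle`.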
Comment 1 Stuart Midgley 2014-04-16 14:52:28 MDT
scontrol release <jobname> is also another common task (for lots of jobs in SE state)
Comment 2 Moe Jette 2014-04-25 09:53:22 MDT
(In reply to Stuart Midgley from comment #1)
> scontrol release <jobname> is also another common task (for lots of jobs in
> SE state)

I've addressed this for the hold, release and uhold commands:

https://github.com/SchedMD/slurm/commit/6eaeb85c390363b0835468b7245fe902f642926e
Comment 3 Moe Jette 2014-05-12 07:00:40 MDT
Reassigning to Nathan
Comment 4 Stuart Midgley 2014-07-09 13:32:36 MDT
Morning

Any idea when we might be able to use jobnames for other scontrol job commands?

i.e. scontrol update job=<jobname> queue=....

would be good.
Comment 5 Moe Jette 2014-07-10 04:08:16 MDT
(In reply to Stuart Midgley from comment #4)
> Morning
> 
> Any idea when we might be able to use jobnames for other scontrol job commands?
> 
> i.e. scontrol update job=<jobname> queue=....
> 
> would be good.

My best guess is a month from now.
Comment 6 David Bigagli 2014-07-10 04:43:18 MDT
What if you have multiple jobs with the same name?

David
Comment 7 Moe Jette 2014-07-10 04:59:57 MDT
(In reply to David Bigagli from comment #6)
> What if you have multiple jobs with the same name?
> 
> David

Related to that, what if different users have jobs with the same name?
Would the command only affect that user's jobs?
What if the user is root?
Comment 8 Dylan 2014-07-10 13:14:36 MDT
If there are multiple jobs with the same name, change every one that the user has permission to change (and the same applies to root).

For example, here is a typical workflow.

teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Resources)     4193096_[8-64]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193160_1
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193161_[65-128]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193225_2
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193226_[129-192]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193290_3
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193291_[193-256]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193355_4
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193356_[257-320]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193420_5
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193421_[321-384]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193485_6
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193486_[385-448]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193550_7
teambmMig,i 200        m_testlines_il7436_c   peterg PD       0:00      1 (Priority)      4193551_[449-512]
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193615_8
teambm      200        a_testlines_il7436_c   peterg PD       0:00      1 (Dependency)    4193616_9


Job 4193160_1 depends on 4193096_[8-64], job 4193225_2 depends on 4193161_[65-128], and so on, until 4193616_9, which depends on 4193615_8, 4193615_7, 4193615_6, etc.

If the user wants to change something about this job workflow, currently they have to address each part individually. So for us at least, each workflow has a unique name and it would be useful to be able to scontrol the entire workflow by name.
Comment 9 Moe Jette 2014-09-04 10:49:26 MDT
Fixed in version 14.11:

https://github.com/SchedMD/slurm/commit/efa9cc3abcb71d68d0ec14a791c80728b2a594ff
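
For anyone reading this later: as I understand the scontrol man page in releases that include this change, `scontrol update` accepts `JobName=` to select the jobs to modify and `UserID=` to narrow the match to one user's jobs. The wrapper below is only a sketch on that assumption. `update_workflow` is a hypothetical name, and the `key=value` settings are passed through unchanged, so please check the keywords against the man page of your installed version.

```shell
# Hypothetical wrapper around the name-based update (assumes the JobName=
# and UserID= keywords described in the scontrol man page are available).
# Usage: update_workflow <user> <jobname> <key=value>...
update_workflow() {
    user=$1
    name=$2
    shift 2
    # userid restricts the match to one user's jobs; without it a privileged
    # caller would modify every job that carries this name.
    scontrol update jobname="$name" userid="$user" "$@"
}
```

With the workflow from comment 8, a call would look like `update_workflow peterg m_testlines_il7436_c <key=value>`, replacing the per-array loop with a single request.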