Ticket 1665 - GrpSubmitJobs and job arrays
Summary: GrpSubmitJobs and job arrays
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Limits
Version: 14.11.4
Hardware: Linux
Severity: 5 - Enhancement
Assignee: Unassigned Developer
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2015-05-12 08:48 MDT by Ryan Cox
Modified: 2017-12-11 10:29 MST
CC List: 0 users

See Also:
Site: BYU - Brigham Young University
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA Site: ---
NoveTech Sites: ---
Nvidia HWinf-CS Sites: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Tzag Elita Sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments

Description Ryan Cox 2015-05-12 08:48:22 MDT
I keep having to re-check how several limits interact with job arrays, so I want to both validate my understanding of them and suggest that these interactions be documented at http://slurm.schedmd.com/job_array.html.

MaxJobCount (slurm.conf) - each array task counts as a separate job (the manpage and job_array.html are very clear on this one).
GrpJobs/MaxJobs - no difference since each array task ends up as a separate job entry once it runs
GrpSubmitJobs/MaxSubmitJobs - each array task is considered as a separate job (unless my tests are bad).  Therefore a pending array of 1000 tasks counts as 1000 jobs.  If GrpSubmitJobs is set to 1000, no more jobs can be submitted.
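To make that last point concrete, here is a minimal sketch of the scenario (the account name "phys" and script name "work.sh" are made up; the commands only illustrate the counting behavior described above):

  $ sacctmgr modify account where name=phys set GrpSubmitJobs=1000   # group submit limit
  $ sbatch --array=0-999 work.sh   # one array, 1000 pending tasks -> the whole limit is consumed
  $ sbatch work.sh                 # further submissions from the group are rejected while the array is pending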

In testing GrpSubmitJobs (and I assume it's the same for MaxSubmitJobs), the behavior seems undesirable (some guy asked for it in bug 586).  If a user needs to submit 100k jobs, the best way for him to do that is a job array with 100k tasks, since that has the lowest overhead.  However, making GrpSubmitJobs large enough for the 100k-task array also allows him to submit 100k individual jobs (not good).

What I would like is to limit the user to, say, 1k jobs but allow much larger job arrays via MaxArraySize (and a big increase in MaxJobCount).  I can then incentivize good behavior (job arrays) without allowing high-overhead behavior (individual jobs).  Back when job arrays were just a convenience rather than a performance improvement (e.g. bug 586), it made sense for GrpSubmitJobs/MaxSubmitJobs to count each task individually; I don't think it makes sense anymore.  What do you think about changing the behavior of those limits so that each job array counts as one job?
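For what it's worth, a rough sketch of the setup I have in mind (values are only illustrative, and the account name is made up):

  # slurm.conf
  MaxJobCount=400000    # each array task still counts here, so this stays large
  MaxArraySize=100001   # allow array indices up to 100000

  # association limit via sacctmgr
  $ sacctmgr modify account where name=phys set GrpSubmitJobs=1000

With the change proposed above, the 1000 would cap the number of separate submissions (arrays or individual jobs), while a single array could still carry up to 100k tasks.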
Comment 1 Moe Jette 2015-05-12 08:57:29 MDT
I agree that we probably need better controls with job arrays, but there are some issues to consider. Here's a good example of a problem.

Say a user submits a job array with 1000 tasks - that's all one job record.
Then say the user changes the time limit on each odd-numbered task in the array, which is simple with a job array task expression. Suddenly we've got an extra 500 job records (each changed element in the job array spits out a new job record). We wouldn't want to perform the update request only partially. Should we just reject the entire request, or perhaps force the user to operate on the whole job array as a single entity (that would not create new job records, as I recall)?
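For illustration, the kind of update I mean would be something like this (job id and new limit are made up):

  $ scontrol update JobId=1234_[1,3,5] TimeLimit=30:00   # the named pending tasks get split into new job records

so a 1000-task array updated on every odd index would leave roughly 500 extra job records behind.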
Comment 2 Ryan Cox 2015-05-12 09:01:47 MDT
That's a very good example and I didn't realize that it's even an option.  Personally I would be okay denying that request but I imagine that someone is interested in allowing that behavior.  I'm not quite sure what to do about it but I can see how that job update would really mess things up.  My simple idea doesn't seem so simple anymore...
Comment 3 Moe Jette 2015-05-12 09:09:00 MDT
(In reply to Ryan Cox from comment #2)
> That's a very good example and I didn't realize that it's even an option. 
> Personally I would be okay denying that request but I imagine that someone
> is interested in allowing that behavior.  I'm not quite sure what to do
> about it but I can see how that job update would really mess things up.  My
> simple idea doesn't seem so simple anymore...

Partial job array updates are the only serious problem that I can think of offhand, and I suspect it's a rare event.
Comment 4 Ryan Cox 2015-05-12 09:15:05 MDT
I would be okay with making updates an all-or-nothing operation for the pending portion of an array.  Someone somewhere might depend on partial array updates but it doesn't make much sense to me.  The point of an array job is that it's homogeneous, right?  If partial updates are blocked, I assume that the entire pending portion of an array could still be updated even with some tasks running already?  I don't see why not since the running jobs would have separate job records at that point.
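To be specific, I mean a whole-array update, something like (job id and value made up):

  $ scontrol update JobId=1234 TimeLimit=30:00   # intended to apply to the entire pending portion of array 1234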
Comment 5 Moe Jette 2015-05-12 09:34:24 MDT
(In reply to Ryan Cox from comment #4)
> I would be okay with making updates an all-or-nothing operation for the
> pending portion of an array.  Someone somewhere might depend on partial
> array updates but it doesn't make much sense to me.  The point of an array
> job is that it's homogeneous, right?

You might think so, but that is definitely not the mode of operation at some sites. One site in particular manages each task in a job array very much independently and they work with really large job arrays.


> If partial updates are blocked, I
> assume that the entire pending portion of an array could still be updated
> even with some tasks running already? I don't see why not since the running
> jobs would have separate job records at that point.

That's not a problem.