I keep having to remember how several limits interact with job arrays and want to both validate my understanding of them and suggest that they be added to http://slurm.schedmd.com/job_array.html.

MaxJobCount (slurm.conf) - each array task counts as a separate job (the manpage and job_array.html are very clear on this one).

GrpJobs/MaxJobs - no difference, since each array task ends up as a separate job entry once it runs.

GrpSubmitJobs/MaxSubmitJobs - each array task is counted as a separate job (unless my tests are bad). A pending array of 1000 tasks therefore counts as 1000 jobs, and if GrpSubmitJobs is set to 1000, no more jobs can be submitted.

In testing GrpSubmitJobs (and I assume MaxSubmitJobs behaves the same), the behavior seems undesirable, even though it was requested in bug 586. If a user needs to submit 100k jobs, the best way to do that is a single job array of size 100k, since that has the lowest overhead. However, making GrpSubmitJobs large enough for a 100k-task job array also allows that user to submit 100k individual jobs (not good). What I would like is to limit the user to, say, 1k job submissions while allowing much larger job arrays via MaxArraySize (plus a big increase in MaxJobCount); see the sketch at the end of this comment. I could then incentivize good behavior (job arrays) without allowing high-overhead behavior (individual jobs).

Back when job arrays were just a convenience rather than a performance improvement (e.g. bug 586), it made sense for GrpSubmitJobs/MaxSubmitJobs to count each task individually. I don't think it makes sense anymore. What do you think about changing the behavior of those limits so that each job array counts as one job?
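For concreteness, here is roughly the setup I have in mind. The values and the user name are made up for illustration; the point is that, with the current per-task counting, any GrpSubmitJobs/MaxSubmitJobs value small enough to cap individual submissions also blocks the large array I want to allow:

  # slurm.conf (illustrative values)
  MaxJobCount=400000     # room for array tasks that get split into their own job records
  MaxArraySize=100001    # allow array task IDs 0-100000

  # per-user submit cap (hypothetical user)
  sacctmgr modify user where name=someuser set MaxSubmitJobs=1000

  # with the current counting this submission is rejected unless the
  # submit limit is >= 100000, which also permits 100k individual jobs
  sbatch --array=0-99999 job.sh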
I agree that we probably need better controls with job arrays, but there are some issues to consider. Here's a good example of a problem. Say a user submits a job array with 1000 tasks - that's all one job record. Then say the user changes the time limit on each odd-numbered task in the array, which is simple to do with a task ID expression. Suddenly we've got an extra 500 job records, since each changed element in the job array gets split out into its own job record. We wouldn't want to perform the update request only partially. Should we just reject the entire request, or perhaps force the user to operate on the whole job array as a single entity (which, as I recall, would not create new job records)?
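Concretely, the kind of request I have in mind would look something like this (the job ID and task IDs are made up; the explicit task list could just as well be an expression covering all 500 odd task IDs):

  # 1234 is a pending 1000-task array held in a single job record.
  # Each task named here gets split out into its own job record
  # so that it can carry the new time limit.
  scontrol update JobId=1234_[1,3,5] TimeLimit=30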
That's a very good example and I didn't realize that it's even an option. Personally I would be okay denying that request but I imagine that someone is interested in allowing that behavior. I'm not quite sure what to do about it but I can see how that job update would really mess things up. My simple idea doesn't seem so simple anymore...
(In reply to Ryan Cox from comment #2)
> That's a very good example and I didn't realize that it's even an option.
> Personally I would be okay denying that request but I imagine that someone
> is interested in allowing that behavior. I'm not quite sure what to do
> about it but I can see how that job update would really mess things up. My
> simple idea doesn't seem so simple anymore...

Partial job array updates are the only serious problem that I can think of offhand, and I suspect it's a rare event.
I would be okay with making updates an all-or-nothing operation for the pending portion of an array. Someone somewhere might depend on partial array updates but it doesn't make much sense to me. The point of an array job is that it's homogeneous, right? If partial updates are blocked, I assume that the entire pending portion of an array could still be updated even with some tasks running already? I don't see why not since the running jobs would have separate job records at that point.
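If I understand the mechanics correctly, the all-or-nothing form would just use the plain array job ID, which modifies the single pending job record rather than splitting tasks out (job ID illustrative):

  # applies to every still-pending task of array job 1234
  scontrol update JobId=1234 TimeLimit=30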
(In reply to Ryan Cox from comment #4)
> I would be okay with making updates an all-or-nothing operation for the
> pending portion of an array. Someone somewhere might depend on partial
> array updates but it doesn't make much sense to me. The point of an array
> job is that it's homogeneous, right?

You might think so, but that is definitely not the mode of operation at some sites. One site in particular manages each task in a job array very much independently and they work with really large job arrays.

> If partial updates are blocked, I
> assume that the entire pending portion of an array could still be updated
> even with some tasks running already? I don't see why not since the running
> jobs would have separate job records at that point.

That's not a problem.