scrontab support has now been merged into master as of the following commit:

commit 882510ae2bfbbe0cd5813631abe063e4178ad537
Merge: 7b74b20b31 7fb52b2304
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Sun Oct 25 22:19:03 2020 -0600

    Merge branch 'cron'

There is initial documentation available in the man page. It does require setting a new option - ScronParameters=enable - to enable support.

It is expected that you will want to set up some specific cli_filter and/or job_submit routing to a dedicated queue for processing these. If you have some general feedback on this we'd be happy to add additional documentation.

As with all new Slurm features, I'm sure there will be some initial teething problems. We may be able to address some of that in the 20.11 release cycle through additional ScronParameters options depending on complexity.

Let me know if you run into any problems, otherwise I'm tagging this development project as complete.

- Tim
Hi Tim,

Thanks so much for this! I've been testing it a little on Gerty and so far it works. The only thing I've noticed is that I have to repeat these definitions for every line I add:

# min hour day-of-month month day-of-week command
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
*/5 * * * * bash -c '(date; echo ${SLURM_JOB_ID}) > /tmp/csamuel.test'
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
@hourly bash -c '(date; echo ${SLURM_JOB_ID}) > /global/homes/c/csamuel/scrontab.hourly
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
@daily bash -c '(date; echo ${SLURM_JOB_ID}) > /global/homes/c/csamuel/scrontab.daily

I was wondering if it was possible for it to remember the previous ones if none were specified (and forget everything if something is mentioned). Say:

# min hour day-of-month month day-of-week command
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
*/5 * * * * bash -c '(date; echo ${SLURM_JOB_ID}) > /tmp/csamuel.test'
@hourly bash -c '(date; echo ${SLURM_JOB_ID}) > /global/homes/c/csamuel/scrontab.hourly
15 */3 * * * bash -c 'last csamuel | Mail -s "Very bad IDS" csamuel@lbl.gov'

##
# Do some work
##
#SCRON -q gpu
#SCRON -A gofast
#SCRON --time 15:35:0
@daily ./my_gpu_code

Thoughts?

All the best,
Chris
Hi Tim,

Is there a way via cli_filter or the submit filter to identify these for policy application?

I can see that the submit filter does seem to be applied, as without those precursor lines I get the error:

There was an issue with the job submission on lines (null)
The error code return was: Unspecified error
The error message was: Unable to determine account name. Please resubmit your job specifying account with -A.
The failed lines are commented out with #BAD:
Do you want to retry the edit? (y/n)

Though the line doesn't get commented out with #BAD in this situation (it does if it fails to be parsed, say if I remove the @ from in front of @daily).

All the best,
Chris
Hi Tim,

Final thing for the night - I've seen this get reported:

csamuel@gert01:/global/gscratch1/sd/csamuel/slurm/git/src/scrontab> scrontab -e
scrontab: error: cronspec_to_bitstring: at format
scrontab: error: cronspec_to_bitstring: at format

This is for this crontab:

# min hour day-of-month month day-of-week command
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
*/5 * * * * bash -c '(date; echo ${SLURM_JOB_ID}) > /tmp/csamuel.test'
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
@hourly bash -c '(date; echo ${SLURM_JOB_ID}) >> /global/homes/c/csamuel/scrontab.hourly
#SCRON -q xfer
#SCRON -A nstaff
#SCRON --time 1
@daily bash -c '(date; echo ${SLURM_JOB_ID}) >> /global/homes/c/csamuel/scrontab.daily

Looks like it happens in cronspec_to_bitstring() in src/scrontab/parse.c:

        if (*pos == '@') {
                error("%s: at format", __func__);

I'm guessing this is some leftover debugging?

All the best,
Chris
> Final thing for the night - I've seen this get reported:
> scrontab: error: cronspec_to_bitstring: at format

Fixed (7703e8ae07), that was a stray debugging line.

> There was an issue with the job submission on lines (null)
> <snip>
> Though the line doesn't get commented out with #BAD in this situation (it
> does if it fails to be parsed, say if I remove the @ from in front of
> @daily).

The (null) there was the issue. Fixed (3270b537b9).
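For context, the `@` branch quoted from cronspec_to_bitstring() is presumably where cron nicknames such as @hourly and @daily are handled. The sketch below shows the conventional expansions from crontab(5) in Python. Whether scrontab expands them identically is an assumption, and NICKNAMES and split_entry are illustrative names, not Slurm code.

```python
# Conventional crontab(5) nicknames and their five-field equivalents.
# Assumption: scrontab follows the same convention.
NICKNAMES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def split_entry(line):
    """Split one cron line into (five_field_spec, command).

    Raises ValueError for an unknown nickname or for a line with
    fewer than five time fields plus a command.
    """
    line = line.strip()
    if line.startswith("@"):
        parts = line.split(None, 1)
        nickname = parts[0]
        command = parts[1] if len(parts) > 1 else ""
        if nickname not in NICKNAMES:
            raise ValueError("unknown nickname: " + nickname)
        return NICKNAMES[nickname], command
    fields = line.split(None, 5)
    if len(fields) < 6:
        raise ValueError("expected five time fields and a command")
    return " ".join(fields[:5]), fields[5]


if __name__ == "__main__":
    print(split_entry("@hourly ./my_job"))
    print(split_entry("15 */3 * * * ./my_job"))
```

In this model, dropping the leading @ from "@daily ./my_gpu_code" leaves too few fields and the line is rejected, which mirrors the parse failure mentioned above.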
This fixes some edge cases I found on review, please make sure further testing includes it as well:

commit fb82f284ec9995d8775562e0f0202062e8ada450
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Fri Oct 30 01:45:31 2020 -0600

    Only save updated crontab on successful submission.

    Also fixes an issue where a new crontab with no jobs or 'crontab -r'
    would not remove prior crontab jobs as the return path bailed out too
    soon.
(In reply to Chris Samuel (NERSC) from comment #2)
> Hi Tim,
>
> Thanks so much for this! I've been testing it a little on Gerty and so far
> it works. The only thing I've noticed is that I have to repeat these
> definitions for every line I add:

This is intentional, and is documented in the example and in the man page:

Example:

# Lines starting with #SCRON will be parsed for options to use
# with the next cron line. E.g., "#SCRON --time 1" would request
# a one minute timelimit be applied. See the sbatch man page for
# options, although note that not all options are supported here.

Man page:

"Options are always reset in between each crontab entry."

> I was wondering if it was possible for it to remember the previous ones if
> none were specified (and forget everything if something is mentioned).

I'd considered that, but I think that'd be just as confusing in some respects. In my envisioned use, these #SCRON lines are best avoided, and I'd expect some magic from cli_filter and/or job_submit to be doing the real heavy lifting. Thus not wanting to complicate the (already complicated) parser further.

I'm willing to reconsider this if you think that's going to be an issue, but IMO any solution is going to involve at least some opportunities for confusion.

(In reply to Chris Samuel (NERSC) from comment #3)
> Hi Tim,
>
> Is there a way via cli_filter or the submit filter to identify these for
> policy application?
>
> I can see that the submit filter does seem to be applied as without those
> precursor lines I get the error:

Job submit sees them all as individual submissions. And submissions stop being processed on the first error - then all the submitted jobs will be flushed out until a completely acceptable submission makes it in.

cli_filter is only setting up default options at the moment. I'd meant to enable it but have not yet.

As for identification - the cronspec field (or crontab_entry) is the best sign of these being different. Although there is no Lua representation of that field yet - I gather that'd be of interest?
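To make the reset rule concrete, here is a minimal Python sketch of the documented behaviour ("Options are always reset in between each crontab entry"). It is a toy model, not the parser in src/scrontab/parse.c. The function name parse_scrontab is hypothetical, and the handling of blank lines and ordinary comments (skipped without resetting anything) is an assumption.

```python
def parse_scrontab(text):
    """Pair each cron line with the #SCRON options directly preceding it.

    Returns a list of (cron_line, options) tuples. Options never carry
    over from one cron line to the next.
    """
    entries = []
    pending = []  # options collected since the last cron line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if line.startswith("#SCRON"):
            pending.extend(line[len("#SCRON"):].split())
        elif line.startswith("#"):
            continue  # assumption: ordinary comments are ignored
        else:
            entries.append((line, pending))
            pending = []  # reset after every cron line
    return entries


if __name__ == "__main__":
    sample = "\n".join([
        "#SCRON -q xfer",
        "#SCRON -A nstaff",
        "#SCRON --time 1",
        "*/5 * * * * ./first_job",
        "@hourly ./second_job",
    ])
    for cron_line, options in parse_scrontab(sample):
        print(cron_line, options)
```

Under this model the second entry is submitted with no options at all, which is why an account-less line reaches job_submit and gets rejected unless the #SCRON lines are repeated or a filter fills in defaults.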
Hi Tim,

Thanks for the fixes, I've just pulled them and will rebuild. Gerty is getting upgraded to the latest patchset (PS16) today so my ability for testing might be limited.

(In reply to Tim Wickberg from comment #7)
> (In reply to Chris Samuel (NERSC) from comment #2)
> > Hi Tim,
> >
> > Thanks so much for this! I've been testing it a little on Gerty and so far
> > it works. The only thing I've noticed is that I have to repeat these
> > definitions for every line I add:
>
> This is intentional, and is documented in the example and in the man page:

Yeah, I saw that, just wondering from a usability point of view, but if (as you say later) there is a way to pick up the fact that these are scron jobs then we can deal with them there. No biggy.

> cli_filter is only setting up default options at the moment. I'd meant to
> enable it but have not yet.

OK thanks, that would be handy.

> As for identification - the cronspec field (or crontab_entry) is the best
> sign of these being different. Although there is no Lua representation of
> that field yet - I gather that'd be of interest?

Most definitely! Thanks so much!

All the best,
Chris
Hi Tim, Those fixes look good, thanks for that. The only feature I've noticed from crontab that's missing is the ability to do "scrontab my_custom_things.cron" to read in a pre-prepared crontab from a file. I'll do some more testing once the PS16 work is done. All the best, Chris
> The only feature I've noticed from crontab that's missing is the ability to
> do "scrontab my_custom_things.cron" to read in a pre-prepared crontab from a
> file.

Unless you think there's a huge demand for that, I'd rather not implement it. In the same way that 'scrontab' defaults to editing when no options are given - rather than crontab's POSIX-required behavior of trying to read from stdin - I view that as a somewhat sharp-edged mode of operation that I'd rather not offer at this time.
Hi Tim, I'll check with the consultants, I'm not aware of how folks set theirs up so I'm not sure if there's any automated tooling that expects to be able to install a crontab non-interactively. All the best, Chris
> I'll check with the consultants, I'm not aware of how folks set theirs up so
> I'm not sure if there's any automated tooling that expects to be able to
> install a crontab non-interactively.

Sounds fine. And such non-interactive shenanigans are something I don't mind blocking... unless the scripts are very carefully written, they're likely to just blow away any existing scrontab content the user already had, and they wouldn't know to use #SCRON if necessary.
Hi Tim,

(In reply to Tim Wickberg from comment #7)
> Man page:

Is it possible to add the preamble that scrontab puts into a new entry to the man page as an example please? That might help stimulate interest in it if a new user can see the fact that it has a familiar feel to it.

Also maybe mention the ability to use "@yearly", "@annually", "@monthly", "@weekly", "@daily", "@midnight" and "@hourly"?

I noticed it looks like something odd is happening with the formatting. On SLES 15 this reads oddly, as if there is missing text:

---------------------------------------------------------------
Lines must be either comments starting with entries. Lines starting with following crontab entry. Options are always reset in between each crontab entry.Options include most of those available to the sbatch command; details are available in sbatch(1).
---------------------------------------------------------------

Looking at the nroff source it looks like there is a \# in both cases and the following text from that line is getting lost. I think that might be because in troff \ can start a comment in the source. A quick test shows that dropping the \ seems to be enough to fix that.

Also there's no space between "Options" and the preceding full stop; that looks more like it might just be a missing newline before "Options" in the nroff.

All the best,
Chris
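The diagnosis above is consistent with groff's documented behaviour: the \# escape makes the formatter ignore everything up to and including the next newline. Below is a small Python model of that rule. The source line in it is made up for illustration (it is not the real scrontab man page text), and strip_hash_comments is a hypothetical helper that models only this one escape.

```python
import re


def strip_hash_comments(source):
    r"""Drop each \# escape together with the rest of its line, newline included."""
    return re.sub(r"\\#[^\n]*\n?", "", source)


# Hypothetical man page source, chosen so the damage resembles the excerpt.
broken = "Lines must be either comments starting with \\#, or valid crontab\nentries.\n"
# The same source with the backslash dropped, as in the quick test described.
fixed = broken.replace("\\#", "#")

if __name__ == "__main__":
    print(strip_hash_comments(broken), end="")
    print(strip_hash_comments(fixed), end="")
```

With the backslash in place, the rest of the source line vanishes and the sentence runs straight into the next line. With a plain # the text survives untouched.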
We're working on tidying up the documentation, that should happen ahead of 20.11 proper.

To summarize what else is outstanding from my point of view:

- Expose something in job_submit.lua indicating the job is from scrontab.

- Hook up cli_filter inside the scrontab command.
(In reply to Tim Wickberg from comment #15)
> We're working on tidying up the documentation, that should happen ahead of
> 20.11 proper.

No worries, thanks for that!

> To summarize what else is outstanding from my point of view:
>
> - Expose something in job_submit.lua indicating the job is from scrontab.
>
> - Hook up cli_filter inside the scrontab command.

Sounds about right to me.

Is there a way to sbatch in a new cron job?

All the best,
Chris
> > To summarize what else is outstanding from my point of view:
> >
> > - Expose something in job_submit.lua indicating the job is from scrontab.
> >
> > - Hook up cli_filter inside the scrontab command.
>
> Sounds about right to me.
>
> Is there a way to sbatch in a new cron job?

Not at present, no. The main issue I ran into is there'd be no easy way to represent sbatch-cron-submissions in scrontab itself, so it'd be a bit confusing to manage these. (I save the raw crontab which is turned into each job record, but there's no way to translate a job record back to a set of #SCRON lines without needing to lay out every single possible option.)

If you see use from having some form of recurrence available through sbatch I'm not opposed to adding it in a future release, but I can't make the RPC changes that would be necessary to enable that in 20.11 at this point.
(In reply to Tim Wickberg from comment #17) > Not at present, no. No worries, this was just me being curious and thinking about what we might need to document. Much obliged! Chris
*** Ticket 10167 has been marked as a duplicate of this ticket. ***
Does scrontab check that a user is enabled (that is, their shell isn't /sbin/nologin or /bin/false) before running their jobs? Or is there some similar 'locked' feature in sacctmgr that would allow us to disable/enable users' cron jobs, to ensure we don't have any phantom jobs doing stuff after users are gone?
(In reply to Gordon Dexter from comment #24)
> Does scrontab check that a user is enabled (that is, their shell isn't
> /sbin/nologin or /bin/false) before running their jobs? Or is there some
> similar 'locked' feature in sacctmgr that would allow us to disable/enable
> users cron jobs to ensure we don't have any phantom jobs doing stuff after
> users are gone.

I think it's the same as any other Slurm job in this respect. For instance if we need to disable a user (usually because their jobs are disrupting Slurm or causing other system issues) we set them to only have access to a "batchdisable" QOS which has no ability to run jobs.

I'll try and find some time today to test that in our test system.

All the best,
Chris
(In reply to Chris Samuel (NERSC) from comment #11)
> Hi Tim,
>
> I'll check with the consultants, I'm not aware of how folks set theirs up so
> I'm not sure if there's any automated tooling that expects to be able to
> install a crontab non-interactively.
>
> All the best,
> Chris

We do a lot of user setup via automated scripts, and we try to make things so the user can hit the ground running, so it could be useful to have some way to modify an scrontab file programmatically. I agree that the default should be edit though.
Okay: job_submit/lua and cli_filter adjustments are in ahead of rc2 tomorrow.

Note this required a breaking RPC change to scrontab - the rc1 scrontab will not communicate with rc2 and the final release versions of slurmctld.

- Tim

commit 843b8dbc078c0faba12d51520d505673061ed1c8
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 18:25:01 2020 -0800

    job_submit/lua - expose a "cron_job" boolean field

    Bug 10056.

commit 59e4105b18186479aee4079fbb7f424779116a83
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 17:06:21 2020 -0800

    scrontab - add cli_filter hooks.

    Bug 10056.

commit 97e9abe22568a22ce92e3272c6cb3ab4abc283a6
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 17:42:47 2020 -0800

    Populate the jobids array in crontab_update_response_msg_t.

commit 9e3aa68615fd99e751d36a20a4cbadb4f995c6e3
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 17:35:43 2020 -0800

    Send array of jobids as part of crontab_update_response_msg_t.

    This is a breaking RPC change for scrontab.

commit 59b3ab570f37f95d3caeb9ec217fd5c50ffb07cf
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 16:07:06 2020 -0800

    Tweak CRON_JOB flag handling.

    Ensure flag is always set on scrontab-submitted jobs in job_submit.

commit 1306233ae9061dd1f22928f61e95efa6d698a9e5
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Wed Nov 11 15:48:40 2020 -0800

    scrontab - change line parsing error handling around.

    Which will make it simpler to add in cli_filter support.
Thanks Tim! I'd completely missed this update. Much appreciated!
Created attachment 17093: Proposed patch

While looking over the 20.11 RELEASE_NOTES, I tend to "git grep" and look over the commits for more information. Looking at the ScronParameters change, it appears the wrong structure member is being assigned in src/api/config_info.c:slurm_ctl_conf_2_key_pairs()?
Thanks Josh. Applied ahead of 20.11.1.

In the future can you please split bug fixes off into new tickets? We generally like to avoid reopening the original development tracking tickets.

thanks,
- Tim

commit f8e7df5027ca2a540cd8b6d699a6f0b666bd7143
Author:     Josh Samuelson <josh@1up.unl.edu>
AuthorDate: Wed Dec 9 23:13:51 2020 -0600

    Assign correct value variable for ScronParameters key.

    Otherwise 'scontrol show config' (and other API consumers) will display
    the wrong value.

    Bug 10056.