| Summary: | Proper use of "sbatch --bb=" command syntax | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | David Paul <dpaul> |
| Component: | Burst Buffers | Assignee: | Tim Wickberg <tim> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | dpaul, tim |
| Version: | 15.08.7 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | NERSC | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | | Version Fixed: | 17.02-pre1 |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
|
Description
David Paul
2016-02-02 03:02:05 MST
You still need to provide a batch script - if they hit enter with the command line you gave, sbatch expects the script file to be provided on stdin (and terminated with Ctrl-D). I'm guessing they got a blank line back on the terminal (which was sbatch listening on stdin), then hit Ctrl-C, which cancelled the request - so no request would have been sent to slurmctld.

If they don't want to create an empty job script to give as an argument, you can use --wrap "" as an argument like so:

sbatch --bb="create_persistent name=dpaul50T capacity=50TB access=striped type=scratch" --wrap ""

Did you have any further questions on this, or can I go ahead and mark this as resolved?

- Tim

Marking as resolved/infogiven. Sorry for the delay replying.
The command does not create the persistent reservation.
[dpaul@cori03]==> sbatch --bb="create_persistent name=dpaul200GB capacity=200GB access=striped type=scratch" --wrap ""
Submitted batch job 1132507
[dpaul@cori03]==> squeue -l -u dpaul
Fri Feb 12 10:56:35 2016
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
1132507 debug wrap dpaul RUNNING 0:01 10:00 1 nid00092
[dpaul@cori03]==> scontrol show job 1132507
JobId=1132507 JobName=wrap
UserId=dpaul(15448) GroupId=dpaul(1015448)
Priority=28929 Nice=0 Account=mpccc QOS=premium
JobState=COMPLETED Reason=None Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:04 TimeLimit=00:10:00 TimeMin=N/A
SubmitTime=2016-02-12T10:55:00 EligibleTime=2016-02-12T10:55:00
StartTime=2016-02-12T10:56:34 EndTime=2016-02-12T10:56:38
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=cori03:33651
ReqNodeList=(null) ExcNodeList=(null)
NodeList=nid00092
BatchHost=nid00092
NumNodes=1 NumCPUs=64 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=64,mem=124928,node=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=122G MinTmpDiskNode=0
Features=(null) Gres=craynetwork:1 Reservation=(null)
Shared=0 Contiguous=0 Licenses=(null) Network=(null)
Command=(null)
WorkDir=/global/u1/d/dpaul
StdErr=/global/u1/d/dpaul/slurm-1132507.out
StdIn=/dev/null
StdOut=/global/u1/d/dpaul/slurm-1132507.out
Power= SICP=0
Looks like my understanding of the --bb option was incomplete, sorry about that. The option parser is handling the line given in --bb="$FOO" as if it were a line from an sbatch file, and is looking for the #BB or #DW characters at the start. So, this should work:

sbatch --bb="#BB create_persistent name=dpaul200GB capacity=200GB access=striped type=scratch" --wrap ""

The --wrap "" doesn't impact anything - if you'd submitted a script you still wouldn't have gotten the persistent buffer.

Slurm should warn about the invalid --bb argument - the option parser currently ignores any line not starting with a # but doesn't return an error, which is a bug.

There appear to be some other quirks in how the --bb argument works compared to placing directives in the job script; I'm looking into this further.

Actually, on further review the "sbatch --bb" option is ignored completely. With a leading # or not, the argument is thrown away before we parse it.

salloc and srun do support --bb with the #BB format as described. At the moment, this would create your buffer as intended:

srun --bb="#BB create_persistent name=dpaul200GB capacity=200GB access=striped type=scratch" date

I'm looking into a fix for sbatch now.

David,

Tim and I exchanged a few ideas about this some time ago. The --bb/--bbf options should work fine for salloc and srun. The sbatch command is more complex as the user can specify conflicting options in the job script and on the command line (via --bb/--bbf options).

Here are a couple of ideas:

1. Disable --bb/--bbf options for the sbatch command and force users to specify options in the script
2. Try to merge command line options with those in the script, which seems fraught with peril and provides little real benefit.

Comments?

(In reply to Moe Jette from comment #12)
> David,
>
> Tim and I exchanged a few ideas about this some time ago. The --bb/--bbf
> options should work fine for salloc and srun. The sbatch command is more
> complex as the user can specify conflicting options in the job script and on
> the command line (via --bb/--bbf options).
>
> Here are a couple of ideas:
> 1. Disable --bb/--bbf options for the sbatch command and force users to
> specify options in the script
> 2. Try to merge command line options with those in the script, which seems
> fraught with peril and provides little real benefit.

This is what I've done:

1. In version 16.05, documented that the --bb option can NOT be used to create or destroy persistent burst buffers. I've also added logic to return an error if someone tries to create or destroy a persistent burst buffer using the --bb option, so that it is clearer what is happening. Note that the --bbf option works for salloc, srun and sbatch to create or destroy persistent burst buffers.
2. In version 17.02, removed the sbatch --bb option, which does not work in any version as far as I can tell.
3. In version 17.02, added the sbatch --bbf option, which will merge the file specified with the --bbf option into the user's script.

I believe this is probably the best way to address the problem you have reported here. Marking this as closed.

Moe has outlined the approach to handling this in comment 13.
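For readers landing on this ticket, the script-based route discussed above (placing the directive in the job script rather than passing it through sbatch --bb) might look like the following minimal sketch. The file name create_bb.sh is a hypothetical choice, the buffer parameters are copied from the report, and it assumes a system where the #BB create_persistent directive is accepted in batch scripts; it has not been run against the site in question.

```shell
#!/bin/sh
# Sketch only: write a minimal batch script that carries the #BB directive.
# create_bb.sh is a hypothetical file name; the directive text is taken
# from the commands quoted in this ticket.
cat > create_bb.sh <<'EOF'
#!/bin/bash
#BB create_persistent name=dpaul200GB capacity=200GB access=striped type=scratch
# No job steps are needed here; the directive above is the whole request.
exit 0
EOF

# Submit only where Slurm is actually installed; otherwise just leave the file.
if command -v sbatch >/dev/null 2>&1; then
    sbatch create_bb.sh
else
    echo "sbatch not found; wrote create_bb.sh only"
fi
```

Keeping the directive in the script sidesteps the question of how command-line burst buffer options would be merged with script directives, which is the conflict raised in the comments above.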