| Summary: | Can not release jobs | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Stuart Midgley <stuartm> |
| Component: | Other | Assignee: | David Bigagli <david> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | da |
| Version: | 14.03.0 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | DownUnder GeoSolutions | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | attachment-8337-0.html | ||
URGH... I'm an idiot... that's what occurs when you're doing sysadmin stuff at 1am.
scontrol release <jobid>_<taskid>
does the trick.
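For anyone with many held array tasks, the working form above can be generated from a squeue listing. This is a minimal sketch, not a tested site procedure: `release_cmds` is a hypothetical helper name, it assumes the default squeue column layout shown in the attachment (job ID in column 1, state in column 5), and it only prints the commands so they can be reviewed before running. Rows whose ID was truncated by the column width (such as `623146_[1030-1039`) are skipped rather than guessed at.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one "scontrol release" command
# per array task in SE state. Reads squeue-style output on stdin.
# Assumes column 1 is <jobid>_<taskid> and column 5 is the job state.
release_cmds() {
    awk '$5 == "SE" && $1 ~ /^[0-9]+_[0-9]+$/ { print "scontrol release " $1 }'
}

# Example input: the header plus two rows copied from the squeue listing
# in this report. The second row has a truncated ID and is skipped.
printf '%s\n' \
    'JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)' \
    '566710_211 teambm a_harFM_ yanaz SE 0:00 1 (JobHeldUser)' \
    '623146_[1030-1039 teambm dp_5_35 bjornm SE 0:00 1 (JobHeldUser)' |
    release_cmds
```

On a live system the same function could be fed from `squeue -t SE`, with the printed commands checked by hand before being run as root.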
There is no stupid question. :-) The input format among commands is indeed not consistent.

On April 13, 2014 10:50:12 AM PDT, bugs@schedmd.com wrote:
>http://bugs.schedmd.com/show_bug.cgi?id=695
>
>Stuart Midgley <stuartm@dugeo.com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>             Status|UNCONFIRMED                 |RESOLVED
>         Resolution|---                         |INFOGIVEN
>
>--- Comment #1 from Stuart Midgley <stuartm@dugeo.com> ---
>URGH... I'm an idiot... that's what occurs when your doing sysadmin stuff at 1am.
>
>    scontrol release <jobid>_<taskid>
>
>does the trick.
Created attachment 745 [details]
attachment-8337-0.html

20140414012355 bud30:~> squeue -t SE
JOBID             PARTITION NAME     USER     ST TIME NODES NODELIST(REASON)
566710_211        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
566711_216        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
566712_233        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
566714_266        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
566715_270        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
566716_277        teambm    a_harFM_ yanaz    SE 0:00 1     (JobHeldUser)
567248_1          teambm    a_600_Fi michaeld SE 0:00 1     (JobHeldUser)
567313_2          teambm    a_600_Fi michaeld SE 0:00 1     (JobHeldUser)
622761_1000       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622772_1010       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622783_1020       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622794_1030       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622805_1040       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622816_1050       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622827_1060       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622838_1070       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622849_1080       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622860_1090       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622871_1100       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
622882_1110       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
623025_1240       teambm    dc_5_35  bjornm   SE 0:00 1     (JobHeldUser)
623080_1008       teambm    dp_5_35  bjornm   SE 0:00 1     (JobHeldUser)
623146_[1030-1039 teambm    dp_5_35  bjornm   SE 0:00 1     (JobHeldUser)
623168_[1040-1049 teambm    dp_5_35  bjornm   SE 0:00 1     (JobHeldUser)
623190_[1050-1059 teambm    dp_5_35  bjornm   SE 0:00 1     (JobHeldUser)
567249_107        teamswanM m_600_Fi michaeld SE 0:00 1     (JobHeldUser)

20140414012404 bud30:~> scontrol show jobid=567249_107
JobId=567291 ArrayJobId=567249 ArrayTaskId=107 Name=m_600_FinalMig_2_EvenIL_OddCL
   UserId=michaeld(1260) GroupId=teambm(2102)
   Priority=0 Nice=1014 Account=(null) QOS=normal
   JobState=SPECIAL_EXIT Reason=JobHeldUser Dependency=(null)
   Requeue=1 Restarts=1 BatchFlag=1 ExitCode=100:0
   RunTime=00:00:00 TimeLimit=01:00:00 TimeMin=N/A
   SubmitTime=2014-04-13T22:43:11 EligibleTime=2014-04-13T22:43:12
   StartTime=Unknown EndTime=2014-04-14T01:01:48
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=teamswanMig AllocNode:Sid=bud30:8489
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=clus274
   BatchHost=clus274
   NumNodes=1 NumCPUs=32 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=0
   MinCPUsNode=1 MinMemoryNode=15661M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=0 Contiguous=0 Licenses=(null) Network=(null)
   Command=/p3/cue/zeebriesPr_002/imaging/600_FinalMig/040migOut/jobs/rj.m_600_FinalMig_2_EvenIL_OddCL.UGutjW
   WorkDir=/p3/cue/zeebriesPr_002/imaging/600_FinalMig/040migOut/jobs
   Comment=sge job id 2057393
   StdErr=/p3/cue/zeebriesPr_002/imaging/600_FinalMig/040migOut/jobs/logs/m_600_FinalMig_2_EvenIL_OddCL/michaeld/m_600_FinalMig_2_EvenIL_
   StdIn=/dev/null
   StdOut=/p3/cue/zeebriesPr_002/imaging/600_FinalMig/040migOut/jobs/logs/m_600_FinalMig_2_EvenIL_OddCL/michaeld/m_600_FinalMig_2_EvenIL_

20140414012445 bud30:~> sudo -s
[sudo] password for stuartm:
140414012457 bud30:stuartm# export PATH=/d/sw/slurm/latest/sbin:/d/sw/slurm/latest/bin:$PATH
140414012510 bud30:stuartm# scontrol release job=567249_107
Invalid job id specified (job=567249_107)
slurm_suspend error: No error
140414012529 bud30:stuartm# scontrol release job=567291_107
Invalid job id specified (job=567291_107)
slurm_suspend error: No error