Ticket 2963

Summary: "AccountNotAllowed" after making QOS changes
Product: Slurm
Reporter: Jeff White <jeff.white>
Component: Accounting
Assignee: Tim Wickberg <tim>
Status: RESOLVED INFOGIVEN
QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 15.08.12   
Hardware: Linux   
OS: Linux   
Site: Washington State University
Attachments: slurm.conf

Description Jeff White 2016-08-02 14:23:01 MDT
I have this partition:

$ scontrol show partition kamiak                                                             
PartitionName=kamiak
   AllowGroups=ALL AllowAccounts=investor,noninvestor AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=kamiak
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=cn[1-35,37-76]
   Priority=1000 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=REQUEUE
   State=UP TotalCPUs=1500 TotalNodes=75 SelectTypeParameters=N/A
   DefMemPerCPU=6441 MaxMemPerNode=UNLIMITED

It allows two accounts: investor,noninvestor.  My user is associated with an account:

$ sacctmgr show user jeff.white
      User   Def Acct     Admin 
---------- ---------- --------- 
jeff.white farnsworth      None

... and that account is a child of one of the ones which are allowed (investor):

   Cluster -------------Account  Partition                  QOS   Def QOS 
---------- -------------------- ---------- -------------------- --------- 
    kamiak root                                          normal           
    kamiak -root                                         normal           
    kamiak -all                                          normal           
    kamiak -investor                                     normal           
    kamiak --beckman                                    beckman           
    kamiak ---beckman                                   beckman           
    kamiak --cahnrs                                      cahnrs           
    kamiak ---cahnrs                                     cahnrs           
    kamiak --cas                                            cas           
    kamiak ---cas                                           cas           
    kamiak --catalysis                                catalysis           
    kamiak ---catalysis                               catalysis           
    kamiak --farnsworth                              farnsworth           
    kamiak ---farnsworth                             farnsworth           
    kamiak --katz                                          katz           
    kamiak ---katz                                         katz           
    kamiak --lofgren                                    lofgren           
    kamiak ---lofgren                                   lofgren           
    kamiak --popgenom                                  popgenom           
    kamiak ---popgenom                                 popgenom           
    kamiak --vcea                                          vcea           
    kamiak ---vcea                                         vcea           
    kamiak -noninvestor                             noninvestor           
    kamiak --noninvestor                            noninvestor           
    kamiak --admin                                        admin           
    kamiak ---admin                                       admin

My job gets the correct account assigned, but the job fails to run with "Reason=AccountNotAllowed".

# scontrol show job 90676
JobId=90676 JobName=run_burn.sh
   UserId=jeff.white(8003) GroupId=its_p_sto_qa_hpc_kamiak-its_staff(7000)
   Priority=1 Nice=0 Account=farnsworth QOS=farnsworth
   JobState=PENDING Reason=AccountNotAllowed Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=01:00:00 TimeMin=N/A
   SubmitTime=2016-08-02T13:16:40 EligibleTime=2016-08-02T13:16:40
   StartTime=Unknown EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=kamiak AllocNode:Sid=login-p1n02:48755
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=200,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=200M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/jeff.white/burn/run_burn.sh
   WorkDir=/home/jeff.white
   StdErr=/home/jeff.white/burn/90676_hostname.err
   StdIn=/dev/null
   StdOut=/home/jeff.white/burn/90676_hostname.out
   Power= SICP=0

Why does this not work?
Comment 1 Jeff White 2016-08-02 14:24:13 MDT
Created attachment 3376
slurm.conf
Comment 2 Tim Wickberg 2016-08-02 15:47:41 MDT
The user isn't considered to be a member of the parent account in the hierarchy for access control purposes. This is why you're being denied access. You'll note you're also not allowed to submit jobs under that account, only under the specific accounts you've been granted access to.
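
To illustrate the distinction, here is a minimal Python sketch. It is not Slurm's actual code; the function names and the PARENT mapping are hypothetical, and the mapping is only a reading of the association listing in the description. It contrasts a flat membership test against AllowAccounts (the behavior described above) with a walk up the account tree (the behavior the reporter expected).

```python
# Parent of each account, as read from the association listing in the
# description (an interpretation of its indentation, not Slurm output).
PARENT = {
    "investor": "root",
    "noninvestor": "root",
    "farnsworth": "investor",
    "beckman": "investor",
}

# AllowAccounts on partition kamiak, from "scontrol show partition kamiak".
ALLOW_ACCOUNTS = {"investor", "noninvestor"}


def allowed_exact(account, allow=ALLOW_ACCOUNTS):
    """Flat check: the job's account must itself appear in the list."""
    return account in allow


def allowed_by_ancestry(account, allow=ALLOW_ACCOUNTS, parent=PARENT):
    """Hypothetical hierarchical check: walk up the tree towards the root."""
    while account is not None:
        if account in allow:
            return True
        account = parent.get(account)  # None once we run past the root
    return False


print(allowed_exact("farnsworth"))        # False
print(allowed_by_ancestry("farnsworth"))  # True
```

Job 90676 runs under Account=farnsworth, so under the flat check it is rejected even though farnsworth sits beneath investor.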

I don't believe it has ever been possible to do access control through the account hierarchy the way you're attempting to. I can add an enhancement request to explore this with some additional configuration options, but any change would have to wait for 17.02 at the earliest, and may need a site to sponsor the work.

- Tim
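
Since only accounts named explicitly count for the partition check, one possible approach (an assumption on my part, not something proposed in this ticket) is to name the child accounts in AllowAccounts directly. The Python sketch below flattens a parent-to-children mapping into such a value. The CHILDREN mapping and the expand helper are hypothetical, and the mapping is only a reading of the association listing in the description, so it should be checked against the real accounting database before use.

```python
# Child accounts per parent, as read from the association listing in the
# description (an interpretation of its indentation; verify before use).
CHILDREN = {
    "investor": ["beckman", "cahnrs", "cas", "catalysis", "farnsworth",
                 "katz", "lofgren", "popgenom", "vcea"],
    "noninvestor": ["admin"],
}


def expand(accounts, children=CHILDREN):
    """Return the given accounts plus all their descendants, parents first."""
    result = []
    stack = list(reversed(accounts))
    while stack:
        acct = stack.pop()
        if acct in result:
            continue  # skip accounts already listed
        result.append(acct)
        stack.extend(reversed(children.get(acct, [])))
    return result


allow_line = "AllowAccounts=" + ",".join(expand(["investor", "noninvestor"]))
print(allow_line)
```

The printed value could then be set as the partition's AllowAccounts. It would need regenerating whenever accounts are added or removed.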