| Summary: | "AccountNotAllowed" after making QOS changes | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Jeff White <jeff.white> |
| Component: | Accounting | Assignee: | Tim Wickberg <tim> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 15.08.12 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Washington State University | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | Target Release: | --- | |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
| Attachments: | slurm.conf | ||
Created attachment 3376
slurm.conf
For access control purposes, the user isn't considered to be a member of the parent account in the hierarchy, which is why you're being denied access. You'll note you're also not allowed to submit jobs under that account, only under the specific accounts you've been granted access to.

I don't believe it has ever been possible to do access control through the account hierarchy the way you're attempting to. I can add an enhancement request to explore this with some additional configuration options, but any such change would need to wait for 17.02 at the earliest, and may need a site to sponsor the work.

- Tim
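To make the distinction concrete, here is a minimal Python sketch of the two behaviours: the flat list match described in the reply, and the hierarchy-aware match the reporter expected. It is an illustrative model only, not Slurm's actual code; the `PARENT` map and both function names are hypothetical, and the account names are a subset of those in this ticket.

```python
# Illustrative model only; this is NOT Slurm source code.
# PARENT maps each account to its parent in the association tree
# (a subset of the accounts shown in this ticket).
PARENT = {
    "investor": "root",
    "noninvestor": "root",
    "farnsworth": "investor",
    "admin": "noninvestor",
}

# Value of AllowAccounts on the kamiak partition.
ALLOW_ACCOUNTS = {"investor", "noninvestor"}


def allowed_flat(account, allow=ALLOW_ACCOUNTS):
    """Flat check: the job's own account must appear in the list."""
    return account in allow


def allowed_hierarchical(account, allow=ALLOW_ACCOUNTS, parent=PARENT):
    """Hypothetical check that also accepts descendants of listed accounts."""
    while account is not None:
        if account in allow:
            return True
        account = parent.get(account)  # walk up; None once past the root
    return False


print(allowed_flat("farnsworth"))          # False
print(allowed_hierarchical("farnsworth"))  # True
```

Under the flat model, a job submitted with `Account=farnsworth` is rejected even though `farnsworth` sits beneath `investor`, which matches the `AccountNotAllowed` reason reported in this ticket.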
I have this partition:

$ scontrol show partition kamiak
PartitionName=kamiak
   AllowGroups=ALL AllowAccounts=investor,noninvestor AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=kamiak
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=cn[1-35,37-76]
   Priority=1000 RootOnly=NO ReqResv=NO Shared=NO PreemptMode=REQUEUE
   State=UP TotalCPUs=1500 TotalNodes=75 SelectTypeParameters=N/A
   DefMemPerCPU=6441 MaxMemPerNode=UNLIMITED

It allows two accounts: investor,noninvestor. My user is associated with an account:

$ sacctmgr show user jeff.white
User Def Acct Admin
---------- ---------- ---------
jeff.white farnsworth None

... and that account is a child of one of the ones which are allowed (investor):

Cluster -------------Account Partition QOS Def QOS
---------- -------------------- ---------- -------------------- ---------
kamiak root normal
kamiak -root normal
kamiak -all normal
kamiak -investor normal
kamiak --beckman beckman
kamiak ---beckman beckman
kamiak --cahnrs cahnrs
kamiak ---cahnrs cahnrs
kamiak --cas cas
kamiak ---cas cas
kamiak --catalysis catalysis
kamiak ---catalysis catalysis
kamiak --farnsworth farnsworth
kamiak ---farnsworth farnsworth
kamiak --katz katz
kamiak ---katz katz
kamiak --lofgren lofgren
kamiak ---lofgren lofgren
kamiak --popgenom popgenom
kamiak ---popgenom popgenom
kamiak --vcea vcea
kamiak ---vcea vcea
kamiak -noninvestor noninvestor
kamiak --noninvestor noninvestor
kamiak --admin admin
kamiak ---admin admin

My job gets the correct account assigned but the job fails to run with "Reason=AccountNotAllowed".
# scontrol show job 90676
JobId=90676 JobName=run_burn.sh
   UserId=jeff.white(8003) GroupId=its_p_sto_qa_hpc_kamiak-its_staff(7000)
   Priority=1 Nice=0 Account=farnsworth QOS=farnsworth
   JobState=PENDING Reason=AccountNotAllowed Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=01:00:00 TimeMin=N/A
   SubmitTime=2016-08-02T13:16:40 EligibleTime=2016-08-02T13:16:40
   StartTime=Unknown EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=kamiak AllocNode:Sid=login-p1n02:48755
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=200,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=200M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/jeff.white/burn/run_burn.sh
   WorkDir=/home/jeff.white
   StdErr=/home/jeff.white/burn/90676_hostname.err
   StdIn=/dev/null
   StdOut=/home/jeff.white/burn/90676_hostname.out
   Power= SICP=0

Why does this not work?
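Given Tim's reply earlier in this ticket (the partition check matches the job's own account, not its ancestors), one possible workaround may be to list the child accounts explicitly in `AllowAccounts`. The sketch below builds such a comma-separated value from a parent-to-children map. The `CHILDREN` map, the `expand_accounts` helper, and the idea of generating the value with a script are assumptions for illustration; the result should be checked against the site's real association tree before it goes into slurm.conf.

```python
# Hypothetical helper; account names are taken from the association tree
# shown in this ticket, and the map itself is an assumption.
CHILDREN = {
    "investor": ["beckman", "cahnrs", "cas", "catalysis", "farnsworth",
                 "katz", "lofgren", "popgenom", "vcea"],
    "noninvestor": ["admin"],
}


def expand_accounts(parents, children=CHILDREN):
    """Return each parent followed by all of its descendants, in order."""
    result = []
    stack = list(reversed(parents))
    while stack:
        account = stack.pop()
        if account not in result:
            result.append(account)
            # Push children reversed so they pop in their listed order.
            stack.extend(reversed(children.get(account, [])))
    return result


allow_accounts = ",".join(expand_accounts(["investor", "noninvestor"]))
print(f"AllowAccounts={allow_accounts}")
```

The printed value lists `farnsworth` by name, so a job with `Account=farnsworth` would match the list directly instead of relying on its parent `investor`.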