Ticket 9787

Summary: s_p_parse_file error srun in salloc
Product: Slurm Reporter: Matt Mix <mattmix>
Component: User Commands    Assignee: Director of Support <support>
Status: RESOLVED DUPLICATE QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: cinek
Version: 20.02.2   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=9704
Site: MSI
Attachments: slur.conf

Description Matt Mix 2020-09-09 13:18:53 MDT
When inside a salloc session with SLURM_CONF set, running srun stats a socket (the /proc fd path that SLURM_CONF points at) and then fails to read the config.

mattmix@ln0004 ~>salloc
salloc: Granted job allocation 8114
mattmix@ln0004 ~>srun uptime
srun: s_p_parse_file: file "/proc/22989/fd/5" is empty
srun: error: ClusterName needs to be specified
srun: fatal: Unable to process configuration file
mattmix@ln0004 ~>ls -l /proc/22989/fd/5
lrwx------. 1 mattmix tech 64 Sep  9 14:14 /proc/22989/fd/5 -> socket:[8298873]
mattmix@ln0004 ~>echo $SLURM_CONF
/proc/22989/fd/5
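For reference, the failing condition above can be checked with a small shell helper; the function name `check_slurm_conf` is purely illustrative, a sketch rather than anything shipped with Slurm:

```shell
# Hypothetical helper: report what SLURM_CONF actually resolves to.
check_slurm_conf() {
  local conf="${1:-$SLURM_CONF}"
  if [ -z "$conf" ]; then
    echo "SLURM_CONF is not set"
    return 1
  fi
  # /proc/PID/fd/N entries are symlinks; resolve them first.
  local target
  target=$(readlink -f "$conf" 2>/dev/null || echo "$conf")
  if [ -S "$target" ]; then
    # The failing case from this report: the path is a socket, so
    # s_p_parse_file sees an "empty" file.
    echo "socket: $target"
  else
    echo "file: $target"
  fi
}

check_slurm_conf /dev/null
```

In the session above, this would report a socket for `/proc/22989/fd/5`, matching the `s_p_parse_file: file ... is empty` error.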
Comment 1 Matt Mix 2020-09-09 13:20:23 MDT
Created attachment 15818 [details]
slur.conf
Comment 2 Marcin Stolarek 2020-09-10 04:51:58 MDT
Matt,

You're very likely hitting the same issue as in Bug 9704.

I'll just quote my reply from there (Bug 9704 comment 3):
>I can reproduce the issue and I have a patch that I'm passing to our QA queue. 
>Let me know if you're interested in testing it before the review completion.
> 
>However, client tools should work in config-less mode with just a DNS entry 
>configured; you may consider running a slurmd daemon on the login node. This slurmd 
>doesn't have to be configured as a compute node in any partition, but will just 
>keep the file /run/slurm/conf/slurm.conf up to date (after scontrol 
>reconfigure). This setup results in a reduced number of RPCs sent to 
>slurmctld; without it, every execution of a client utility has to issue a 
>REQUEST_CONFIG RPC to download the configuration before doing its job.

Let me know if you want to try the patch before its QA is completed.

cheers,
Marcin
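The config-less workaround Marcin describes could look roughly like the sketch below (the hostname is a placeholder; `enable_configless`, `--conf-server`, and the cached-config path are the Slurm 20.02 config-less mechanisms he refers to — treat this as a config fragment, not a tested setup):

```shell
# In slurmctld's slurm.conf: enable config-less serving (Slurm >= 20.02).
#   SlurmctldParameters=enable_configless

# On the login node: run a slurmd that only fetches and caches the config;
# it need not appear in any partition. "ctld.example.com" is a placeholder.
slurmd --conf-server ctld.example.com:6817

# Client tools then read the locally cached copy, which slurmd keeps
# up to date after each "scontrol reconfigure":
export SLURM_CONF=/run/slurm/conf/slurm.conf
```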
Comment 3 Jason Booth 2020-09-10 09:19:02 MDT
Marking as a duplicate of bug #9704

*** This ticket has been marked as a duplicate of ticket 9704 ***