Ticket 9787 - s_p_parse_file error srun in salloc
Summary: s_p_parse_file error srun in salloc
Status: RESOLVED DUPLICATE of ticket 9704
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 20.02.2
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Director of Support
 
Reported: 2020-09-09 13:18 MDT by Matt Mix
Modified: 2020-09-10 09:19 MDT
Site: MSI


Attachments
slur.conf (2.13 KB, text/plain)
2020-09-09 13:20 MDT, Matt Mix

Description Matt Mix 2020-09-09 13:18:53 MDT
Inside a salloc session with SLURM_CONF set, srun stats a socket and then fails to read the config.

mattmix@ln0004 ~>salloc
salloc: Granted job allocation 8114
mattmix@ln0004 ~>srun uptime
srun: s_p_parse_file: file "/proc/22989/fd/5" is empty
srun: error: ClusterName needs to be specified
srun: fatal: Unable to process configuration file
mattmix@ln0004 ~>ls -l /proc/22989/fd/5
lrwx------. 1 mattmix tech 64 Sep  9 14:14 /proc/22989/fd/5 -> socket:[8298873]
mattmix@ln0004 ~>echo $SLURM_CONF
/proc/22989/fd/5
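
The transcript above shows SLURM_CONF pointing at a socket rather than a regular file. A minimal sketch for checking this by hand follows; classify_slurm_conf is a hypothetical helper written for illustration and is not part of Slurm. It relies only on the standard test operators -e, -S, -f and -s.

```shell
# Hypothetical helper (not a Slurm command): report what kind of object
# the given path refers to. Symlinks such as /proc/<pid>/fd/<n> are followed.
classify_slurm_conf() {
    conf="${1:-}"
    if [ -z "$conf" ]; then
        echo "unset"            # no path given
    elif [ ! -e "$conf" ]; then
        echo "missing"          # path does not exist
    elif [ -S "$conf" ]; then
        echo "socket"           # the case shown in this ticket
    elif [ -f "$conf" ] && [ -s "$conf" ]; then
        echo "regular-file"     # non-empty regular file
    elif [ -f "$conf" ]; then
        echo "empty-file"       # regular file with no content
    else
        echo "other"            # directory, device, FIFO, ...
    fi
}

# Inspect the value in the current environment, if any.
classify_slurm_conf "${SLURM_CONF:-}"
```

In the session above, this would be expected to report "socket" for /proc/22989/fd/5, since ls shows that descriptor resolving to socket:[8298873].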
Comment 1 Matt Mix 2020-09-09 13:20:23 MDT
Created attachment 15818
slur.conf
Comment 2 Marcin Stolarek 2020-09-10 04:51:58 MDT
Matt,

You're very likely hitting the same issue as in Bug 9704.

I'll just quote my reply from there (Bug 9704 comment 3):
>I can reproduce the issue and I have a patch that I'm passing to our QA queue.
>Let me know if you're interested in testing it before the review is completed.
>
>Although client tools should work in config-less mode with just a DNS entry
>configured, you may consider running a slurmd daemon on the login node. This
>slurmd doesn't have to be configured as a compute node in any partition; it will
>just keep the files in /run/slurm/conf/slurm.conf up to date (after scontrol
>reconfigure). This setup reduces the number of RPCs sent to slurmctld. Without
>it, every execution of a client utility has to issue a REQUEST_CONFIG RPC to
>download the configuration before doing its job.
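
If a slurmd is run on the login node as suggested in the quoted reply, one way to confirm the cache is being maintained is to check for a non-empty file at the path mentioned there. This is only a sketch: cached_conf_state is a hypothetical helper, and the default path is taken from the quoted text, not verified against any particular installation.

```shell
# Hypothetical helper: report whether a cached configuration file is present.
# The default path comes from the quoted reply; pass another path to override.
cached_conf_state() {
    path="${1:-/run/slurm/conf/slurm.conf}"
    if [ -f "$path" ] && [ -s "$path" ]; then
        echo "cached"        # non-empty regular file found
    else
        echo "not-cached"    # absent, empty, or not a regular file
    fi
}

cached_conf_state
```

A "not-cached" result on a login node would suggest that no local slurmd has written the file yet, in which case client commands would still need to fetch the configuration themselves.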

Let me know if you want to try the patch before its QA is completed.

cheers,
Marcin
Comment 3 Jason Booth 2020-09-10 09:19:02 MDT
Marking as a duplicate of bug #9704

*** This ticket has been marked as a duplicate of ticket 9704 ***