| Summary: | slurmctld too many open files | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Matt Ezell <ezellma> |
| Component: | slurmctld | Assignee: | Nate Rini <nate> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | alex, brian.gilmer, nate, vergaravg |
| Version: | 21.08.5 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=12804 | ||
| Site: | ORNL-OLCF | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA Site: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | | Version Fixed: | 21.08.6, 22.05pre1 |
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
Description
Matt Ezell
2022-01-22 09:53:00 MST
Comment 1
Nate Rini

Please also provide /proc/../status for slurmctld.

Comment 2
Nate Rini

(In reply to Nate Rini from comment #1)
> Please also provide /proc/../status for slurmctld.

If you prefer copy and paste:
> cat /proc/$(pgrep slurmctld)/status

Comment 3
Matt Ezell

(In reply to Nate Rini from comment #1)
> Please also provide /proc/../status for slurmctld.

We just restarted the controller (not the compute nodes) and see this message.

[root@slurm1.frontier ~]# cat /proc/$(pgrep slurmctld)/status
Name:	slurmctld
Umask:	0022
State:	S (sleeping)
Tgid:	16354
Ngid:	0
Pid:	16354
PPid:	1
TracerPid:	0
Uid:	6826	6826	6826	6826
Gid:	9526	9526	9526	9526
FDSize:	4096
Groups:	2046 2075 2324 9526 22738 24121 27480 27493
NStgid:	16354
NSpid:	16354
NSpgid:	16354
NSsid:	16354
VmPeak:	29258768 kB
VmSize:	17084096 kB
VmLck:	0 kB
VmPin:	0 kB
VmHWM:	296176 kB
VmRSS:	231660 kB
RssAnon:	223572 kB
RssFile:	4 kB
RssShmem:	8084 kB
VmData:	292440 kB
VmStk:	132 kB
VmExe:	1044 kB
VmLib:	6564 kB
VmPTE:	1860 kB
VmSwap:	0 kB
HugetlbPages:	0 kB
HugetlbResvPages:	0 kB
CoreDumping:	0
THP_enabled:	1
Threads:	18
SigQ:	1/1024711
SigPnd:	0000000000000000
ShdPnd:	0000000000010000
SigBlk:	0000000000897827
SigIgn:	0000000000001000
SigCgt:	0000000180000200
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	000000ffffffffff
CapAmb:	0000000000000000
NoNewPrivs:	0
Seccomp:	0
Speculation_Store_Bypass:	thread vulnerable
Cpus_allowed:	ffffffff
Cpus_allowed_list:	0-31
Mems_allowed:	00000000,00000001
Mems_allowed_list:	0
voluntary_ctxt_switches:	21446
nonvoluntary_ctxt_switches:	36

Comment 4
Nate Rini

(In reply to Matt Ezell from comment #3)
> (In reply to Nate Rini from comment #1)
> > Please also provide /proc/../status for slurmctld.
> [root@slurm1.frontier ~]# cat /proc/$(pgrep slurmctld)/status

Please also call:
> ls -la /proc/$(pgrep slurmctld)/fd

Comment 5
Matt Ezell

(In reply to Nate Rini from comment #4)
> Please also call:
> > ls -la /proc/$(pgrep slurmctld)/fd

Sometimes we don't see many FDs:

[root@slurm1.frontier ~]# ls -la /proc/$(pgrep slurmctld)/fd | wc -l
18

And sometimes we do:

[root@slurm1.frontier ~]# ls -la /proc/$(pgrep slurmctld)/fd | wc -l
4099

When the number is high, the FDs are mostly sockets:

[root@slurm1.frontier ~]# ls -la /proc/$(pgrep slurmctld)/fd | tail
lrwx------ 1 slurm slurm 64 Jan 25 11:33 990 -> socket:[17376908]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 991 -> socket:[20265040]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 992 -> socket:[19701217]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 993 -> socket:[19572051]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 994 -> socket:[20267092]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 995 -> socket:[19703838]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 996 -> socket:[19570961]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 997 -> socket:[19698923]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 998 -> socket:[17376909]
lrwx------ 1 slurm slurm 64 Jan 25 11:33 999 -> socket:[19698924]

Comment 7
Nate Rini

(In reply to Matt Ezell from comment #5)
> When the number is high, the FDs are mostly sockets:

Okay, so nothing unexpected here.

> [root@slurm1.frontier ~]# ls -la /proc/$(pgrep slurmctld)/fd | wc -l
> 4099

Looking for the code that sets the soft limit in slurmctld. Please try this patch:
> diff --git a/src/slurmctld/controller.c b/src/slurmctld/controller.c
> index 5e15935..36aac2b 100644
> --- a/src/slurmctld/controller.c
> +++ b/src/slurmctld/controller.c
> @@ -948,7 +948,7 @@ static void _init_config(void)
> {
> struct rlimit rlim;
>
> - rlimits_adjust_nofile();
> + rlimits_use_max_nofile();
> if (getrlimit(RLIMIT_CORE, &rlim) == 0) {
> rlim.rlim_cur = rlim.rlim_max;
> (void) setrlimit(RLIMIT_CORE, &rlim);
Matt Ezell

(In reply to Nate Rini from comment #7)
> Please try this patch:

For various reasons I've been unable to try this yet, but I'm pretty confident it would fix the issue. I think the original reason open files were limited was slowness in closeall() after a fork when there are many possible fds. Hopefully, with more functionality moving to slurmscriptd, forking of slurmctld is not as common.

Nate Rini

Matt,
This fix is now upstream:
> https://github.com/SchedMD/slurm/commit/82f417450686b71f84b088c9d8e811237ca3336d
Closing ticket. Please respond if there are any more related issues.
Thanks,
--Nate
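For anyone verifying the fix on a live system, a quick way to confirm the raised limit took effect is to read the limits file of the running controller (a sketch; it falls back to the current shell's limits if slurmctld is not running):

```shell
# Show the soft/hard open-file limits of the running slurmctld.
pid=$(pgrep -o slurmctld || echo self)
grep "Max open files" "/proc/${pid}/limits"
```

After the fix, the soft limit in the first column should match the hard limit in the second, instead of the old 4096 default.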