We've recently upgraded to 19.05.1. Per the documentation, adding and subtracting nodes, and certain other slurm.conf changes, necessitate a global restart of all the slurmd's as well as slurmctld. When we do this via salt on 2000+ nodes we get this error:

Aug 6 15:16:09 holy-slurm02 slurmctld[23313]: fatal: locks.c:128 lock_slurmctld: pthread_rwlock_rdlock(): Resource temporarily unavailable
Aug 6 15:16:09 holy-slurm02 slurmctld[23313]: fatal: locks.c:128 lock_slurmctld: pthread_rwlock_rdlock(): Resource temporarily unavailable
Aug 6 15:16:09 holy-slurm02 systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE

If I then restart slurmctld it works fine, but it seems that slurmctld doesn't deal well with the stampeding herd of restarts that a global simultaneous restart produces. Previous versions have always worked fine for global restarts, so this is a new issue with 19.05.x.

-Paul Edmon-
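As a workaround, the restarts could be throttled rather than fired all at once; a minimal sketch using salt's batch mode (the 'holy*' minion glob and the 10% batch size are illustrative, not our actual state files, and this assumes slurmd is managed as a systemd unit on the minions):

# Restart slurmd on at most 10% of the matched minions at a time,
# instead of all 2000+ nodes simultaneously.
salt --batch-size 10% 'holy*' service.restart slurmd

# Restart the controller separately, once the fleet has settled.
systemctl restart slurmctld

Batching means only a fraction of the nodes hit slurmctld with re-registrations at any one moment, which is the point of the sketch.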
Paul,

Did you perform any other configuration/environment changes when upgrading Slurm? I assume that you're on a "systemd distro"; could you please share the results of:

>systemctl status slurmctld
>cat /proc/SLURMCTLDPID/limits

and the slurmctld logs?

cheers,
Marcin
So here is a list of the changes we made to the conf (if you look at ticket 7532 you can see our full conf):

RoutePlugin=route/topology
TopologyPlugin=topology/tree
SlurmctldParameters=preempt_send_user_signal
PrologFlags=Contain,X11
AccountingStorageTRES=Billing,CPU,Energy,Mem,Node,FS/Disk,FS/Lustre,Pages,VMem,IC/OFED,gres/gpu
AcctGatherInfinibandType=acct_gather_infiniband/ofed
AcctGatherFilesystemType=acct_gather_filesystem/lustre
JobAcctGatherFrequency=task=30,network=30,filesystem=30
LaunchParameters=mem_sort,slurmstepd_memlock_all
DefCpuPerGPU=1
DefMemPerGPU=100
GpuFreqDef=low
SelectType=select/cons_tres
PriorityFlags=NO_FAIR_TREE

We also added permit_job_expansion and reduce_completing_frag to SchedulerParameters.

[root@holy-slurm02 slurm]# systemctl status slurmctld
● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/usr/lib/systemd/system/slurmctld.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/slurmctld.service.d
           └─50-ulimit.conf
   Active: active (running) since Wed 2019-08-07 09:40:35 EDT; 23min ago
  Process: 155826 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 155828 (slurmctld)
    Tasks: 722
   Memory: 10.4G
   CGroup: /system.slice/slurmctld.service
           └─155828 /usr/sbin/slurmctld

Aug 07 10:04:29 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: select/cons_tres: _eval_nodes_topo: insufficient resources currently available for JobId=17843175
Aug 07 10:04:29 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: select/cons_tres: _eval_nodes_topo: insufficient resources currently available for JobId=17843175
Aug 07 10:04:29 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: select/cons_tres: _eval_nodes_topo: insufficient resources currently available for JobId=17843175
Aug 07 10:04:29 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: sched: Allocate JobId=17917266 NodeList=holy2b17201 #CPUs=4 Partition=hoekstra
Aug 07 10:04:29 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: _slurm_rpc_submit_batch_job: JobId=17917267 InitPrio=1856652 usec=28043
Aug 07 10:04:30 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: sched: Allocate JobId=17917267 NodeList=holyitc14 #CPUs=1 Partition=itc_cluster
Aug 07 10:04:30 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: _job_complete: JobId=17917234 WEXITSTATUS 1
Aug 07 10:04:30 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: _job_complete: JobId=17917234 done
Aug 07 10:04:30 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: prolog_running_decr: Configuration for JobId=17917265 is complete
Aug 07 10:04:30 holy-slurm02.rc.fas.harvard.edu slurmctld[155828]: Extending JobId=17917265 time limit by 1 secs for configuration

[root@holy-slurm02 slurm]# cat /proc/155828/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            unlimited            unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1030065              1030065              processes
Max open files            8192                 8192                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1030065              1030065              signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

As for the slurmctld logs, those are massive, so sending them all may not be productive.
Do you have a smaller timeslice of the logs you want to see?

-Paul Edmon-
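For completeness: the 50-ulimit.conf drop-in shown in the systemctl status above is how per-process limits get raised for slurmctld. A sketch of what such a drop-in typically looks like (the directive values are illustrative, not the contents of our actual file):

# Illustrative only -- a systemd drop-in is the standard way to raise limits
# (e.g. open files, locked memory) for a unit without editing the packaged
# slurmctld.service file.
mkdir -p /etc/systemd/system/slurmctld.service.d
cat > /etc/systemd/system/slurmctld.service.d/50-ulimit.conf <<'EOF'
[Service]
LimitNOFILE=131072
LimitMEMLOCK=infinity
EOF
systemctl daemon-reload
systemctl restart slurmctld

After the daemon-reload and restart, the new values show up in /proc/<slurmctld pid>/limits, as in the output above.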
Paul,

Did you try to perform a "global restart" of slurmd after the patch from bug 7532? I'm not 100% sure, but it may be related.

cheers,
Marcin
Yes, everything is stable since we got that patch. So I would mark this one as resolved.

-Paul Edmon-
Thanks for the quick reply. I'm closing this as a duplicate.

cheers,
Marcin

*** This ticket has been marked as a duplicate of ticket 7532 ***