On Red Hat 7, when we try to start slurmd on a compute node it does not work. The slurmctld DOES work on the management node. We think it may have something to do with cgroups, but cannot confirm that. Below is the output we get when we try to start it, followed by the results of the journalctl -xn command and some information about cgroups. Just running systemctl start slurm does not give much information, but systemctl status slurm.service gives more, as shown here:

[root@rama25 ~]# systemctl status slurm.service
slurm.service - LSB: slurm daemon management
   Loaded: loaded (/etc/rc.d/init.d/slurm)
   Active: failed (Result: resources) since Fri 2014-10-17 16:58:15 CEST; 10s ago
  Process: 10829 ExecStart=/etc/rc.d/init.d/slurm start (code=exited, status=0/SUCCESS)

Oct 17 16:58:15 rama25.bullx slurm[10829]: starting slurmd:
Oct 17 16:58:15 rama25.bullx systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
Oct 17 16:58:15 rama25.bullx systemd[1]: Failed to start LSB: slurm daemon management.
Oct 17 16:58:15 rama25.bullx systemd[1]: Unit slurm.service entered failed state.
Oct 17 16:58:20 rama25.bullx systemd[1]: Stopped LSB: slurm daemon management.

[root@rama25 ~]# systemctl restart slurm
Job for slurm.service failed. See 'systemctl status slurm.service' and 'journalctl -xn' for details.
[root@rama25 ~]# systemctl start slurm
Job for slurm.service failed. See 'systemctl status slurm.service' and 'journalctl -xn' for details.
[root@rama25 ~]# systemctl stop slurm
[root@rama25 ~]# ps -aux|grep slurm
root      3368  0.0  0.0 198948  2412 ?      Sl   16:41   0:00 slurmd
root     10873  0.0  0.0 112644   980 pts/1  S+   17:00   0:00 grep --color=auto slurm

Output of 'journalctl -xn':
-- Logs begin at Fri 2014-10-17 14:32:02 CEST, end at Fri 2014-10-17 17:04:29 CEST. --
Oct 17 17:01:01 rama25.bullx run-parts(/etc/cron.hourly)[10885]: finished 0anacron
Oct 17 17:01:01 rama25.bullx run-parts(/etc/cron.hourly)[10887]: starting 0yum-hourly.cron
Oct 17 17:01:01 rama25.bullx run-parts(/etc/cron.hourly)[10891]: finished 0yum-hourly.cron
Oct 17 17:01:39 rama25.bullx collectd[1517]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
Oct 17 17:03:46 rama25.bullx collectd[1517]: Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
Oct 17 17:04:29 rama25.bullx systemd[1]: Starting LSB: slurm daemon management...
-- Subject: Unit slurm.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurm.service has begun starting up.
Oct 17 17:04:29 rama25.bullx slurm[10904]: starting slurmd:
Oct 17 17:04:29 rama25.bullx systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.
Oct 17 17:04:29 rama25.bullx systemd[1]: Failed to start LSB: slurm daemon management.
-- Subject: Unit slurm.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit slurm.service has failed.
--
-- The result is failed.
Oct 17 17:04:29 rama25.bullx systemd[1]: Unit slurm.service entered failed state.
I don't know why it checks for slurmctld.pid, but I don't think that is the reason for the problem. It is more likely related to cgroups.

[root@rama25 ~]# cat /etc/slurm/cgroup.conf
###
#
# Slurm cgroup support configuration file
#
# See man slurm.conf and man cgroup.conf for further
# information on cgroup configuration parameters
#--
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup/"
ConstrainCores=yes
ConstrainRAMSpace=no
#CgroupMountOptions="cpuset"
CgroupMountpoint=/sys/fs/cgroup/

[root@mslu-1 georgioy]# mount|grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
This seems to be a systemctl problem. These commands hang:

/bin/systemctl start slurm
/bin/systemctl start slurm.service

If you strace it you will see it is waiting for a reply from somebody :-)

recvmsg(3, 0x7fffc47c64c0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1q\0\0\0\241\5\0\0x\0\0\0\1\1o\0\31\0\0\0/org/fre"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 249
recvmsg(3, 0x7fffc47c68a0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, 4294967295^CProcess 5700 detached

The workaround is to cd to /etc/init.d and run the script directly; that takes systemctl out of the picture and everything works. I will try to find out what's going on. I don't have any cgroups configured. Why do you think it is cgroup related?

David
If you don't run slurmd and slurmctld together on any node, a workaround I found is to set SlurmdPidFile=/var/run/slurmctld.pid in slurm.conf. My guess is that systemd just takes the last pidfile comment in the init script as the pid file of the service.
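To make that workaround concrete, a minimal sketch of the relevant slurm.conf line on a compute-only node (the path is taken from the comment above) would be:

# slurm.conf on a node that runs only slurmd
# Workaround for the single LSB init script: make slurmd write the pid
# file name that systemd ends up watching for.
SlurmdPidFile=/var/run/slurmctld.pid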
David and David, thank you for the information and for looking at the problem. The information I provided does show a problem with the pid file, but we had already corrected that part of the problem. The workaround of starting it from the init.d script does work, but it is a difficult workaround when we have a large cluster. Thanks for the info, Nancy
David B. I took cgroups out of my configuration and you are right, it doesn't change anything. So, I think we can take cgroups out of the equation. Nancy
I am playing around with the unit files. I think we have to use those instead of the /etc/init.d scripts. I will let you know when I get it to work. In the meantime you may want to have a look at this: https://wiki.archlinux.org/index.php/systemd David
Something simple like this works for me. I created 2 unit files:

root@phobos /usr/lib/systemd/system>cat david.service
[Unit]
Description=David server
After=syslog.target network.target auditd.service

[Service]
Type=forking
ExecStart=/sbin/slurmctld -vvv
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target

and

root@phobos /usr/lib/systemd/system>cat zebra.service
[Unit]
Description=Zebra server
After=syslog.target network.target auditd.service

[Service]
Type=forking
ExecStart=/sbin/slurmd -vvv
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target

one for the controller and another for slurmd. Then start them:

root@phobos /usr/lib/systemd/system>systemctl start david.service
root@phobos /usr/lib/systemd/system>systemctl start zebra.service
root@phobos /usr/lib/systemd/system>ps -ef|grep slurm|grep -v grep
david    14726     1  0 14:19 ?        00:00:00 /sbin/slurmctld -vvv
root     14783     1  0 14:20 ?        00:00:00 /sbin/slurmd -vvv
root@phobos /usr/lib/systemd/system>sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
marte*       up   infinite      1  down* deimos
marte*       up   infinite      1   idle phobos

David
We're Slurm newbies starting to set up Slurm 14.11.0-0rc2 on CentOS 7 nodes, which is going to be our next-generation cluster setup. I can fully confirm the above bug in a very simple Slurm setup. As a temporary workaround on a compute node I had to make this PidFile configuration in slurm.conf:

SlurmdPidFile=/var/run/slurmctld.pid

and start the daemons the old way:

cd /etc/init.d
./slurm start

I'd love to get a proper bug fix; what are the chances of that? Thanks, Ole
Nancy, did you have a chance to try the service files I posted? David
This is not a Slurm problem but rather a systemd one. I figured out that systemd/systemctl is very sensitive to the existence of the PIDFile: if this variable includes a directory that does not exist, systemctl hangs after it starts slurmd instead of returning an error. The solution is to specify a correct path or comment it out. David
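A quick way to check for that situation (a sketch, assuming the config lives in /etc/slurm/slurm.conf) is to compare the configured pid file paths against the directories that actually exist:

# show which pid file paths the daemons are configured to use
grep -i pidfile /etc/slurm/slurm.conf
# verify the directory part exists; /var/run is the usual default
ls -ld /var/run
# if slurm.conf points at a subdirectory such as /var/run/slurm, create it first:
# mkdir -p /var/run/slurm

If the directory is missing, either create it or point the pid file settings at an existing path, as described above.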
David, then what's wrong with this:

SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid

The /var/run/ directory surely exists already. Can you give an explicitly working example for slurm.conf?
I opened a systemd bug on this. I think systemd doesn't handle init scripts with multiple pid files listed. https://bugs.freedesktop.org/show_bug.cgi?id=85297
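For reference, the LSB comment header of the shipped init script template (visible in the diff further down this thread) carries two processname/pidfile pairs, which is exactly the unsupported extension:

# processname: @sbindir@/slurmd
# pidfile: /var/run/slurmd.pid
#
# processname: @sbindir@/slurmctld
# pidfile: /var/run/slurmctld.pid

If systemd only honours one pidfile line per service, that would explain why it ends up waiting for /var/run/slurmctld.pid even on nodes that only run slurmd.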
On CentOS 7 this works for me:

root@phobos /usr/lib/systemd/system>cat slurmd.service
[Unit]
Description=Slurm node daemon
After=network.target
ConditionPathExists=/home/david/cluster/1411/linux/etc/slurm.conf

[Service]
Type=forking
EnvironmentFile=/home/david/cluster/1411/linux/etc/defaults
ExecStart=/home/david/cluster/1411/linux/sbin/slurmd $SLURMD_OPTIONS
PIDFile=/var/slurm/pid/slurmd.pid

[Install]
WantedBy=multi-user.target

root@phobos /usr/lib/systemd/system>cat slurmctld.service
[Unit]
Description=Slurm controller daemon
After=network.target
ConditionPathExists=/home/david/cluster/1411/linux/etc/slurm.conf

[Service]
Type=forking
EnvironmentFile=/home/david/cluster/1411/linux/etc/defaults
ExecStart=/home/david/cluster/1411/linux/sbin/slurmctld $SLURMCTLD_OPTIONS
PIDFile=/var/slurm/pid/slurmctld.pid

[Install]
WantedBy=multi-user.target

These are the templates Slurm installs in the etc directory where the examples are. Then run 'systemctl enable slurmd' and 'systemctl enable slurmctld' and start the services. When the machine reboots, slurmd is started as well. I don't know why they invented systemd or why they think it is a good idea, but that's not a question for me. :-) I was doing fine with /etc/rc.local.

David
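Spelled out, the enable/start sequence from the comment above looks like this on a node that runs both daemons (unit names as above):

systemctl enable slurmd slurmctld
systemctl start slurmd slurmctld
systemctl status slurmd slurmctld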
I'm having similar symptoms on CentOS 7 on my *controller* node (haven't tried worker nodes yet). I'm new to both systemd and Slurm (administration), and was amused to see activity on this as recently as yesterday, which partly makes up for the sysadmin woes. Just to get on the same page: what are the specific contents of the EnvironmentFile you have specified, and is there a default you are using? Thanks, Brandon
These files should be considered templates, just like the examples of slurm.conf and release_agent in the etc directory. In my testing I used a defaults file like this:

david@phobos ~/cluster/1411/linux/etc>cat defaults
SLURMCTLD_OPTIONS=-vvv
SLURMD_OPTIONS=-vvv

setting the command line options for the daemons.

David
Please reopen this bug because it hasn't been resolved at all! I have confirmed that it's still present in SLURM 14.11.
From the systemd bug: "Hmm, listing two pidfile entries in the headers is an extension that is not supported by systemd, sorry, and it's unlikely to be supported. your daemon really shouldn't ship thing with an extension like that..."

On systemd systems you really need to set up the service files or use my pidfile workaround for Slurm to work correctly.
How about removing the two pidfile, and processname, tag lines from the slurm init.d script?

brian@compy:~/slurm/14.11/slurm/etc$ git diff
diff --git a/etc/init.d.slurm.in b/etc/init.d.slurm.in
index 5387a9e..4741ccb 100644
--- a/etc/init.d.slurm.in
+++ b/etc/init.d.slurm.in
@@ -5,12 +5,6 @@
 #              manages exclusive access to a set of compute \
 #              resources and distributes work to those resources.
 #
-# processname: @sbindir@/slurmd
-# pidfile: /var/run/slurmd.pid
-#
-# processname: @sbindir@/slurmctld
-# pidfile: /var/run/slurmctld.pid
-#
 # config: /etc/sysconfig/slurm
 #
 ### BEGIN INIT INFO

This makes the init script work for me on CentOS 7 and CentOS 6. The init script can grab the corresponding pids independently of the pidfile tag line. The status() function greps the Slurm[ctl]dPidFile setting out of slurm.conf and matches the pid in that file against the pid of the process with the given daemon name. And stop()/killproc() kills the daemon by using the daemon name to get the pid. It looks, and feels, safe to do. Can anyone see any adverse effects of removing the pidfile and processname tag lines?
Hello, the bug is still present in 14.11.8 and CentOS 7.1. As I read it, the systemd developer refuses to fix this. The workaround for now is to remove the pid file line for the service that won't be started, but this means that we have different config files for different nodes. It would be nice if the service could be split up into two unit files. Sven
Hi, did you try the scripts in comment 12? David
(In reply to David Bigagli from comment #19)
> Hi did you try the scripts in comment 12?
>
> David

I had missed that the files are already installed, and indeed this looks much better. The only small issues:
- it fails if the defaults file is not in place
- the location of the pid file is not synced with the definition in /etc/slurm/slurm.conf

but both are easily adjustable (see the sketch below), so for me it is fixed. For the developers it would be nice if they could clean up the situation. Many thanks! Sven
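For what it's worth, both points can be handled in the unit file itself; a sketch of the [Service] section, where the file locations are assumptions that need adjusting to the local install:

[Service]
Type=forking
# the leading '-' makes the environment file optional, so the unit no
# longer fails when the defaults file is not in place
EnvironmentFile=-/etc/default/slurm
ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS
# keep this path identical to SlurmdPidFile in /etc/slurm/slurm.conf
PIDFile=/var/run/slurmd.pid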
Hi, these issues are actually fixed as well. If you look in the source code in the etc directory you will find the .in files, e.g. slurmctld.service.in. During the configuration phase before the build these template files are filled in with the configure options. The examples in comment 12 are from after the software was configured. David
Created attachment 2173 [details] Fix for described problem
Comment on attachment 2173 [details] Fix for described problem

As I was asked to confirm this bug, I added my own screen captures in the attachment, which support comment #20.
Yes, this is a known issue: the location of the pid file must be a valid one. The default directory in slurm.conf is /var/run; if the path changes, then the startup file must be updated accordingly, otherwise systemd does not work properly. David
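To make that concrete (a sketch; the non-default directory /var/run/slurm is only an illustration), the two settings that must agree are:

# /etc/slurm/slurm.conf
SlurmdPidFile=/var/run/slurm/slurmd.pid

# slurmd.service
[Service]
PIDFile=/var/run/slurm/slurmd.pid

and the directory itself has to exist before the daemon starts, for example via a tmpfiles.d entry such as 'd /var/run/slurm 0755 root root -'.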