Summary: | systemctl start/stop does not work on RHEL 7 | |
---|---|---|---
Product: | Slurm | Reporter: | Nancy <nancy.kritkausky>
Component: | slurmd | Assignee: | David Bigagli <david>
Status: | RESOLVED INFOGIVEN | QA Contact: |
Severity: | 3 - Medium Impact | |
Priority: | --- | CC: | adam.huffman, brandon.barker, brian, da, david.gloe, doug.parisek, nancy.kritkausky, Ole.H.Nielsen, sven.sternberger, yiannis.georgiou
Version: | 14.03.8 | |
Hardware: | Linux | |
OS: | Linux | |
Site: | Universitat Dresden (Germany) | Alineos Sites: | ---
Atos/Eviden Sites: | --- | Confidential Site: | ---
Coreweave sites: | --- | Cray Sites: | ---
DS9 clusters: | --- | HPCnow Sites: | ---
HPE Sites: | --- | IBM Sites: | ---
NOAA Site: | --- | NoveTech Sites: | ---
Nvidia HWinf-CS Sites: | --- | OCF Sites: | ---
Recursion Pharma Sites: | --- | SFW Sites: | ---
SNIC sites: | --- | Linux Distro: | ---
Machine Name: | | CLE Version: |
Version Fixed: | | Target Release: | ---
DevPrio: | --- | Emory-Cloud Sites: | ---
Attachments: | Fix for described problem | |
Description
Nancy
2014-10-17 04:26:27 MDT
This seems to be a systemctl problem. These commands hang:

    /bin/systemctl start slurm
    /bin/systemctl start slurm.service

If you strace it you will see it is waiting for some reply from somebody :-)

    recvmsg(3, 0x7fffc47c64c0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
    poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
    recvmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1q\0\0\0\241\5\0\0x\0\0\0\1\1o\0\31\0\0\0/org/fre"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 249
    recvmsg(3, 0x7fffc47c68a0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
    poll([{fd=3, events=POLLIN}], 1, 4294967295^C
    Process 5700 detached

The workaround is to cd to /etc/init.d and then run the script; in that case systemctl is out of the picture and everything works.

I will try to find out what's going on. I don't have any cgroup configured. Why do you think it is cgroup related?

David

If you don't run slurmd and slurmctld together on any nodes, a workaround I found is to set SlurmdPidFile=/var/run/slurmctld.pid in slurm.conf. My guess is that systemd just takes the last pidfile comment in the init script as the pid file of the service.

David

David and David, thank you for the information and for looking at the problem. The information I provided does show a problem with the pid file, but we had already corrected that part of the problem. The workaround of starting from the init.d script does work, but it is a difficult workaround when we have a large cluster.

Thanks for the info,
Nancy

David B., I took cgroups out of my configuration and you are right, it doesn't change anything. So I think we can take cgroups out of the equation.

Nancy

I am playing around with the unit files. I think we have to use those instead of the /etc/init.d scripts. I will let you know when I get it to work. You may want to have a look at this in the meantime: https://wiki.archlinux.org/index.php/systemd

David

Something simple like this works for me. I created two unit files:

    root@phobos /usr/lib/systemd/system>cat david.service
    [Unit]
    Description=David server
    After=syslog.target network.target auditd.service

    [Service]
    Type=forking
    ExecStart=/sbin/slurmctld -vvv
    Restart=on-failure
    RestartSec=42s

    [Install]
    WantedBy=multi-user.target

and

    root@phobos /usr/lib/systemd/system>cat zebra.service
    [Unit]
    Description=Zebra server
    After=syslog.target network.target auditd.service

    [Service]
    Type=forking
    ExecStart=/sbin/slurmd -vvv
    Restart=on-failure
    RestartSec=42s

    [Install]
    WantedBy=multi-user.target

one for the controller and another for slurmd. Then start them:

    root@phobos /usr/lib/systemd/system>systemctl start david.service
    root@phobos /usr/lib/systemd/system>systemctl start zebra.service
    root@phobos /usr/lib/systemd/system>ps -ef|grep slurm|grep -v grep
    david    14726     1  0 14:19 ?  00:00:00 /sbin/slurmctld -vvv
    root     14783     1  0 14:20 ?  00:00:00 /sbin/slurmd -vvv
    root@phobos /usr/lib/systemd/system>sinfo
    PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
    marte*       up   infinite      1  down* deimos
    marte*       up   infinite      1   idle phobos

David
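As a side note (not from the original thread): when systemd runs the legacy /etc/init.d/slurm script it does so through an auto-generated unit, and it can help to inspect what systemd actually derived from the script. A minimal diagnostic sketch, assuming the stock systemd shipped with RHEL/CentOS 7; `systemctl cat` only exists on newer systemd builds, older ones can read the generated file directly:

    # show the unit systemd-sysv-generator built from /etc/init.d/slurm
    systemctl cat slurm.service                       # newer systemd
    cat /run/systemd/generator.late/slurm.service     # fallback on older systemd

    # check which single pid file systemd associated with the service
    systemctl show slurm.service -p PIDFile -p MainPID

If PIDFile points at only one of the two daemons' pid files, that matches the hang described above, since systemd waits on the wrong pid file after the script returns.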
We're Slurm newbies starting to set up Slurm 14.11.0-0rc2 on CentOS 7 nodes, which is going to be our next-generation cluster setup. I can fully confirm the above bug in a very simple Slurm setup. As a temporary workaround on a compute node I had to make this PidFile configuration in slurm.conf:

    SlurmdPidFile=/var/run/slurmctld.pid

and start the daemons the old way:

    cd /etc/init.d
    ./slurm start

I'd love to get a proper bug fix; what are the chances of that?

Thanks, Ole

Nancy, did you have a chance to try the service files I posted?

David

This is not a Slurm problem but rather a systemd one. I figured out that systemd/systemctl is very sensitive to the existence of the PIDFile: if this variable includes a directory that does not exist, systemctl hangs after it starts slurmd instead of returning an error. The solution is to specify a correct path or comment it out.

David

David, then what's wrong with this?

    SlurmctldPidFile=/var/run/slurmctld.pid
    SlurmdPidFile=/var/run/slurmd.pid

The /var/run/ directory surely exists already. Can you give an explicitly working example for slurm.conf?

I opened a systemd bug on this. I think systemd doesn't handle init scripts with multiple pid files listed. https://bugs.freedesktop.org/show_bug.cgi?id=85297

On CentOS 7 this works for me:

    root@phobos /usr/lib/systemd/system>cat slurmd.service
    [Unit]
    Description=Slurm node daemon
    After=network.target
    ConditionPathExists=/home/david/cluster/1411/linux/etc/slurm.conf

    [Service]
    Type=forking
    EnvironmentFile=/home/david/cluster/1411/linux/etc/defaults
    ExecStart=/home/david/cluster/1411/linux/sbin/slurmd $SLURMD_OPTIONS
    PIDFile=/var/slurm/pid/slurmd.pid

    [Install]
    WantedBy=multi-user.target

    root@phobos /usr/lib/systemd/system>cat slurmctld.service
    [Unit]
    Description=Slurm controller daemon
    After=network.target
    ConditionPathExists=/home/david/cluster/1411/linux/etc/slurm.conf

    [Service]
    Type=forking
    EnvironmentFile=/home/david/cluster/1411/linux/etc/defaults
    ExecStart=/home/david/cluster/1411/linux/sbin/slurmctld $SLURMCTLD_OPTIONS
    PIDFile=/var/slurm/pid/slurmctld.pid

    [Install]
    WantedBy=multi-user.target

These are the templates Slurm installs in the etc directory where the examples are. Then run 'systemctl enable slurmd' and 'systemctl enable slurmctld' and start the services. When the machine reboots, slurmd is started as well. I don't know why they invented systemd or why they think it is a good idea, but that's not a question for me. :-) I was doing fine with /etc/rc.local.

David

I'm having similar symptoms on CentOS 7 on my *controller* node (I haven't tried worker nodes yet). I'm new to both systemd and Slurm (administration), and was amused to see activity on this as recent as yesterday, which partly makes up for the sysadmin woes. Just to get on the same page, I'm curious what the specific contents of the EnvironmentFile you have specified are, and whether there is a default that you are using?

Thanks, Brandon

These files should be considered templates, just like the examples of slurm.conf and release_agent in the etc directory. In my testing I used a defaults file like this, setting the command-line options for the daemons:

    david@phobos ~/cluster/1411/linux/etc>cat defaults
    SLURMCTLD_OPTIONS=-vvv
    SLURMD_OPTIONS=-vvv

David

Please reopen this bug because it hasn't been resolved at all! I have confirmed that it's still present in Slurm 14.11. From the systemd bug: "Hmm, listing two pidile entries in the headers is an extension that is not supported by systemd, sorry, and it's unlikely to be supported. your daemon really shouldn't ship thing with an extension like that..." On systemd systems you really need to set up the service files or use my pidfile workaround for Slurm to work correctly.
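An earlier comment notes that systemctl hangs when the directory in PIDFile does not exist, and the templates above point PIDFile at /var/slurm/pid/. Not from the original thread, but one way to make sure such a directory is recreated after every boot is a tmpfiles.d entry; this is only a sketch, and the file name, path, owner and mode are assumptions that would need to match the local setup:

    # /etc/tmpfiles.d/slurm.conf  (hypothetical file name)
    # create the pid directory at boot; adjust owner/group to the user running the daemons
    d /var/slurm/pid 0755 root root -

    # apply the entry immediately without rebooting
    systemd-tmpfiles --create /etc/tmpfiles.d/slurm.conf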
How about removing the two pidfile and processname tag lines in the slurm init.d script?

    brian@compy:~/slurm/14.11/slurm/etc$ git diff
    diff --git a/etc/init.d.slurm.in b/etc/init.d.slurm.in
    index 5387a9e..4741ccb 100644
    --- a/etc/init.d.slurm.in
    +++ b/etc/init.d.slurm.in
    @@ -5,12 +5,6 @@
     # manages exclusive access to a set of compute \
     # resources and distributes work to those resources.
     #
    -# processname: @sbindir@/slurmd
    -# pidfile: /var/run/slurmd.pid
    -#
    -# processname: @sbindir@/slurmctld
    -# pidfile: /var/run/slurmctld.pid
    -#
     # config: /etc/sysconfig/slurm
     #
     ### BEGIN INIT INFO

This makes the init script work for me on CentOS 7 and CentOS 6. The init script can grab the corresponding pids independently of the pidfile tag lines: the status() function greps the Slurm[ctl]dPidFile setting out of slurm.conf and matches the pid in that file against the pid of the process with the given daemon name, and stop()/killproc() kills the daemon by using the daemon name to get the pid. It looks, and feels, safe to do. Can anyone see any adverse effects of removing the pidfile and processname tag lines?

Hello, the bug is still present in 14.11.8 and CentOS 7.1. As I read it, the systemd developer refuses to fix this. The workaround for now is to remove the pid-file line if the corresponding service won't be started, but this means that we have different config files for different nodes. It would be nice if the service could be split up into two unit files.

Sven

Hi, did you try the scripts in comment 12?

David

(In reply to David Bigagli from comment #19)
> Hi did you try the scripts in comment 12?
>
> David

I had missed that the files are already installed, and indeed this looks much better. The only small issues:

- it fails if the defaults file is not in place
- the location of the pid file is not synced with the definition in /etc/slurm/slurm.conf

but this is easily adjustable, so for me it is fixed. For the developers it would be nice if they could clean up the situation. Many thanks!

Sven

Hi, these issues are actually fixed as well. If you look in the source code in the etc directory you will find the .in files, e.g. slurmctld.service.in. During the configuration phase before the build these template files are filled in with the configure options. The examples in comment 12 are from after the software was configured.

David
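To illustrate what such a template looks like, here is a paraphrased sketch of a slurmd.service.in file; it is not a verbatim copy of the file shipped in the Slurm source, and the placeholder paths are assumptions. The .in files use autoconf-style placeholders (e.g. @sbindir@) that configure substitutes at build time. A leading '-' on EnvironmentFile, a standard systemd feature, would also address the first of the issues Sven lists above by making a missing defaults file non-fatal:

    # sketch of an etc/slurmd.service.in template (illustrative only)
    [Unit]
    Description=Slurm node daemon
    After=network.target
    ConditionPathExists=@sysconfdir@/slurm.conf

    [Service]
    Type=forking
    # the leading '-' makes systemd ignore the file if it does not exist
    EnvironmentFile=-@sysconfdir@/default/slurm
    ExecStart=@sbindir@/slurmd $SLURMD_OPTIONS
    PIDFile=/var/run/slurmd.pid

    [Install]
    WantedBy=multi-user.target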
Created attachment 2173 [details]
Fix for described problem
Comment on attachment 2173 [details] (Fix for described problem)

As I was asked to confirm this bug, I added my own screen captures in the attachment, which support comment #20.

Yes, this is a known issue: the location of the pid file must be a valid one. The default directory in slurm.conf is /var/run; if the path changes, then the startup file must be updated accordingly, otherwise systemd does not work properly.

David
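As a closing illustration (not part of the original thread), one way to keep the installed unit files untouched while pointing PIDFile at whatever SlurmdPidFile is set to in slurm.conf is a per-node drop-in override; this is only a sketch, and the drop-in file name and pid path are assumptions that must match the local slurm.conf:

    # /etc/systemd/system/slurmd.service.d/pidfile.conf  (hypothetical drop-in)
    [Service]
    # must match SlurmdPidFile in slurm.conf
    PIDFile=/var/run/slurmd.pid

    # reload systemd so the drop-in takes effect, then restart the daemon
    systemctl daemon-reload
    systemctl restart slurmd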