Ticket 2584 - x11 functionality in slurm
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 15.08.4
Hardware: Linux Linux
Severity: 3 - Medium Impact
Assignee: Alejandro Sanchez
Reported: 2016-03-24 03:33 MDT by Simran
Modified: 2016-04-13 20:38 MDT
Site: Genentech (Roche)


Description Simran 2016-03-24 03:33:06 MDT
Hi Guys,

We have a requirement to support x11 with slurm and I was wondering what our options would be.  Currently we have our $DISPLAY set to localhost due to the following entry in sshd_config:

X11UseLocalhost yes (default)

we do enable X11 forwarding:

X11Forwarding yes

However, with this setup we are unable to use x11 on compute nodes.  One alternative we found is the slurm-spank-x11 plugin, but it has not been worked on for a few years now.  Even though it works, we are concerned about using a 3rd party plugin that doesn't seem to be updated anymore and might break with future slurm upgrades.  Would schedmd support this plugin if we run into issues with it after future slurm upgrades?  Thought I would ask.

If it would not be supported then are there any other methods besides setting:

X11UseLocalhost no

in our sshd_config in order to get x11 working?  We wanted to get your expert advice before moving forward with a solution that will be used in a production environment.
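
For reference, here is a minimal sketch for checking which values of the two sshd_config keywords above are in effect. It relies on sshd keeping the first value it reads for each keyword and treating keyword names as case-insensitive. The helper name sshd_setting is made up for illustration, and the sketch ignores Match blocks, Include files and the "keyword=value" form, so treat its output as a hint rather than the final word:

```shell
# Print the first value set for KEY in an sshd_config-style FILE,
# or DEFAULT when the keyword is absent or commented out.
# Hypothetical helper: ignores Match blocks, Include and "key=value" lines.
sshd_setting() {
    file=$1
    key=$2
    default=$3
    value=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file")
    printf '%s\n' "${value:-$default}"
}

# Usage (the third argument is the OpenSSH default for that keyword):
#   sshd_setting /etc/ssh/sshd_config X11Forwarding no
#   sshd_setting /etc/ssh/sshd_config X11UseLocalhost yes
```

With X11UseLocalhost at "yes", sshd binds the forwarding listener to the loopback address, which is why a $DISPLAY of the form localhost:N is not reachable from a compute node.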

Thanks for your help with this!

Regards,
-Simran
Comment 4 Alejandro Sanchez 2016-03-28 08:36:06 MDT
Hi Simran. Slurm doesn't natively have x11 capabilities, so a third party plugin is needed. Matthieu's plugin (slurm-spank-x11) is one of the options, and in our experience it works, although the last commit is 2 years old. Some sites develop their own solution, but this one is public and probably the most popular. I tried it myself about a year ago and it worked well. Anyhow, none of these solutions are legitimately supported by SchedMD, but if you encounter any issue with it you can talk to Matthieu directly or let us know so we can report it to him. I personally don't know of any alternative that addresses the X11UseLocalhost concern. Please let us know if you have any other questions related to this.
Comment 5 Alejandro Sanchez 2016-04-07 19:30:38 MDT
Simran, another customer had a similar issue and they managed to make it work with X11UseLocalhost set to yes (default) and this bug came to my mind.

Both slurm-spank-x11-plug.c (line 43) and slurm-spank-x11.c (line 40) contain this definition:

#ifndef X11_LIBEXEC_PROG
#define X11_LIBEXEC_PROG            "/usr/libexec/slurm-spank-x11"
#endif

In the spec file slurm-spank-x11.spec we see:

%build
%{__cc} -g -o slurm-spank-x11 slurm-spank-x11.c
%{__cc} -g -shared -fPIC -o x11.so \
	-D"X11_LIBEXEC_PROG=\"%{_libexecdir}/%{name}\"" \
	slurm-spank-x11-plug.c

so the value for the defined symbol X11_LIBEXEC_PROG can be configured through libexecdir[1].

Please, let us know if you managed to make it work or not. Thank you.

[1] http://www.rpm.org/wiki/PackagerDocs/Macros#MacroAnaloguesofAutoconfVariables
Comment 6 Alejandro Sanchez 2016-04-12 21:50:34 MDT
Hi Simran, any updates on this? Did you manage to make it work? Thanks.
Comment 7 Simran 2016-04-12 22:47:10 MDT
Hi Alejandro,

Unfortunately, it looks like the only supported way to achieve x11 functionality via slurm is to set "X11UseLocalhost no" in our sshd config.  Unless you know of another slurm supported method of achieving this, we can close this bug.

Regards,
-Simran
Comment 8 Alejandro Sanchez 2016-04-13 00:29:20 MDT
Simran, yesterday I rebuilt the slurm-spank-x11 plugin on a test cluster to check its functionality, and I made it work with the default value for X11UseLocalhost (which is yes). So setting "X11UseLocalhost no" is not a requirement for the plugin to work. As more and more people are using this plugin, we're working on documenting a standardized how-to for building and configuring it, but this documentation is not ready yet. Anyhow, see below the steps I followed to build and configure it; maybe they will be useful for you as well:

# Maybe obvious, but don't forget the -X on ssh
$ ssh -X alex@testserver.com

# Get the plugin
alex@testserver:~$ mkdir git
alex@testserver:~$ cd git
alex@testserver:~/git$ git clone https://github.com/hautreux/slurm-spank-x11.git
alex@testserver:~/git$ cd slurm-spank-x11

# Manually edit the X11_LIBEXEC_PROG macro definition
alex@testserver:~/git/slurm-spank-x11$ vi slurm-spank-x11.c
alex@testserver:~/git/slurm-spank-x11$ vi slurm-spank-x11-plug.c
alex@testserver:~/git/slurm-spank-x11$ grep "define X11_" slurm-spank-x11.c
#define X11_LIBEXEC_PROG "/home/alex/slurm/15.08/testserver/libexec/slurm-spank-x11"
alex@testserver:~/git/slurm-spank-x11$ grep "define X11_LIBEXEC_PROG" slurm-spank-x11-plug.c
#define X11_LIBEXEC_PROG "/home/alex/slurm/15.08/testserver/libexec/slurm-spank-x11"
alex@testserver:~/git/slurm-spank-x11$

# Compile
alex@testserver:~/git/slurm-spank-x11$ gcc -g -o slurm-spank-x11 slurm-spank-x11.c
alex@testserver:~/git/slurm-spank-x11$ gcc -g -I/home/alex/slurm/15.08/testserver/include -shared -fPIC -o x11.so slurm-spank-x11-plug.c

# Install
alex@testserver:~/git/slurm-spank-x11$ mkdir -p /home/alex/slurm/15.08/testserver/libexec
alex@testserver:~/git/slurm-spank-x11$ install -m 755 slurm-spank-x11 /home/alex/slurm/15.08/testserver/libexec
alex@testserver:~/git/slurm-spank-x11$ install -m 755 x11.so /home/alex/slurm/15.08/testserver/lib/slurm

# Configure
alex@testserver:~/git/slurm-spank-x11$ echo -e "optional\tx11.so" >> /home/alex/slurm/15.08/testserver/etc/plugstack.conf
alex@testserver:~/git/slurm-spank-x11$ cd ~/tests

# Run
alex@testserver:~/tests$ srun -n1 --pty --x11 xclock
alex@node1's password:
alex@testserver:~/tests$
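
Note that re-running the Configure step appends another copy of the line to plugstack.conf each time. A small sketch of an idempotent variant follows; the function name add_x11_plugin is hypothetical, and the pattern assumes entries use the usual "optional" or "required" keyword followed by the plugin path:

```shell
# Append "optional<TAB>x11.so" to a plugstack.conf unless an x11.so entry
# (optional or required, with or without a directory prefix) already exists.
# Hypothetical helper for illustration only.
add_x11_plugin() {
    conf=$1
    touch "$conf"
    if ! grep -Eq '^[[:space:]]*(optional|required)[[:space:]]+(.*/)?x11\.so' "$conf"; then
        printf 'optional\tx11.so\n' >> "$conf"
    fi
}

# Usage:
#   add_x11_plugin /home/alex/slurm/15.08/testserver/etc/plugstack.conf
```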

I've not yet had time to test a more elegant way to make the plugin work. Probably, with a ~/.rpmmacros that specifies %{_libexecdir} and %{_libdir}, the build process can be automated with rpm or rpmbuild. We're working on this.

Another concern is that the gcc command in the spec file that compiles slurm-spank-x11-plug.c doesn't pass an include path, which I used in the build above:
-I/home/alex/slurm/15.08/testserver/include
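
The manual edit of X11_LIBEXEC_PROG in the steps above could probably be avoided by passing the value with -D, as the spec file does for the plugin. This is an untested minimal sketch, assuming both source files keep the #ifndef guard quoted in comment 5. The function names libexec_define and build_plugin are made up for illustration, and the default prefix is the one from the test install above:

```shell
# Assumption: PREFIX is the Slurm install prefix (path from the steps above).
PREFIX="${PREFIX:-/home/alex/slurm/15.08/testserver}"

# Print the macro definition, including the C string quotes around the path.
libexec_define() {
    printf 'X11_LIBEXEC_PROG="%s/libexec/slurm-spank-x11"' "$1"
}

# Compile the helper and the plugin with the same macro value and include path.
# Run this from inside the slurm-spank-x11 checkout.
build_plugin() {
    def=$(libexec_define "$1")
    gcc -g -D"$def" -o slurm-spank-x11 slurm-spank-x11.c &&
    gcc -g -I"$1/include" -D"$def" -shared -fPIC -o x11.so slurm-spank-x11-plug.c
}

# Usage:
#   build_plugin "$PREFIX"
```

Keeping the value in one place means the helper binary and x11.so cannot end up with different paths, which is easy to get wrong when editing the two files by hand.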

Please try to rebuild and configure the plugin in a way similar to this procedure and let me know whether it works. Again, for me it works with X11UseLocalhost not defined, so with the default 'yes' value.
Comment 9 Simran 2016-04-13 00:36:39 MDT
Hi Alejandro,

Sorry, I should have been a bit more clear.  I have already tried slurm-spank-x11 and it works fine.  However, the slurm-spank-x11 plugin has not been worked on for a few years now and even though it works we have a concern with using a 3rd party plugin that doesn't seem to be updated anymore, and might break with future slurm upgrades.  From your previous response:

--
Anyhow, none of these solutions are legitimately supported by SchedMD
--

This makes us not want to use this plugin if it won't be supported by schedmd.  If this plugin was part of slurm or schedmd would support it then we would feel comfortable deploying it in production.

Thanks,
-Simran
Comment 10 Alejandro Sanchez 2016-04-13 00:41:25 MDT
Well, yesterday we were discussing adding this to contribs for version 16.05 with an internal tunnelling mechanism for the next major release. But for now it is a 3rd party plugin, and it currently isn't legitimately supported by SchedMD.
Comment 11 Alejandro Sanchez 2016-04-13 00:46:20 MDT
Note also that the plugin not being updated doesn't mean it lacks functionality. If it works, there's not necessarily a need for updates :) Anyhow, I understand that not seeing recent updates can cause this "abandoned project sensation".
Comment 12 Simran 2016-04-13 00:52:15 MDT
Thanks Alejandro.  We would definitely be interested in getting this added into slurm and would use this functionality if supported by schedmd :)
Comment 13 Alejandro Sanchez 2016-04-13 20:24:45 MDT
Great. Is it fine to close this bug? Thanks.
Comment 14 Simran 2016-04-13 20:37:36 MDT
Yes, please feel free to close this bug.

Thanks,
-Simran
Comment 15 Alejandro Sanchez 2016-04-13 20:38:57 MDT
All right. Marking as resolved.