Locking on GPFS can be slow. I sometimes get this:

    -bash-4.1$ salloc --x11 -N1 -n 20 -t 30 --exclusive -p dav --account=sssg0001 srun -N1 --pty /bin/bash -l
    salloc: error: run_command: xauth poll timeout @ 100 msec
    salloc: error: x11_get_xauth: Could not retrieve magic cookie. Cannot use X11 forwarding.

Shouldn't this timeout be configurable? Or at least tens of seconds?
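To see whether xauth really takes longer than the 100 msec poll timeout on a given filesystem, it can help to time it directly. The following is only a rough sketch: the helper name `elapsed_ms` is made up for illustration, and it assumes GNU `date` (for `%N` nanoseconds).

```shell
# Hypothetical helper: print the wall-clock time, in milliseconds,
# that a command takes to run. Assumes GNU date, which supports %N.
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1          # discard the command's own output
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

For example, `elapsed_ms xauth list` run a few times on the login node gives a feel for how the lock latency on $HOME compares with the 100 msec limit. Which xauth invocation Slurm itself runs is not stated in this thread, so `xauth list` is only a stand-in.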
I believe you know where to patch the timeout if required? I'm moving this into an enhancement request. I should probably have added an X11Parameters configuration option to give us a place to change these default values, but that will need to wait until 18.08 at this point.
I found that this error is also raised when there are stale locks on the .Xauthority* files in your home directory. The workaround is to remove the stale locks using the xauth '-b' option, or to remove these files directly.

    [slurm@moll0 ~]$ ls .Xauthority* -lah
    -rw------- 1 slurm slurm 0 7 des 17:08 .Xauthority
    -rw------- 1 slurm slurm 0 7 des 17:10 .Xauthority-c
    -rw------- 1 slurm slurm 0 7 des 17:01 .Xauthority-l

To check whether this is the problem, run 'strace xauth'; it will show multiple EEXIST errors like this one:

    open("/nfs/home/slurm/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EEXIST (File exists)
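The stale-lock check described above can be scripted. This is a minimal sketch, not a tested tool: the function name `stale_xauth_locks` and the 5-minute staleness threshold are assumptions made for illustration, and it relies on the `-maxdepth` and `-mmin` options found in GNU and BSD find.

```shell
# Hypothetical helper: list xauth lock files (.Xauthority-c and
# .Xauthority-l) in a directory that are older than a given age.
# Usage: stale_xauth_locks [dir] [min_age_minutes]
stale_xauth_locks() {
    dir=${1:-$HOME}
    min_age=${2:-5}   # assumption: a lock older than 5 minutes is stale
    find "$dir" -maxdepth 1 -type f \
        \( -name '.Xauthority-c' -o -name '.Xauthority-l' \) \
        -mmin +"$min_age" -print
}
```

If `stale_xauth_locks` prints anything, the locks can be cleared as in the workaround above, either by running xauth with '-b' or by deleting the listed files once you are sure no xauth process is still running.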
I'm seeing this today, although a week ago everything was working fine.

    [griznog@smsx10srw-srcf-d15-37 ~]$ srun --pty --x11 --time=1:00:00 xterm
    srun: error: run_command: xauth poll timeout @ 100 msec
    srun: error: x11_get_xauth: Could not retrieve magic cookie. Cannot use X11 forwarding.

We have $HOME on GPFS so I tried increasing the timeout, but even at 10 seconds I still get the same error. There doesn't seem to be an issue with .Xauthority and I can 'ssh -Y' to a node and X forwarding back works normally. Any other suggestions on how to get this to work again? I'm on 17.11.02.
Hey folks - I'm tagging this as a duplicate of the X11 catch-all bug 3647. As mentioned on there, 18.08 will have an X11Parameters option that gives us a place to add settings to change these timers. Also mentioned on there, I may add support for creating separate XAUTHORITY environment variables/files on the compute nodes, which should reduce contention on various filesystems for locking around ~/.Xauthority.

*** This ticket has been marked as a duplicate of ticket 3647 ***
This would be very helpful.

> Also mentioned on there, I may add support for creating separate XAUTHORITY
> environment variables/files on the compute nodes, which should reduce
> contention on various filesystem for locking around ~/.Xauthority.
>
> *** This bug has been marked as a duplicate of bug 3647 ***