Summary: | salloc --x11 followed by executing, say, xeyes, fails if job is submitted from the control node | ||
---|---|---|---|
Product: | Slurm | Reporter: | selva.nair |
Component: | slurmstepd | Assignee: | Jacob Jenson <jacob> |
Status: | RESOLVED INVALID | QA Contact: | |
Severity: | 6 - No support contract | ||
Priority: | --- | ||
Version: | 21.08.5 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | -Other- | Alineos Sites: | --- |
Description
selva.nair
2022-05-21 23:20:54 MDT
Solved: all nodes had a line in /etc/hosts of the form 127.0.1.1 <nodename>, which Ubuntu adds when the hostname is set. Removing this line on the controller fixes the issue. My guess is that the entry makes the controller report 127.0.1.1 to slurmstepd as the IP address of the submitting host. I am not sure why only X11 forwarding is affected.
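For anyone wanting to check their own nodes for the same kind of entry, here is a minimal sketch that scans /etc/hosts-style text for loopback addresses mapped to a given hostname. The function name `loopback_aliases` and the sample hostname are hypothetical, chosen only for illustration; this is not part of Slurm, and it only detects the entry described above rather than confirming the root cause.

```python
import ipaddress


def loopback_aliases(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps to hostname.

    hosts_text is the content of an /etc/hosts-style file.
    hostname is the node name to look for (hypothetical helper, not a Slurm API).
    """
    hits = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            # Skip lines whose first field is not a valid IP address.
            continue
        # 127.0.0.0/8 and ::1 both count as loopback.
        if ip.is_loopback and hostname in names:
            hits.append(addr)
    return hits
```

To run it against a real node, one could pass the contents of /etc/hosts and the value of `socket.gethostname()`. A non-empty result on the controller would indicate an entry like the one removed above.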