Ticket 13280 - getaddrinfo fails even if network-online.target is complete
Summary: getaddrinfo fails even if network-online.target is complete
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 21.08.5
Hardware: Linux Linux
Importance: --- C - Contributions
Assignee: Tim Wickberg
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2022-01-27 14:33 MST by Gennaro Oliva
Modified: 2022-01-28 08:12 MST

See Also:
Site: -Other-


Attachments
patch (1.19 KB, text/plain)
2022-01-27 14:33 MST, Gennaro Oliva

Description Gennaro Oliva 2022-01-27 14:33:21 MST
Created attachment 23175
patch

On Debian systems that use ifupdown for network configuration with allow-hotplug interfaces (the installer default), slurmd and slurmctld fail to start when both are installed [1]. This happens because allow-hotplug does not guarantee that getaddrinfo succeeds once network-online.target has been reached [2].
The attached patch retries getaddrinfo up to five times before giving up and exiting the Slurm daemons.
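
For readers without access to the attachment, the retry idea can be sketched roughly as below. This is not the code from the attached patch: the helper name resolve_with_retry, its parameters, the delay between attempts, and the choice to retry on any non-zero return code are all assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <netdb.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the attached patch).
 * Calls getaddrinfo() up to max_attempts times, sleeping delay_sec
 * seconds between failed attempts. Returns 0 on success, otherwise
 * the error code of the last getaddrinfo() call.
 */
static int resolve_with_retry(const char *host, const char *service,
                              const struct addrinfo *hints,
                              int max_attempts, unsigned int delay_sec,
                              struct addrinfo **result)
{
    int rc = EAI_FAIL;

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        *result = NULL;
        rc = getaddrinfo(host, service, hints, result);
        if (rc == 0)
            return 0; /* caller releases *result with freeaddrinfo() */

        fprintf(stderr, "getaddrinfo(%s) attempt %d/%d failed: %s\n",
                host ? host : "(null)", attempt, max_attempts,
                gai_strerror(rc));

        /* Assumption: every failure is retried, not only EAI_AGAIN. */
        if (attempt < max_attempts)
            sleep(delay_sec);
    }
    return rc;
}
```

A daemon could call resolve_with_retry(hostname, port, &hints, 5, 1, &res) during startup and exit only if the return value is still non-zero after the fifth attempt. The one-second delay in that call is an assumed value.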

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=%23984928
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=868650