Ticket 5783 - Config check for daemon
Summary: Config check for daemon
Status: RESOLVED DUPLICATE of ticket 3445
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmctld
Version: 17.11.7
Hardware: Linux Linux
: --- 4 - Minor Issue
Assignee: Tim Wickberg
Reported: 2018-09-27 09:19 MDT by Justin Lecher
Modified: 2018-10-10 09:34 MDT
Site: AstraZeneca


Description Justin Lecher 2018-09-27 09:19:27 MDT
Hi team

We just had an incident caused by a config issue. We were rolling out an updated Slurm config to all our infrastructure and reloaded the daemons. Due to a syntax error, we lost the config on all nodes and Slurm was unusable. While the normal compute continued to run, all VDI sessions died, which was a major impact.

Would it be possible to add config check functionality, like apachectl configtest or sshd -t, to slurm*d? This could be run for validation before actually reloading the daemons.

Thanks
Justin
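
Until such a check exists in the daemons themselves, the pre-reload validation requested above could be roughly approximated outside Slurm. The following is a minimal sketch, not a Slurm feature: `lint_slurm_conf` is a hypothetical helper, and it assumes only the general shape of slurm.conf (whitespace-separated Key=Value tokens, `#` comments, trailing-backslash continuations, `Include` lines). It does not know which keys or values are valid, so it can catch only gross syntax slips.

```python
import shlex


def lint_slurm_conf(text):
    """Return (line_number, message) pairs for tokens that are not Key=Value.

    Hypothetical helper: a coarse syntax lint only. It does not validate
    key names or values the way the Slurm daemons and plugins would.
    """
    problems = []
    pending = ""
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Simplification: treat everything after '#' as a comment.
        line = raw.split("#", 1)[0].rstrip()
        # Join lines continued with a trailing backslash.
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        line = (pending + line).strip()
        pending = ""
        if not line:
            continue
        # Assumption: Include directives take a path, not Key=Value.
        if line.lower().startswith("include "):
            continue
        try:
            tokens = shlex.split(line)
        except ValueError as exc:  # e.g. unbalanced quotes
            problems.append((lineno, f"unparseable line: {exc}"))
            continue
        for token in tokens:
            key, sep, _value = token.partition("=")
            if not sep or not key:
                problems.append((lineno, f"expected Key=Value, got {token!r}"))
    return problems


if __name__ == "__main__":
    sample = (
        "ClusterName=test\n"
        "NodeName=n[1-4] CPUs=8 State=UNKNOWN\n"
        "SchedulerType sched/backfill\n"  # missing '=' on purpose
    )
    for lineno, message in lint_slurm_conf(sample):
        print(f"line {lineno}: {message}")
```

A rollout script could refuse to distribute the file or run `scontrol reconfigure` whenever this returns a non-empty list.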
Comment 1 Justin Lecher 2018-10-10 01:56:29 MDT
Hi team

Any news on this?

Thanks
Justin
Comment 2 Tim Wickberg 2018-10-10 09:34:42 MDT
Unfortunately, because some of our configuration validation is delegated to the plugins, this isn't as simple as it would seem.

We've discussed this, and have had an internal enhancement for this for a while. I'm closing this as a duplicate of the (now-public) bug 3445 which is tracking this issue, but I cannot commit to when or if we'll have this completed at this time.

- Tim

*** This ticket has been marked as a duplicate of ticket 3445 ***