Ticket 12665 - scontrol write config - output not valid slurm.conf syntax
Summary: scontrol write config - output not valid slurm.conf syntax
Status: RESOLVED INVALID
Alias: None
Product: Slurm
Classification: Unclassified
Component: Configuration
Version: 20.11.8
Hardware: Linux Linux
: --- 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2021-10-13 14:00 MDT by Michael Hammond
Modified: 2021-10-13 14:01 MDT

See Also:
Site: -Other-
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Description Michael Hammond 2021-10-13 14:00:38 MDT
Overview:
scontrol write config does not always produce output that can be used as a valid configuration file. The settings SlurmctldHost, CpuFreqDef, SlurmctldSyslogDebug, and SlurmdSyslogDebug are affected; there may be others.

Case 1:
For SlurmctldHost, syntax in slurm.conf is:
SlurmctldHost={hostname1}{optional ({ip addr1}) }
SlurmctldHost={hostname2}{optional ({ip addr2}) }

Multiples are allowed and order is significant.

scontrol write config produces:
SlurmctldHost[0]={hostname1}({ip addr1})
SlurmctldHost[1]={hostname2}({ip addr2})
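
As a possible workaround until the output format changes, the index could be stripped before the file is reused. The following is a minimal sketch, not part of Slurm: the function name strip_host_index is hypothetical, and it assumes the only difference from valid syntax is the "[N]" suffix on the key. Because it works line by line, the order of the entries is preserved.

```python
import re

# Matches the indexed key form reported in this ticket,
# e.g. "SlurmctldHost[0]=primary.cluster(192.168.7.1)".
_INDEXED_HOST = re.compile(r"^(\s*SlurmctldHost)\[\d+\](\s*=)", re.IGNORECASE)


def strip_host_index(line: str) -> str:
    """Return the line with any "[N]" index removed from a SlurmctldHost key.

    Lines that do not start with an indexed SlurmctldHost key are returned
    unchanged.
    """
    return _INDEXED_HOST.sub(r"\1\2", line)
```

Applied to each line of the written file, this turns SlurmctldHost[0]=primary.cluster(192.168.7.1) back into SlurmctldHost=primary.cluster(192.168.7.1).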

Case 2:
If CpuFreqDef, SlurmctldSyslogDebug, or SlurmdSyslogDebug are undefined, scontrol write config prints them as:

CpuFreqDef=Unknown
SlurmctldSyslogDebug=unknown
SlurmdSyslogDebug=unknown

None of these values is valid slurm.conf syntax, so slurmd does not accept them.
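
One way to work around this is to drop the placeholder lines so the settings stay undefined, as they were in the original file. This is a minimal sketch under that assumption: the names PLACEHOLDER_KEYS and drop_unknown_placeholders are hypothetical, and the key list covers only the three settings named in this ticket.

```python
# Settings reported in this ticket as written with a placeholder when unset.
PLACEHOLDER_KEYS = {"cpufreqdef", "slurmctldsyslogdebug", "slurmdsyslogdebug"}


def drop_unknown_placeholders(lines):
    """Return the lines without "Key=Unknown" entries for the keys above.

    The comparison is case-insensitive, since the ticket shows both
    "Unknown" and "unknown". All other lines are kept in order.
    """
    kept = []
    for line in lines:
        key, sep, value = line.partition("=")
        is_placeholder = (
            bool(sep)
            and key.strip().lower() in PLACEHOLDER_KEYS
            and value.strip().lower() == "unknown"
        )
        if is_placeholder:
            continue  # unset in the original config: leave it undefined
        kept.append(line)
    return kept
```

A line such as SlurmdSyslogDebug=info is kept, because only the literal placeholder value is filtered.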

Steps to reproduce Case 1:
1. Create a slurm.conf with multiple SlurmctldHost entries (snippet below):
SlurmctldHost=primary.cluster(192.168.7.1)
SlurmctldHost=secondary.cluster(192.168.7.2)

2. Start slurmctld with this config.

3. Run "scontrol write config".
Output will contain:
SlurmctldHost[0]=primary.cluster(192.168.7.1)
SlurmctldHost[1]=secondary.cluster(192.168.7.2)

4. Copy the output from step 3 to slurm.conf.

5. Run slurmd or slurmctld with the new slurm.conf.

Alternate reproduction steps:
1-3. Same as steps 1-3 above.
4. Run scontrol show config against the written file: SLURM_CONF=slurm.conf-{DATE} scontrol show config
root@primary:/etc/slurm# SLURM_CONF=slurm.conf-update-20211012 scontrol show config
scontrol: error: Parse error in file slurm.conf-update-20211012 line 219: "SlurmctldHost[0]=primary.cluster(192.168.7.1)"
scontrol: error: Parse error in file slurm.conf-update-20211012 line 221: "SlurmctldHost[1]=secondary.cluster(192.168.7.2))"
scontrol: error: No SlurmctldHost defined.
scontrol: fatal: Unable to process configuration file

Case 2 reproduction steps:

1. Remove any definitions of CpuFreqDef, SlurmctldSyslogDebug, and SlurmdSyslogDebug from slurm.conf.

2. Rerun SLURM_CONF=slurm.conf-update-20211012 scontrol show config
scontrol: error: cpu_freq_verify_def: CpuFreqDef=Unknown invalid
scontrol: error: Ignoring invalid CpuFreqDef: Unknown
scontrol: error: Invalid SlurmctldSyslogDebug unknown
scontrol: fatal: Unable to process configuration file