Bug 4393 - Recommended logrotate config no longer correct
Summary: Recommended logrotate config no longer correct
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Documentation
Version: 17.02.9
Hardware: Linux Linux
Importance: --- 4 - Minor Issue
Assignee: Felip Moll
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2017-11-16 13:14 MST by David Gloe
Modified: 2017-11-21 13:31 MST
0 users

See Also:
Site: CRAY
Alineos Sites: ---
Bull/Atos Sites: ---
Confidential Site: ---
Cray Sites: Cray Internal
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA Site: ---
OCF Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed: 17.11
Target Release: ---
DevPrio: ---


Description David Gloe 2017-11-16 13:14:37 MST
The logrotate configuration mentioned in man slurm.conf (https://slurm.schedmd.com/slurm.conf.html) no longer works correctly as of 17.02. It mentions using /etc/init.d/slurm reconfig, but /etc/init.d/slurm is no longer installed.

Please update the documentation with command(s) which will cause slurmctld/slurmd to reload their log files.
Comment 4 Felip Moll 2017-11-21 04:23:50 MST
(In reply to David Gloe from comment #0)

Hi David,

I have updated the documentation accordingly; see commit b7cec940facc2b7631411bf38c8d7b7b271c729d. The change will be available in the 17.11.0 release and later. The documentation on the website will be refreshed when 17.11 is released, in about a week.

Thanks for reporting!
Felip M
Comment 5 David Gloe 2017-11-21 11:57:41 MST
slurmctld isn't reloading the log file when I send it SIGUSR2. Is this a separate bug?

opal-p2:/home/users/dgloe # mv /var/spool/slurm/slurmctld.log /var/spool/slurm/slurmctld.log.rotated
opal-p2:/home/users/dgloe # killall -SIGUSR2 slurmctld
opal-p2:/home/users/dgloe # ls /var/spool/slurm/slurmctld.log
ls: cannot access '/var/spool/slurm/slurmctld.log': No such file or directory
opal-p2:/home/users/dgloe # tail /var/spool/slurm/slurmctld.log.rotated

[2017-11-21T12:55:54.029] error: _bb_get_pools: json parser failed on DataWarp REST API error: /opt/cray/dws/default/bin/dwgateway exited 1: dwgateway: Gateway retrieval failed

[2017-11-21T12:55:54.030] error: _load_state: failed to find DataWarp entries, what now?
[2017-11-21T12:56:06.186] burst_buffer/cray: bb_p_job_try_stage_in
[2017-11-21T12:56:24.168] error: _bb_get_pools: pools status:256 response:DataWarp REST API error: /opt/cray/dws/default/bin/dwgateway exited 1: dwgateway: Gateway retrieval failed

[2017-11-21T12:56:24.168] error: _bb_get_pools: json parser failed on DataWarp REST API error: /opt/cray/dws/default/bin/dwgateway exited 1: dwgateway: Gateway retrieval failed

[2017-11-21T12:56:24.169] error: _load_state: failed to find DataWarp entries, what now?
Comment 6 Felip Moll 2017-11-21 13:31:21 MST
(In reply to David Gloe from comment #5)

Sorry, I forgot to mention that for versions prior to 17.11, the signal to send is SIGHUP rather than SIGUSR2.

Regards,
Felip M
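
For readers applying this to a logrotate setup, the version rule from the last comment (SIGHUP before 17.11, SIGUSR2 from 17.11 on) can be sketched as a small shell helper. This is only an illustration, not the text of the documentation commit: the function names `slurm_rotate_signal` and `print_logrotate_stanza` are hypothetical, the default log glob is assumed from the /var/spool/slurm path used in comment 5, and the stanza options are a minimal guess at a workable configuration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the signal that asks
# slurmctld/slurmd to reopen their log files for a given Slurm version.
# Per this thread: versions before 17.11 need SIGHUP, 17.11 and later SIGUSR2.
slurm_rotate_signal() {
    _major=${1%%.*}      # "17.02.9" -> "17"
    _rest=${1#*.}        # "17.02.9" -> "02.9"
    _minor=${_rest%%.*}  # "02.9"    -> "02"
    _minor=${_minor#0}   # strip one leading zero so "02" compares as 2
    if [ "$_major" -gt 17 ] ||
       { [ "$_major" -eq 17 ] && [ "${_minor:-0}" -ge 11 ]; }; then
        echo SIGUSR2
    else
        echo SIGHUP
    fi
}

# Hypothetical helper: print a minimal logrotate stanza for that version.
# $1 = Slurm version, $2 = log glob (assumed default, adjust to your site).
print_logrotate_stanza() {
    _sig=$(slurm_rotate_signal "$1")
    _glob=${2:-/var/spool/slurm/*.log}
    cat <<EOF
$_glob {
    missingok
    notifempty
    sharedscripts
    postrotate
        pkill -x --signal $_sig slurmctld
        pkill -x --signal $_sig slurmd
        exit 0
    endscript
}
EOF
}
```

Running `print_logrotate_stanza 17.02.9` prints a stanza whose postrotate script sends SIGHUP; passing 17.11.0 switches it to SIGUSR2. The trailing `exit 0` is a design choice so that a node running only one of the two daemons does not make logrotate report a failure when `pkill` finds no matching process.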