Hi, it seems that when slurmd runs in the foreground (-D command-line option), much of the initial diagnostic output from slurmstepd is lost. The expected output starts to appear later on, so I suspect that the I/O redirection for stdout/stderr isn't set up early enough, and that anything written before then disappears into a black hole. This happens both for the built-in diagnostic output and for "printf debugging" I have added myself. syslog() calls I have added do appear in the system log, so it's not that the functions I'm investigating aren't being called at all. FWIW, I have noticed this issue with src/common/cpu_frequency.c:{cpu_freq_cgroup_validate,cpu_freq_set}, although there are surely others as well. By the time the code gets to calling cpu_freq_reset(), the I/O redirection is in place and the output appears.
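For anyone trying to reproduce this, the syslog() approach mentioned above can be wrapped in a small helper so that early trace lines do not depend on where stdout/stderr currently point. This is only a minimal sketch: trace_early() is a hypothetical name, not a Slurm function, and the LOG_DEBUG priority is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/*
 * Hypothetical helper (not part of Slurm): format "<func>: <message>" into
 * buf and send it to syslog, so the line survives even if stdout/stderr
 * have not been redirected yet. Returns the length of the formatted line,
 * or -1 if it does not fit into buf.
 */
static int trace_early(char *buf, size_t len, const char *func,
                       const char *fmt, ...)
{
    va_list ap;
    int prefix, body;

    assert(buf != NULL && len > 0);

    /* Prefix the message with the calling function's name. */
    prefix = snprintf(buf, len, "%s: ", func);
    if (prefix < 0 || (size_t)prefix >= len)
        return -1;

    va_start(ap, fmt);
    body = vsnprintf(buf + prefix, len - (size_t)prefix, fmt, ap);
    va_end(ap);
    if (body < 0 || (size_t)body >= len - (size_t)prefix)
        return -1;

    /* LOG_DEBUG is an assumption; pick whatever priority your syslog keeps. */
    syslog(LOG_DEBUG, "%s", buf);
    return (int)strlen(buf);
}
```

A call such as trace_early(buf, sizeof buf, __func__, "cpu=%d", cpu) could then be dropped into the function under investigation in place of a printf().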
Hi, we just got hit by this one on our TARS cluster running 16.05.5. All Slurm daemons are managed by supervisord (http://supervisord.org/), which wants all tools to be launched in foreground mode: "[...] Programs meant to be run under supervisor should not daemonize themselves. Instead, they should run in the foreground. They should not detach from the terminal from which they are started. [...]" The only way to get this behavior from the Slurm daemons is to start them in debug mode (-D). Unfortunately this alters logging too, sending messages to stdout/stderr instead of syslog. It would be great to have an option (-F) to launch Slurm daemons in foreground mode without altering the logging. Having slurmstepd logs when daemons run in debug mode would be good too. Thanks.
Hey Nick, yeah, for some reason if you don't set slurmdlogfile, the output doesn't get to syslog in Daemon mode (-D). I am guessing that is by design, as most people running in daemon mode are debugging and perhaps don't want to see the debug output in syslog. I have verified that if you set slurmdlogfile=/var/log/syslog it does show up there, though. Perhaps we could have an option that makes this happen by default when it is not set (hence changing this to a sev 5). In any case, I hope the workaround is sufficient for now. Getting the slurmstepd log out of slurmd's stdout is a different story altogether; it is a much harder issue, since we would have to grab the output of a separate process and add it to slurmd's.
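To illustrate what "grabbing the output of a separate process" involves in general, here is a minimal sketch using pipe(), fork() and dup2(). It is not how slurmd launches slurmstepd (which is exec'd as a separate program and is considerably more involved); capture_child_output() and sample_step() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a child that emits an early diagnostic line. */
static void sample_step(void)
{
    const char *msg = "step: starting\n";

    if (write(STDERR_FILENO, msg, strlen(msg)) < 0)
        _exit(1);
}

/*
 * Hypothetical helper: run child_fn in a forked child whose stdout and
 * stderr are pointed at a pipe before it does any work, and copy what it
 * writes into out (NUL-terminated). Returns the number of bytes captured,
 * or -1 on error.
 */
static ssize_t capture_child_output(void (*child_fn)(void),
                                    char *out, size_t len)
{
    int fds[2];
    pid_t pid;
    size_t total = 0;
    ssize_t n;

    assert(out != NULL);
    if (len == 0 || pipe(fds) < 0)
        return -1;

    /* Flush first so the child does not inherit pending stdio data. */
    fflush(NULL);

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: redirect both streams before running any other code. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[1], STDERR_FILENO);
        close(fds[1]);
        child_fn();
        fflush(NULL);
        _exit(0);
    }

    /* Parent: read until the child closes its end or the buffer is full. */
    close(fds[1]);
    while (total < len - 1 &&
           (n = read(fds[0], out + total, len - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The point of the sketch is the ordering in the child: the dup2() calls happen before child_fn() runs, so nothing it prints early can be lost. The parent would then have to merge the captured text into its own log stream, which is the hard part referred to above.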
(In reply to Danny Auble from comment #2)
> Hey Nick, yeah, for some reason if you don't set the slurmdlogfile it
> doesn't get to syslog in Daemon mode. I am guessing that is by design as
> most people running in daemon mode are doing debugging and perhaps don't
> want to see the debug in syslog.
>
> I have verified if you set slurmdlogfile=/var/log/syslog it does show up
> there though. Perhaps we could have an option that makes this happen by
> default when it is not set (hence changing this to a sev 5).
>
> In any case I hope the work around is sufficient for now.
We are aggregating all slurmd logs from our 180 compute nodes in a central location with syslog. We are now facing some slurmstepd crashes (PR to be filed) and wanted to have the corresponding logs for the report. But this problem does not occur frequently, and logging locally would be a nuisance. We will see what we can do there.
> Having the slurmstepd log out of the stdout of the slurmd is a different
> story all together as it is a much harder issue since we have to grab the
> output of a separate process and add it to the slurmd's.
Let's forget this approach, then. Thanks.