Bug 1050 - SlurmDBD cluster name setting
Summary: SlurmDBD cluster name setting
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmdbd
Version: 2.6.2
Hardware: Linux
Severity: 3 - Medium Impact
Assignee: David Bigagli
Reported: 2014-08-18 22:26 MDT by toru matsuoka
Modified: 2014-08-20 06:03 MDT
Site: CRAY
Machine Name: CRAY CS300
Description toru matsuoka 2014-08-18 22:26:42 MDT
I am building a Slurm environment and am stuck on the messages shown below.

slurmdbd has started, but the cluster name does not appear to be set up correctly.

How should it be configured?

Please advise.

slurmdbd.log:

[2014-08-19T19:05:42.910] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2014-08-19T19:05:42.959] Accounting storage MYSQL plugin loaded
[2014-08-19T19:05:42.959] pidfile not locked, assuming no running daemon
[2014-08-19T19:05:42.962] slurmdbd version 2.6.2 started
[2014-08-19T19:10:07.597] DBD_STEP_START: cluster not registered
[2014-08-19T19:10:07.681] error: Processing last message from connection 7(10.10.0.1) uid(0)
[2014-08-19T19:10:08.515] DBD_JOB_START: cluster not registered
[2014-08-19T19:10:10.869] DBD_STEP_START: cluster not registered
[2014-08-19T19:10:13.557] DBD_JOB_START: cluster not registered
[2014-08-19T19:10:13.558] DBD_JOB_START: cluster not registered
[2014-08-19T19:10:18.598] DBD_JOB_START: cluster not registered
[2014-08-19T19:10:21.509] error: Problem getting jobs for cluster cluster
[2014-08-19T19:10:21.856] DBD_CLUSTER_CPUS: cluster not registered
[2014-08-19T19:10:21.857] error: Processing last message from connection 7(10.10.0.1) uid(0)
[2014-08-19T19:11:07.625] DBD_STEP_COMPLETE: cluster not registered
[2014-08-19T19:11:07.666] DBD_JOB_START: cluster not registered
[2014-08-19T19:11:07.667] DBD_JOB_COMPLETE: cluster not registered
[2014-08-19T19:11:07.668] DBD_JOB_START: cluster not registered
[2014-08-19T19:11:07.668] DBD_STEP_START: cluster not registered
[2014-08-19T19:11:10.919] DBD_STEP_COMPLETE: cluster not registered
[2014-08-19T19:11:10.959] DBD_JOB_START: cluster not registered
[2014-08-19T19:11:10.960] DBD_JOB_COMPLETE: cluster not registered
[2014-08-19T19:11:10.961] DBD_JOB_START: cluster not registered
[2014-08-19T19:11:10.961] DBD_STEP_START: cluster not registered
[2014-08-19T19:12:07.680] DBD_STEP_COMPLETE: cluster not registered
[2014-08-19T19:12:07.720] DBD_JOB_START: cluster not registered
[2014-08-19T19:12:07.721] DBD_JOB_COMPLETE: cluster not registered
[2014-08-19T19:12:10.972] DBD_STEP_COMPLETE: cluster not registered
[2014-08-19T19:12:11.012] DBD_JOB_START: cluster not registered
[2014-08-19T19:12:11.013] DBD_JOB_COMPLETE: cluster not registered
[2014-08-19T19:12:51.928] Terminate signal (SIGINT or SIGTERM) received
[2014-08-19T19:12:52.529] Unable to remove pidfile '/var/run/slurmdbd.pid': Permission denied
[2014-08-19T19:13:02.114] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2014-08-19T19:13:02.164] Accounting storage MYSQL plugin loaded
[2014-08-19T19:13:02.164] pidfile not locked, assuming no running daemon
[2014-08-19T19:13:02.168] slurmdbd version 2.6.2 started
[2014-08-19T19:13:13.663] Terminate signal (SIGINT or SIGTERM) received
[2014-08-19T19:13:13.663] Unable to remove pidfile '/var/run/slurmdbd.pid': Permission denied
[2014-08-19T19:13:23.847] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2014-08-19T19:13:23.896] Accounting storage MYSQL plugin loaded
[2014-08-19T19:13:23.896] pidfile not locked, assuming no running daemon
[2014-08-19T19:13:23.899] slurmdbd version 2.6.2 started
[2014-08-19T19:14:12.077] DBD_STEP_START: cluster not registered
[2014-08-19T19:14:12.158] error: Processing last message from connection 7(10.10.0.1) uid(0)
[2014-08-19T19:14:13.646] DBD_JOB_START: cluster not registered
[2014-08-19T19:14:14.055] DBD_STEP_START: cluster not registered
[2014-08-19T19:14:18.688] DBD_JOB_START: cluster not registered
[2014-08-19T19:14:18.689] DBD_JOB_START: cluster not registered
[2014-08-19T19:14:18.690] DBD_JOB_START: cluster not registered
[2014-08-19T19:14:18.690] DBD_JOB_START: cluster not registered
[2014-08-19T19:14:23.731] DBD_JOB_START: cluster not registered
Comment 1 Moe Jette 2014-08-19 03:48:50 MDT
The most common configuration is one slurmdbd and one database per site. All computers (clusters) write their accounting records into it, and users, accounts, limits, etc. are defined there. Each cluster must be defined both in that one database and in the slurm.conf file on the cluster itself.

Read:
http://slurm.schedmd.com/accounting.html
Especially the example:
"sacctmgr add cluster snowflake"

Also see the slurm.conf man page here:
http://slurm.schedmd.com/slurm.conf.html
Make sure the "ClusterName" parameter is set.
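
The two steps above (set "ClusterName" in slurm.conf, then register that same name in the database) can be sketched as follows. This is an illustration only, not something posted in the thread: the slurm.conf excerpt, the host name head01, and the helper function names are hypothetical, and the cluster name snowflake is borrowed from the documentation example. The script only prints the sacctmgr command; it does not run it.

```python
import re

def cluster_name_from_conf(conf_text):
    """Return the ClusterName value found in slurm.conf text, or None if unset."""
    for line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # slurm.conf keywords are matched case-insensitively here.
        match = re.match(r"(?i)ClusterName\s*=\s*(\S+)", line)
        if match:
            return match.group(1)
    return None

def registration_command(name):
    """Build the sacctmgr command that registers the cluster in the database.

    The -i flag commits the change without an interactive prompt.
    """
    return ["sacctmgr", "-i", "add", "cluster", name]

# Hypothetical slurm.conf excerpt, for illustration only.
SAMPLE_CONF = """
ControlMachine=head01
ClusterName=snowflake
AccountingStorageType=accounting_storage/slurmdbd
"""

name = cluster_name_from_conf(SAMPLE_CONF)
if name is None:
    print("ClusterName is not set in slurm.conf")
else:
    # The printed command is meant to be run by a Slurm administrator.
    print(" ".join(registration_command(name)))
```

After the command has been run, "sacctmgr show cluster" should list the new cluster. The name registered in the database has to match the "ClusterName" value in slurm.conf, otherwise slurmdbd will presumably keep logging "cluster not registered".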
Comment 2 Danny Auble 2014-08-20 06:03:07 MDT
If the Slurm documentation doesn't explain how to fix this issue, please reopen.