Bug 1943 - slurmdbd does not create the required tables; slurmctld fails to start
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmdbd
Version: 15.08.0
Hardware: Linux Linux
Importance: --- 6 - No support contract
Assignee: Jacob Jenson
Reported: 2015-09-14 23:49 MDT by tribelkhindi
Modified: 2015-10-23 01:44 MDT
Site: -Other-
Version Fixed: 15.08

Description tribelkhindi 2015-09-14 23:49:29 MDT
Setup used:
Slurm version 15.08.0
RHEL 6.5
mysql  Ver 14.14 Distrib 5.1.71, for redhat-linux-gnu (x86_64) using readline 5.1

Problem description:
The slurmctld daemon fails to start when accounting is configured to use a MySQL database.

The problem occurs on a fresh installation of Slurm v15.08.0.
When an existing Slurm v14 setup is upgraded to v15, the database tables are updated correctly.



1. slurmdbd starts without any issues:

[2015-09-15T14:35:27.739] debug3: Trying to load plugin /apps/slurm/slurm_v15.08.0/lib/slurm/accounting_storage_mysql.so
[2015-09-15T14:35:27.743] debug2: mysql_connect() called for db slurm_acct_db
[2015-09-15T14:35:27.801] debug4: Table cluster_table doesn't exist, adding
[2015-09-15T14:35:27.817] debug4: Table txn_table doesn't exist, adding
[2015-09-15T14:35:27.844] debug4: Table tres_table doesn't exist, adding
[2015-09-15T14:35:27.862] debug4: Table acct_coord_table doesn't exist, adding
[2015-09-15T14:35:27.887] debug4: Table acct_table doesn't exist, adding
[2015-09-15T14:35:27.904] debug4: Table res_table doesn't exist, adding
[2015-09-15T14:35:27.920] debug4: Table clus_res_table doesn't exist, adding
[2015-09-15T14:35:27.941] debug4: Table qos_table doesn't exist, adding
[2015-09-15T14:35:27.994] debug4: Table user_table doesn't exist, adding
[2015-09-15T14:35:28.020] Accounting storage MYSQL plugin loaded
[2015-09-15T14:35:28.021] debug3: Success.

[root@masternode-2 ~]# ps -ef | grep slurmdbd
root      9824  1552  0 14:35 pts/1    00:00:00 ./slurmdbd -Dvvvv

2. sacctmgr fails with the following errors:

[root@masternode-2 ~]# /apps/slurm/slurm_v15.08.0/bin/sacctmgr show config
sacctmgr: error: issue converting tables
sacctmgr: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
sacctmgr: error: cannot create accounting_storage context for accounting_storage/mysql
Configuration data as of 2015-09-15T14:35:53
AccountingStorageBackupHost  = (null)
AccountingStorageHost  = masternode-2
AccountingStorageLoc   = slurm_acct_db
AccountingStoragePass  = password
AccountingStoragePort  = 3306
AccountingStorageType  = accounting_storage/mysql
AccountingStorageUser  = root
AuthType               = auth/munge
MessageTimeout         = 10 sec
PluginDir              = /apps/slurm/slurm_v15.08.0/lib/slurm
PrivateData            = none
SlurmUserId            = root(0)
SLURM_CONF             = /apps/slurm/slurm_v15.08.0/etc/slurm.conf
SLURM_VERSION          = 15.08.0
TrackWCKey             = 0
sacctmgr: error: Problem talking to the database: Interrupted system call

3. slurmctld fails to start:

slurmctld: debug3: Trying to load plugin /apps/slurm/slurm_v15.08.0/lib/slurm/accounting_storage_mysql.so
slurmctld: debug2: mysql_connect() called for db slurm_acct_db
slurmctld: debug4: (as_mysql_convert.c:771) query
show columns from "linux_assoc_table" where Field='grp_cpus';
slurmctld: debug4: This could happen often and is expected.
mysql_query failed: 1146 Table 'slurm_acct_db.linux_assoc_table' doesn't exist
show columns from "linux_assoc_table" where Field='grp_cpus';
slurmctld: error: issue converting tables
slurmctld: Accounting storage MYSQL plugin failed
slurmctld: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
slurmctld: error: cannot create accounting_storage context for accounting_storage/mysql
slurmctld: debug:  Association database appears down, reading from state file.
slurmctld: debug2: No association state file (/var/spool/assoc_mgr_state) to recover
slurmctld: fatal: You are running with a database but for some reason we have no TRES from it.  This should only happen if the database is down and you don't have any state files.

4. The tables that were added to the database:

mysql> show tables;
+-------------------------+
| Tables_in_slurm_acct_db |
+-------------------------+
| acct_coord_table        |
| acct_table              |
| clus_res_table          |
| cluster_table           |
| qos_table               |
| res_table               |
| table_defs_table        |
| tres_table              |
| txn_table               |
| user_table              |
+-------------------------+
10 rows in set (0.00 sec)
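
For anyone triaging the same symptom, here is a minimal Python sketch that cross-checks the two pieces of evidence above: the table named in the MySQL 1146 error from the slurmctld log, and the output of `show tables;`. It is not part of Slurm. The function name `find_missing_tables` and the regular expression are illustrative assumptions, and the log excerpt and table names are copied from this report.

```python
import re

# Tables reported by `show tables;` in this report.
EXISTING_TABLES = {
    "acct_coord_table", "acct_table", "clus_res_table", "cluster_table",
    "qos_table", "res_table", "table_defs_table", "tres_table",
    "txn_table", "user_table",
}

# Assumed pattern: matches the MySQL error 1146 ("table doesn't exist")
# line in the form it takes in the slurmctld output quoted above.
MISSING_TABLE_RE = re.compile(
    r"mysql_query failed: 1146 Table '(?P<db>[^.']+)\.(?P<table>[^']+)' doesn't exist"
)


def find_missing_tables(log_text, existing_tables):
    """Return, sorted, the tables named in 1146 errors that are not in existing_tables."""
    named = {m.group("table") for m in MISSING_TABLE_RE.finditer(log_text)}
    return sorted(named - set(existing_tables))


# Excerpt of the slurmctld output from step 3.
LOG_EXCERPT = """\
slurmctld: debug4: This could happen often and is expected.
mysql_query failed: 1146 Table 'slurm_acct_db.linux_assoc_table' doesn't exist
show columns from "linux_assoc_table" where Field='grp_cpus';
slurmctld: error: issue converting tables
"""

if __name__ == "__main__":
    print(find_missing_tables(LOG_EXCERPT, EXISTING_TABLES))
```

Run against the excerpt, this reports `linux_assoc_table`: the table the conversion step queries is absent, and none of the ten tables listed above carries the `linux_` prefix.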
Comment 1 Jacob Jenson 2015-10-21 09:03:52 MDT
Please close these tickets.

Thanks,
Triveni
Comment 2 JM 2015-10-23 01:44:07 MDT
It seems that the fix wasn't included in the most recent release. I am facing the exact same issue.