Ticket 10295

Summary: Errors upgrading Slurm database from MariaDB 5.5 (CentOS 7) to 10.3 (CentOS 8)
Product: Slurm Reporter: Ole.H.Nielsen <Ole.H.Nielsen>
Component: DatabaseAssignee: Nate Rini <nate>
Status: RESOLVED INFOGIVEN QA Contact:
Severity: 4 - Minor Issue    
Priority: ---    
Version: 20.02.6   
Hardware: Linux   
OS: Linux   
Site: DTU Physics Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: CentOS
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: Output of mysqldump --all-databases --no-data

Description Ole.H.Nielsen@fysik.dtu.dk 2020-11-26 03:11:30 MST
I am testing migration of our Slurm database from a CentOS 7 server with MariaDB 5.5 to a CentOS 8 server with MariaDB 10.3 from the distro.  I think SchedMD recommends to use the newer MariaDB or MySQL version if possible, and the CentOS 8 distro has this one: 

$ rpm -q mariadb-server
mariadb-server-10.3.17-1.module_el8.1.0+257+48736ea6.x86_64

On the CentOS 8 test server I have loaded the database dump from my CentOS7/MariaDB 5.5 server into MariaDB 10.3 without problems:

$ zcat mysql_backup_file.gz | mysql -u root -p

I have built and installed Slurm 20.02.6 RPMs on the CentOS 8 server.

However, when I start slurmdbd I get some table corruption error messages:

$ slurmdbd -D -vvvv
slurmdbd: debug:  Log file re-opened
slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/auth_munge.so
slurmdbd: debug:  Munge authentication plugin loaded
slurmdbd: debug3: Success.
slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/accounting_storage_mysql.so
slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: MySQL server version is: 10.3.17-MariaDB
slurmdbd: debug2: innodb_buffer_pool_size: 1073741824
slurmdbd: debug2: innodb_log_file_size: 67108864
slurmdbd: debug2: innodb_lock_wait_timeout: 900
slurmdbd: error: mysql_query failed: 1805 Column count of mysql.proc is wrong. Expected 21, found 20. The table is probably corrupted
drop procedure if exists get_parent_limits; create procedure get_parent_limits(my_table text, acct text, cluster text, without_limits int) begin set @par_id = NULL; set @mj = NULL; set @mja = NULL; set @mpt = NULL; set @msj = NULL; set @mwpj = NULL; set @mtpj = ''; set @mtpn = ''; set @mtmpj = ''; set @mtrm = ''; set @prio = NULL; set @def_qos_id = NULL; set @qos = ''; set @delta_qos = ''; set @my_acct = acct; if without_limits then set @mj = 0; set @msj = 0; set @mwpj = 0; set @prio = 0; set @def_qos_id = 0; set @qos = 1; end if; REPEAT set @s = 'select '; if @par_id is NULL then set @s = CONCAT(@s, '@par_id := id_assoc, '); end if; if @mj is NULL then set @s = CONCAT(@s, '@mj := max_jobs, '); end if; if @mja is NULL then set @s = CONCAT(@s, '@mja := max_jobs_accrue, '); end if; if @mpt is NULL then set @s = CONCAT(@s, '@mpt := min_prio_thresh, '); end if; if @msj is NULL then set @s = CONCAT(@s, '@msj := max_submit_jobs, '); end if; if @mwpj is NULL then set @s = CONCAT(@s, '@mwpj := max_wall_pj, '); end if; if @prio is NULL then set @s = CONCAT(@s, '@prio := priority, '); end if; if @def_qos_id is NULL then set @s = CONCAT(@s, '@def_qos_id := def_qos_id, '); end if; if @qos = '' then set @s = CONCAT(@s, '@qos := qos, @delta_qos := REPLACE(CONCAT(delta_qos, @delta_qos), \',,\', \',\'), '); end if; set @s = concat(@s, '@mtpj := CONCAT(@mtpj, if (@mtpj != \'\' && max_tres_pj != \'\', \',\', \'\'), max_tres_pj), @mtpn := CONCAT(@mtpn, if (@mtpn != \'\' && max_tres_pn != \'\', \',\', \'\'), max_tres_pn), @mtmpj := CONCAT(@mtmpj, if (@mtmpj != \'\' && max_tres_mins_pj != \'\', \',\', \'\'), max_tres_mins_pj), @mtrm := CONCAT(@mtrm, if (@mtrm != \'\' && max_tres_run_mins != \'\', \',\', \'\'), max_tres_run_mins), @my_acct_new := parent_acct from "', cluster, '_', my_table, '" where acct = \'', @my_acct, '\' && user=\'\''); prepare query from @s; execute query; deallocate prepare query; set @my_acct = @my_acct_new; UNTIL without_limits || @my_acct = '' END REPEAT; END;
slurmdbd: error: mysql_query failed: 1805 Column count of mysql.proc is wrong. Expected 21, found 20. The table is probably corrupted
drop procedure if exists get_coord_qos; create procedure get_coord_qos(my_table text, acct text, cluster text, coord text) begin set @qos = ''; set @delta_qos = ''; set @found_coord = NULL; set @my_acct = acct; REPEAT set @s = 'select @qos := t1.qos, @delta_qos := REPLACE(CONCAT(t1.delta_qos, @delta_qos), \',,\', \',\'), @my_acct_new := parent_acct, @found_coord_curr := t2.user '; set @s = concat(@s, 'from "', cluster, '_', my_table, '" as t1 left outer join acct_coord_table as t2 on t1.acct=t2.acct where t1.acct = @my_acct && t1.user=\'\' && (t2.user=\'', coord, '\' || t2.user is null)'); prepare query from @s; execute query; deallocate prepare query; if @found_coord_curr is not NULL then set @found_coord = @found_coord_curr; end if; if @found_coord is NULL then set @qos = ''; set @delta_qos = ''; end if; set @my_acct = @my_acct_new; UNTIL @qos != '' || @my_acct = '' END REPEAT; select REPLACE(CONCAT(@qos, @delta_qos), ',,', ','); END;
slurmdbd: Accounting storage MYSQL plugin failed
slurmdbd: error: Couldn't load specified plugin name for accounting_storage/mysql: Plugin init() callback failed
slurmdbd: error: cannot create accounting_storage context for accounting_storage/mysql
slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting storage plugin 

Before starting slurmdbd I had already tried the mysql_upgrade recommended in https://mariadb.com/kb/en/mysql_upgrade/ but with the result that all should be OK:

$ mysql_upgrade -p
Enter password:
This installation of MySQL is already upgraded to 10.3.17-MariaDB, use --force if you still need to run mysql_upgrade 

Question: What is the recommended procedure for migrating a Slurm database from MariaDB 5.5 to 10.3?  Probably many sites will be looking at migrating from CentOS 7 to CentOS 8 over the next few years, so it's a timely question.

Does SchedMD recommend to run "mysql_upgrade --force" as a solution?

Thanks,
Ole
Comment 1 Nate Rini 2020-11-30 09:27:17 MST
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #0)
> Question: What is the recommended procedure for migrating a Slurm database
> from MariaDB 5.5 to 10.3?  Probably many sites will be looking at migrating
> from CentOS 7 to CentOS 8 over the next few years, so it's a timely question.

Looks like the tables changed in this version. Please call the following:
> mariadb-dump --all-databases --no-data
Comment 2 Ole.H.Nielsen@fysik.dtu.dk 2020-11-30 11:24:57 MST
Hi Nate,

(In reply to Nate Rini from comment #1)
> (In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #0)
> > Question: What is the recommended procedure for migrating a Slurm database
> > from MariaDB 5.5 to 10.3?  Probably many sites will be looking at migrating
> > from CentOS 7 to CentOS 8 over the next few years, so it's a timely question.
> 
> Looks like the tables changed in this version. Please call the following:
> > mariadb-dump --all-databases --no-data

On the original CentOS 7.9 database server I have done:
$ mysqldump --all-databases --no-data
The output log file is attached.

Thanks,
Ole
Comment 3 Ole.H.Nielsen@fysik.dtu.dk 2020-11-30 11:25:18 MST
Created attachment 16865 [details]
Output of mysqldump --all-databases --no-data
Comment 5 Nate Rini 2020-11-30 12:21:19 MST
Was the database upgraded in place from Centos7? Or was it exported using mysqldump and then added after calling upgrade?
Comment 6 Ole.H.Nielsen@fysik.dtu.dk 2020-11-30 13:40:58 MST
(In reply to Nate Rini from comment #5)
> Was the database upgraded in place from Centos7? Or was it exported using
> mysqldump and then added after calling upgrade?

No upgrading:  Two separate servers with CentOS 7 and 8, respectively, as described above in this case.  Dump made on CentOS 7, dump file copied to CentOS 8, dump loaded into pristine MariaDB 10.3 on CentOS 8.  In short, a migration.
Comment 7 Nate Rini 2020-11-30 13:50:43 MST
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #6)
> (In reply to Nate Rini from comment #5)
> > Was the database upgraded in place from Centos7? Or was it exported using
> > mysqldump and then added after calling upgrade?
> 
> No upgrading:  Two separate servers with CentOS 7 and 8, respectively, as
> described above in this case.  Dump made on CentOS 7, dump file copied to
> CentOS 8, dump loaded into pristine MariaDB 10.3 on CentOS 8.  In short, a
> migration.

I just tested that locally and no issue with 10.3.17-MariaDB in Centos8. Did the dump include the "mysql" database?
Comment 8 Nate Rini 2020-11-30 13:54:20 MST
Looks like the mysql.procs table is different:

The aggregate field appears to be missing on your install:
> *************************** 21. row ***************************
>  Field: aggregate
>   Type: enum('NONE','GROUP')
>   Null: NO
>    Key: 
> Default: NONE
>  Extra:
Comment 9 Nate Rini 2020-11-30 13:56:52 MST
This is mysql.procs on my local install:

> CREATE TABLE `proc` (
>   `db` char(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
>   `name` char(64) NOT NULL DEFAULT '',
>   `type` enum('FUNCTION','PROCEDURE','PACKAGE','PACKAGE BODY') NOT NULL,
>   `specific_name` char(64) NOT NULL DEFAULT '',
>   `language` enum('SQL') NOT NULL DEFAULT 'SQL',
>   `sql_data_access` enum('CONTAINS_SQL','NO_SQL','READS_SQL_DATA','MODIFIES_SQL_DATA') NOT NULL DEFAULT 'CONTAINS_SQL',
>   `is_deterministic` enum('YES','NO') NOT NULL DEFAULT 'NO',
>   `security_type` enum('INVOKER','DEFINER') NOT NULL DEFAULT 'DEFINER',
>   `param_list` blob NOT NULL,
>   `returns` longblob NOT NULL,
>   `body` longblob NOT NULL,
>   `definer` char(141) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
>   `created` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
>   `modified` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
>   `sql_mode` set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH','EMPTY_STRING_IS_NULL','SIMULTANEOUS_ASSIGNMENT') NOT NULL DEFAULT '',
>   `comment` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
>   `character_set_client` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
>   `collation_connection` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
>   `db_collation` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
>   `body_utf8` longblob DEFAULT NULL,
>   `aggregate` enum('NONE','GROUP') NOT NULL DEFAULT 'NONE',
>   PRIMARY KEY (`db`,`name`,`type`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Stored Procedures';

I suggest recreating this table or redoing the upgrade as MySQL upgrade should repair this kind of issue.
Comment 10 Ole.H.Nielsen@fysik.dtu.dk 2020-12-01 00:30:00 MST
This is my database backup command on the current production server running CentOS 7.9 with MariaDB 5.5:

$ /usr/bin/mysqldump --opt --flush-logs --quote-names --events --all-databases

Do you think anything is wrong or missing in this dump command?
Comment 11 Ole.H.Nielsen@fysik.dtu.dk 2020-12-01 00:42:38 MST
(In reply to Nate Rini from comment #8)
> Looks like the mysql.procs table is different:
> 
> The aggregate field appears to be missing on your install:
> > *************************** 21. row ***************************
> >  Field: aggregate
> >   Type: enum('NONE','GROUP')
> >   Null: NO
> >    Key: 
> > Default: NONE
> >  Extra:

Thanks for this analysis.  But I have no clue what this is all about, since our current Slurm production database has been running for about 4 years on two subsequent CentOS 7 servers with MariaDB 5.5.  If anything is missing, would that date back to the original Slurm 16.x or 17.x installation?  I don't mess directly with the database because I'm no expert on databases.
Comment 12 Ole.H.Nielsen@fysik.dtu.dk 2020-12-01 00:46:00 MST
(In reply to Nate Rini from comment #8)
> Looks like the mysql.procs table is different:
> 
> The aggregate field appears to be missing on your install:
> > *************************** 21. row ***************************
> >  Field: aggregate
> >   Type: enum('NONE','GROUP')
> >   Null: NO
> >    Key: 
> > Default: NONE
> >  Extra:

Thanks for this analysis.  But I have no clue what this is all about, since our current Slurm production database has been running for about 4 years on two subsequent CentOS 7 servers with MariaDB 5.5.  If anything is missing, would that date back to the original Slurm 16.x or 17.x installation?  I don't mess directly with the database because I'm no expert on databases.
Comment 13 Ole.H.Nielsen@fysik.dtu.dk 2020-12-01 00:50:40 MST
(In reply to Nate Rini from comment #9)
> This is mysql.procs on my local install:
> 
> > CREATE TABLE `proc` (
> >   `db` char(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
> >   `name` char(64) NOT NULL DEFAULT '',
> >   `type` enum('FUNCTION','PROCEDURE','PACKAGE','PACKAGE BODY') NOT NULL,
> >   `specific_name` char(64) NOT NULL DEFAULT '',
> >   `language` enum('SQL') NOT NULL DEFAULT 'SQL',
> >   `sql_data_access` enum('CONTAINS_SQL','NO_SQL','READS_SQL_DATA','MODIFIES_SQL_DATA') NOT NULL DEFAULT 'CONTAINS_SQL',
> >   `is_deterministic` enum('YES','NO') NOT NULL DEFAULT 'NO',
> >   `security_type` enum('INVOKER','DEFINER') NOT NULL DEFAULT 'DEFINER',
> >   `param_list` blob NOT NULL,
> >   `returns` longblob NOT NULL,
> >   `body` longblob NOT NULL,
> >   `definer` char(141) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
> >   `created` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
> >   `modified` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
> >   `sql_mode` set('REAL_AS_FLOAT','PIPES_AS_CONCAT','ANSI_QUOTES','IGNORE_SPACE','IGNORE_BAD_TABLE_OPTIONS','ONLY_FULL_GROUP_BY','NO_UNSIGNED_SUBTRACTION','NO_DIR_IN_CREATE','POSTGRESQL','ORACLE','MSSQL','DB2','MAXDB','NO_KEY_OPTIONS','NO_TABLE_OPTIONS','NO_FIELD_OPTIONS','MYSQL323','MYSQL40','ANSI','NO_AUTO_VALUE_ON_ZERO','NO_BACKSLASH_ESCAPES','STRICT_TRANS_TABLES','STRICT_ALL_TABLES','NO_ZERO_IN_DATE','NO_ZERO_DATE','INVALID_DATES','ERROR_FOR_DIVISION_BY_ZERO','TRADITIONAL','NO_AUTO_CREATE_USER','HIGH_NOT_PRECEDENCE','NO_ENGINE_SUBSTITUTION','PAD_CHAR_TO_FULL_LENGTH','EMPTY_STRING_IS_NULL','SIMULTANEOUS_ASSIGNMENT') NOT NULL DEFAULT '',
> >   `comment` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
> >   `character_set_client` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
> >   `collation_connection` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
> >   `db_collation` char(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
> >   `body_utf8` longblob DEFAULT NULL,
> >   `aggregate` enum('NONE','GROUP') NOT NULL DEFAULT 'NONE',
> >   PRIMARY KEY (`db`,`name`,`type`)
> > ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Stored Procedures';
> 
> I suggest recreating this table or redoing the upgrade as MySQL upgrade
> should repair this kind of issue.

Thanks for the suggestion.  Two problems:

1. I have no idea what it takes "recreating this table"?

2. I don't do any upgrades.  I migrate the database dump from CentOS 7 to CentOS 8.  I have done this several times, but I get the same error shown at the top.  The mysql_upgrade command does nothing at all.

Can you please propose specific and detailed instructions for migrating the database from MariaDB 5.5 to 10.3?

Thanks,
Ole
Comment 14 Nate Rini 2020-12-01 08:47:45 MST
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #12)
> (In reply to Nate Rini from comment #8)
> > Looks like the mysql.procs table is different:
> If anything is missing, would that date back to the original Slurm 16.x or 17.x installation?  

Looks like MariaDB/MySQL devs changed the structure of that table. This isn't directly related to Slurm but a consequence of the export/import of the old database.

> I don't mess directly with the database because I'm no expert on databases.
Understood, I will provide more verbose instructions.

(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #13)
> 1. I have no idea what it takes "recreating this table"?
The tables inside of the mysq

 
> 2. I don't do any upgrades.  I migrate the database dump from CentOS 7 to
> CentOS 8.  I have done this several times, but I get the same error shown at
> the top.  The mysql_upgrade command does nothing at all.
> 
> Can you please propose specific and detailed instructions for migrating the
> database from MariaDB 5.5 to 10.3?
> 
> Thanks,
> Ole
Comment 15 Nate Rini 2020-12-01 08:50:22 MST
My last reply got sent early when I sneezed.

(In reply to Nate Rini from comment #14)
> (In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #13)
> > 1. I have no idea what it takes "recreating this table"?
> The tables inside of the mysq

The tables inside of the mysql database are just tables like any other. They can be created and changed by normal commands but MySQL itself will directly reference them.

I suspect when the database was loaded, it replaced the newer mysql.procs tables which is the cause of the current issue.
 
> > 2. I don't do any upgrades.  I migrate the database dump from CentOS 7 to
> > CentOS 8.  I have done this several times, but I get the same error shown at
> > the top.  The mysql_upgrade command does nothing at all.
That is probably a bug with mysql_upgrade but we can work around it.

> > Can you please propose specific and detailed instructions for migrating the
> > database from MariaDB 5.5 to 10.3?
Yes, in the next reply.
Comment 16 Nate Rini 2020-12-01 09:13:31 MST
(In reply to Nate Rini from comment #15)
> > > Can you please propose specific and detailed instructions for migrating the
> > > database from MariaDB 5.5 to 10.3?
> Yes, in the next reply.

(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #10)
> This is my database backup command on the current production server running
> CentOS 7.9 with MariaDB 5.5:
> 
> $ /usr/bin/mysqldump --opt --flush-logs --quote-names --events
> --all-databases
> 
> Do you think anything is wrong or missing in this dump command?

I think the simplest solution would be to only export the Slurm database.

Please try this instead:
> $ /usr/bin/mysqldump --single-transaction -B slurm_acct_db 

This output should then be used on a clean install of mysql.

The easiest way to do this should be to uninstall mariadb on the new server and then to remove the /var/lib/mysql/* entirely. Then install mariadb again and it should populate the directory on the first run. After that, just dump the dump into the mysql command and it should work.
Comment 17 Ole.H.Nielsen@fysik.dtu.dk 2020-12-02 03:25:11 MST
Hi Nate,

Thanks for recommending some different parameters for mysqldump:

(In reply to Nate Rini from comment #16)
> I think the simplest solution would be to only export the Slurm database.
> 
> Please try this instead:
> > $ /usr/bin/mysqldump --single-transaction -B slurm_acct_db 
> 
> This output should then be used on a clean install of mysql.

I have made the recommended mysqldump on the production CentOS 7.9 server and copied it to my testing server with CentOS 8.2.  I have loaded the dump file into a freshly installed mariadb-server:

$ zcat mysqldump_slurm_acct_db.gz | mysql -u root -p

Now slurmdbd starts up without any errors:

slurmdbd -D -vvvv
slurmdbd: debug:  Log file re-opened
slurmdbd: Killing old slurmdbd[83547]
slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/auth_munge.so
slurmdbd: debug:  Munge authentication plugin loaded
slurmdbd: debug3: Success.
slurmdbd: debug3: Trying to load plugin /usr/lib64/slurm/accounting_storage_mysql.so
slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: MySQL server version is: 10.3.17-MariaDB
slurmdbd: debug2: innodb_buffer_pool_size: 1073741824
slurmdbd: debug2: innodb_log_file_size: 67108864
slurmdbd: debug2: innodb_lock_wait_timeout: 900
slurmdbd: Accounting storage MYSQL plugin loaded
slurmdbd: debug3: Success.
(lines deleted)
slurmdbd: slurmdbd version 20.02.6 started
slurmdbd: debug2: running rollup at Wed Dec 02 11:05:27 2020
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: debug2: No need to roll cluster niflheim this day 1606863600 <= 1606863600
slurmdbd: debug2: No need to roll cluster niflheim this month 1606777200 <= 1606777200
slurmdbd: debug2: Got 1 of 1 rolled up
slurmdbd: debug2: Everything rolled up
^Cslurmdbd: Terminate signal (SIGINT or SIGTERM) received
slurmdbd: debug:  rpc_mgr shutting down
slurmdbd: Unable to remove pidfile '/var/run/slurmdbd.pid': No such file or directory
slurmdbd: debug3: starting mysql cleaning up
slurmdbd: debug3: finished mysql cleaning up

Question 1: Are there any caveats to using the "mysqldump --single-transaction" with a Slurm database?

In the mysqldump manual page I read this information:

       ยท   --single-transaction

           This option sends a START TRANSACTION SQL statement to the server
           before dumping data. It is useful only with transactional tables
           such as InnoDB, because then it dumps the consistent state of the
           database at the time when BEGIN was issued without blocking any
           applications.

           When using this option, you should keep in mind that only InnoDB
           tables are dumped in a consistent state. For example, any MyISAM or
           MEMORY tables dumped while using this option may still change
           state.

Question 2: I have previously used the mysqldump options "--flush-logs --quote-names --events".  Is it still a good idea to use those?

Thanks a lot,
Ole
Comment 18 Nate Rini 2020-12-02 10:32:22 MST
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #17)
> Question 1: Are there any caveats to using the "mysqldump
> --single-transaction" with a Slurm database?

This is recommended to avoid race conditions when slurmdbd is being run while taking the MySQL dump.

> Question 2: I have previously used the mysqldump options "--flush-logs
> --quote-names --events".  Is it still a good idea to use those?
There is no requirement for them for any of them when dumping the Slurm database (unless your MySQL server is sharded which I saw no evidence of in your logs).
Comment 19 Ole.H.Nielsen@fysik.dtu.dk 2020-12-03 04:29:25 MST
(In reply to Nate Rini from comment #18)
> (In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #17)
> > Question 1: Are there any caveats to using the "mysqldump
> > --single-transaction" with a Slurm database?
> 
> This is recommended to avoid race conditions when slurmdbd is being run
> while taking the MySQL dump.

Thanks for the recommendation. The --single-transaction explanations from the man-page in Comment 17 refer to InnoDB tables, whereas other types of tables may still be inconsistently dumped.

Can you kindly explain whether the Slurm database only uses InnoDB, and how we can be assured of a consistent Slurm database dump?  Again, I apologize for not being a database expert.

Thanks,
Ole
Comment 20 Nate Rini 2020-12-07 13:33:03 MST
(In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #19)
> Can you kindly explain whether the Slurm database only uses InnoDB
Slurm only uses InnoDB tables.

> how we can be assured of a consistent Slurm database dump?
For Innodb tables, using '--single-transaction' ensures that.

>   Again, I apologize
Please feel free to ask Slurm questions: that is the reason for us being here.

> for not being a database expert.
Please note that this suggestion does not apply to other databases housed along side of the Slurm databases (on the same MySQL server).
Comment 21 Ole.H.Nielsen@fysik.dtu.dk 2020-12-08 01:15:51 MST
Hi Nate,

Thanks a lot for these clear answers:

(In reply to Nate Rini from comment #20)
> (In reply to Ole.H.Nielsen@fysik.dtu.dk from comment #19)
> > Can you kindly explain whether the Slurm database only uses InnoDB
> Slurm only uses InnoDB tables.
> 
> > how we can be assured of a consistent Slurm database dump?
> For Innodb tables, using '--single-transaction' ensures that.
> 
> >   Again, I apologize
> Please feel free to ask Slurm questions: that is the reason for us being
> here.
> 
> > for not being a database expert.
> Please note that this suggestion does not apply to other databases housed
> along side of the Slurm databases (on the same MySQL server).

I think we may close this case now.

Best regards,
Ole