Bug 12558 - Cannot delete account after recent migration to 20.11.8 from 20.02
Summary: Cannot delete account after recent migration to 20.11.8 from 20.02
Status: RESOLVED INVALID
Alias: None
Product: Slurm
Classification: Unclassified
Component: Database (show other bugs)
Version: 20.11.8
Hardware: Linux Linux
: --- 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2021-09-28 04:17 MDT by Dane Kennedy
Modified: 2022-03-04 08:57 MST (History)
3 users (show)

See Also:
Site: -Other-
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dane Kennedy 2021-09-28 04:17:33 MDT
We recently upgraded to 20.11.8 from 20.02.x (But this database has been through several migrations before this) and we are having issues deleting accounts and associations. It appears as if the database migration has left out setting the default value for max_tres_pj.

Specifically when trying to delete an association it fails and in the slurmdbd.log we're seeing entries like this:

[2021-09-28T10:55:42.472] error: mysql_query failed: 1364 Field 'max_tres_pj' doesn't have a default value
update "ilifu-slurm_assoc_table" as t1 set mod_time=1632819342, deleted=1, def_qos_id=DEFAULT, shares=DEFAULT, max_jobs=DEFAULT, max_jobs_accrue=DEFAULT, min_prio_thresh=DEFAULT, max_submit_jobs=DEFAULT, max_wall_pj=DEFAULT, max_tres_pj=DEFAULT, max_tres_pn=DEFAULT, max_tres_mins_pj=DEFAULT, max_tres_run_mins=DEFAULT, grp_jobs=DEFAULT, grp_submit_jobs=DEFAULT, grp_jobs_accrue=DEFAULT, grp_wall=DEFAULT, grp_tres=DEFAULT, grp_tres_mins=DEFAULT, grp_tres_run_mins=DEFAULT, qos=DEFAULT, delta_qos=DEFAULT, priority=DEFAULT where (id_assoc=1340 || id_assoc=1352);

Deleting accounts fails as well with this:

$ sacctmgr delete account where  Account=what_what
sacctmgr: accounting_storage/slurmdbd: acct_storage_p_remove_accts: No error
 Nothing deleted
$ sacctmgr show account what_what
   Account                Descr                  Org 
---------- -------------------- -------------------- 
what-what         not_important            something 


Regards,
Dane.
Comment 1 Jacob Jenson 2021-09-28 10:39:36 MDT
Dane, 

I am sorry you have encountered this issue. The SchedMD support team can help you resolve this issue, but first your site will need to purchase a support contact. Is IDIA open to purchasing support? 

Thank you,
Jacob
Comment 2 Dane Kennedy 2021-09-29 06:58:17 MDT
Hi Jacob,

Thank you very much, unfortunately we're a relatively small academic
organisation and I don't think we can afford a support contract.

Fortunately I managed to check your open source code and found various
columns that needed a default empty string set on those columns, so I
manually updated those and everything is functional again now.

Thank you for a wonderful product,
Dane.

On Tue, Sep 28, 2021 at 6:39 PM <bugs@schedmd.com> wrote:

> *Comment # 1 <https://bugs.schedmd.com/show_bug.cgi?id=12558#c1> on bug
> 12558 <https://bugs.schedmd.com/show_bug.cgi?id=12558> from Jacob Jenson
> <jacob@schedmd.com> *
>
> Dane,
>
> I am sorry you have encountered this issue. The SchedMD support team can help
> you resolve this issue, but first your site will need to purchase a support
> contact. Is IDIA open to purchasing support?
>
> Thank you,
> Jacob
>
> ------------------------------
> You are receiving this mail because:
>
>    - You reported the bug.
>
>
Comment 3 Dane Kennedy 2021-09-29 07:01:04 MDT
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER max_tres_pj SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER max_tres_pn SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER max_tres_mins_pj SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER max_tres_run_mins SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER grp_tres SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER grp_tres_mins SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER grp_tres_run_mins SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER qos SET DEFAULT '';
MariaDB [slurm_acct_db]> ALTER TABLE `CLUSTER_NAME_assoc_table` ALTER delta_qos SET DEFAULT '';

← the fix :-).
Comment 4 Stephan Roth 2021-10-19 05:53:31 MDT
I ran into the same problem, but circumvented it by allowing NULL for the afflicted columns:

ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `max_tres_pj` `max_tres_pj` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `max_tres_pn` `max_tres_pn` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `max_tres_mins_pj` `max_tres_mins_pj` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `max_tres_run_mins` `max_tres_run_mins` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `grp_tres` `grp_tres` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `grp_tres_mins` `grp_tres_mins` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `grp_tres_run_mins` `grp_tres_run_mins` TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `qos` `qos` BLOB NULL;
ALTER TABLE `<CLUSTER_NAME>_assoc_table` CHANGE `delta_qos` `delta_qos` BLOB NULL;

My reason for this fix is, according to https://dev.mysql.com/doc/refman/5.7/en/data-type-defaults.html

"The BLOB, TEXT, GEOMETRY, and JSON data types cannot be assigned a default value."


So im my opinion the query generated when deleting a user:

update "<CLUSTER_NAME>_assoc_table" as t1 set mod_time=1634639485, deleted=1, def_qos_id=DEFAULT, shares=DEFAULT, max_jobs=DEFAULT, max_jobs_accrue=DEFAULT, min_prio_thresh=DEFAULT, max_submit_jobs=DEFAULT, max_wall_pj=DEFAULT, max_tres_pj=DEFAULT, max_tres_pn=DEFAULT, max_tres_mins_pj=DEFAULT, max_tres_run_mins=DEFAULT, grp_jobs=DEFAULT, grp_submit_jobs=DEFAULT, grp_jobs_accrue=DEFAULT, grp_wall=DEFAULT, grp_tres=DEFAULT, grp_tres_mins=DEFAULT, grp_tres_run_mins=DEFAULT, qos=DEFAULT, delta_qos=DEFAULT, priority=DEFAULT where (id_assoc=<ASSOC_ID>);

is incorrect in trying to set DEFAULT values for the TEXT and BLOB columns for this version of MySQL.
Comment 5 Stephan Roth 2021-11-01 09:25:12 MDT
Ignore what I called a "fix" above.

slurmdbd crashes with these changes when adding/removing users from Slurm accounts with sacctmgr
Comment 6 Bas van der Vlies 2021-12-02 04:22:31 MST
We have this issue as well. Is there a fix? we have a support contract
Comment 7 Bas van der Vlies 2021-12-03 01:36:33 MST
Just for the reference the right solution was also in this ticket and also mentioned here:
 * https://bugs.schedmd.com/show_bug.cgi?id=1294
Comment 8 Kilian Cavalotti 2022-03-04 08:57:34 MST
(In reply to Bas van der Vlies from comment #7)
> Just for the reference the right solution was also in this ticket and also
> mentioned here:
>  * https://bugs.schedmd.com/show_bug.cgi?id=1294

For future reference, if anyone's looking at this, I believe the actual bug referencing that issue is https://bugs.schedmd.com/show_bug.cgi?id=12947

Cheers,
--
Kilian