Ticket 17418 - slurm_load_partitions: Unexpected message received
Summary: slurm_load_partitions: Unexpected message received
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 23.02.4
Hardware: Linux Linux
Importance: --- 4 - Minor Issue
Assignee: Megan Dahl
QA Contact: Ben Roberts
Reported: 2023-08-14 02:15 MDT by Ole.H.Nielsen@fysik.dtu.dk
Modified: 2023-10-06 00:04 MDT
Site: DTU Physics
Version Fixed: 23.02.5


Attachments
slurm.conf from the 22.05.9 controller (7.27 KB, text/plain)
2023-08-14 02:17 MDT, Ole.H.Nielsen@fysik.dtu.dk

Description Ole.H.Nielsen@fysik.dtu.dk 2023-08-14 02:15:45 MDT
We have upgraded our slurmdbd server to 23.02.4, while the controller and the rest of the cluster are still running 22.05.9.

On the 23.02.4 server, all user commands seem to fail when querying slurmctld, for example:

$ sinfo
slurm_load_partitions: Unexpected message received
$ squeue
slurm_load_jobs error: Unexpected message received
$ scontrol show partitions
slurm_load_partitions error: Unexpected message received

These commands work correctly on the 22.05.9 nodes.

Can you please help us get the 23.02.4 commands to work with the 22.05.9 controller?

Thanks,
Ole
Comment 1 Ole.H.Nielsen@fysik.dtu.dk 2023-08-14 02:17:48 MDT
Created attachment 31749
slurm.conf from the 22.05.9 controller
Comment 3 Megan Dahl 2023-08-14 10:53:00 MDT
Hello Ole,

Unfortunately, newer client commands cannot be used with an older slurmctld. The newer commands use an RPC protocol version that the older slurmctld does not understand or know how to handle. You will have to either upgrade the slurmctld or use older client commands with your current slurmctld version.

https://slurm.schedmd.com/quickstart_admin.html#upgrade

> The slurmctld daemon must also be upgraded before or at the same time as the
> slurmd daemons on the compute nodes. Generally, upgrading Slurm on all of the
> login and compute nodes is recommended, although rolling upgrades are also
> possible (i.e. upgrading the head node(s) first then upgrading the compute and
> login nodes later at various times). Also see the note above about reverse
> compatibility.

Regards,
--Megan
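
The restriction described in this comment can be sketched as a small pre-flight check. This is only an illustration, not part of Slurm: the function names (slurm_release, client_newer_than_ctld, check_pair) are hypothetical, and it assumes versions of the form MAJOR.MINOR.PATCH where the first two fields identify the release. The client version can be read from `sinfo --version` on the login node and the controller version from `slurmctld -V` on the controller host; here they are passed in as plain strings.

```shell
#!/bin/sh
# Sketch only: warn when client commands come from a newer Slurm release
# than the slurmctld they talk to. All function names are hypothetical.
# Assumption: versions look like MAJOR.MINOR.PATCH (e.g. 23.02.4) and the
# first two fields (e.g. 23.02) identify the release.

# Print the release part of a version string, e.g. 23.02.4 -> 23.02.
slurm_release() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Return 0 if the client release ($1) is newer than the controller
# release ($2), and 1 otherwise.
client_newer_than_ctld() {
    cn_client=$(slurm_release "$1")
    cn_ctld=$(slurm_release "$2")
    if [ "$cn_client" = "$cn_ctld" ]; then
        return 1
    fi
    # Sort the two releases numerically by field; the last line is the newest.
    cn_newest=$(printf '%s\n%s\n' "$cn_client" "$cn_ctld" \
        | sort -t. -k1,1n -k2,2n | tail -n 1)
    [ "$cn_newest" = "$cn_client" ]
}

# Print a one-line verdict for a client/controller version pair.
check_pair() {
    if client_newer_than_ctld "$1" "$2"; then
        echo "incompatible: client $1 is newer than slurmctld $2"
    else
        echo "client $1 is not newer than slurmctld $2"
    fi
}

# The situation from this ticket, followed by the reverse case.
check_pair 23.02.4 22.05.9
check_pair 22.05.9 23.02.4
```

The check only covers the "client newer than controller" case raised here; it does not verify how many releases back an older client may be while still being accepted by a newer slurmctld.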
Comment 4 Ole.H.Nielsen@fysik.dtu.dk 2023-08-14 13:31:32 MDT
Hi Megan,

Thanks for the clarification:

(In reply to Megan Dahl from comment #3)
> Unfortunately, newer client commands can not be used with older versions.
> The RPC versioning will be something the older slurmctld does not understand
> or know how to handle. You will have to either upgrade the slurmctld or use
> older client commands with your current slurmctld version.

Unfortunately, this restriction, namely that newer user commands cannot communicate with an older slurmctld, seems to be undocumented. The upgrade page currently doesn't cover this case:

> https://slurm.schedmd.com/quickstart_admin.html#upgrade

Could you kindly document that newer user commands are not compatible with an older version of slurmctld?

Thanks,
Ole
Comment 6 Megan Dahl 2023-08-14 14:10:41 MDT
Hi Ole,

I will work on adding the restriction to the documentation. I will let you know when it has been added.

Thanks,
--Megan
Comment 8 Megan Dahl 2023-10-05 13:10:52 MDT
Hello Ole,

The clarification that newer client commands cannot be used with an older slurmctld has been added to the documentation in the following commit:

commit ce51a8fa6ecdbaef2fc7e2077db3daff934fb9cd
Author: Megan Dahl <megan@schedmd.com>
Date:   Mon Aug 14 13:48:05 2023 -0600

    Docs - Indicate that slurmctld must be upgraded before client commands
    
    Bug 17418

This will be available in 23.02.6.

Regards,
--Megan
Comment 9 Ole.H.Nielsen@fysik.dtu.dk 2023-10-06 00:04:21 MDT
Hi Megan,

(In reply to Megan Dahl from comment #8)
> The clarification that newer client commands can not be used with older
> versions has been added to the documentation in the following commit.
> 
> commit ce51a8fa6ecdbaef2fc7e2077db3daff934fb9cd
> Author: Megan Dahl <megan@schedmd.com>
> Date:   Mon Aug 14 13:48:05 2023 -0600
> 
>     Docs - Indicate that slurmctld must be upgraded before client commands
>     
>     Bug 17418
> 
> This will be available in 23.02.6.

Thanks a lot!

Ole