Ticket 4995 - Specifying --cores-per-socket prevents using more cores than are on a single socket
Summary: Specifying --cores-per-socket prevents using more cores than are on a single socket
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 17.11.5
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Dominik Bartkiewicz
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-03-27 19:00 MDT by Christopher Samuel
Modified: 2018-05-10 03:39 MDT

See Also:
Site: Swinburne
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed: 17.11.6
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
slurm.conf (4.38 KB, text/plain)
2018-03-29 22:51 MDT, Christopher Samuel

Description Christopher Samuel 2018-03-27 19:00:37 MDT
Hi there,

The bulk of our cluster is made up of dual-socket, 18-core Skylake nodes with dual GPUs.  As we reserve 4 cores per node for GPU jobs only, we effectively have 16 cores per socket available for non-GPU jobs.
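
For context, a rough sketch of the kind of node and partition definitions involved; the names and exact values below are illustrative guesses rather than lines from our actual slurm.conf:

# Illustrative only: dual-socket, 18-core nodes with 2 GPUs each
NodeName=skylake[001-100] Sockets=2 CoresPerSocket=18 ThreadsPerCore=1 Gres=gpu:2
# Cap non-GPU jobs at 32 CPUs per node so 4 cores stay free for GPU jobs
PartitionName=skylake Nodes=skylake[001-100] MaxCPUsPerNode=32 Default=YES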

In an effort to make Slurm allocate 16 cores per socket to non-GPU jobs (rather than using all 18 cores on one socket and 14 on the other, which leaves all the GPU-specific cores on one socket), I was trying to run a job with:

$ srun -c 32 --cores-per-socket=16 hostname
srun: error: Unable to allocate resources: Requested node configuration is not available

This is wrong, as the nodes definitely have more than 16 cores per socket available. Further testing shows that this works:

$ sinteractive -c 18 --cores-per-socket=16
srun: job 51995 queued and waiting for resources
srun: job 51995 has been allocated resources
$ nproc
18

But this doesn't:

$ sinteractive -c 19 --cores-per-socket=16
srun: Force Terminated job 51994
srun: error: Unable to allocate resources: Requested node configuration is not available

But this does work:

[csamuel@farnarkle1 tmp]$ srun -c 19 nproc
srun: job 51998 queued and waiting for resources
srun: job 51998 has been allocated resources
19

As far as I can tell that's a bug: there's no reason Slurm shouldn't be able to accept those jobs, as the nodes do have more than the requested 16 cores per socket. Odder still, the failure occurs when you exceed the actual number of cores per socket (18), rather than the requested number (16).

Interestingly, this might be a regression since 16.05.x, as a cluster I have access to (2 sockets, 16 cores a node) running that version does not show the issue.

$ srun --version
slurm 16.05.8

$ srun -c 19 nproc
srun: job 3805919 queued and waiting for resources
srun: job 3805919 has been allocated resources
19

$ srun -c 19 --cores-per-socket=16 nproc
srun: job 3805920 queued and waiting for resources
srun: job 3805920 has been allocated resources
19

Another cluster running 17.02.9 also works as expected, so the regression is after that version.

I'll open a separate bug about getting jobs allocated 16 cores per socket,
rather than only selecting nodes with at least that number of cores.
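
For reference, one possible way to express that kind of per-socket shaping might be the task distribution options rather than the node-selection filter; the invocation below is only a sketch of the idea, not a tested recipe:

$ srun -N 1 -n 32 --ntasks-per-socket=16 hostname

i.e. ask for at most 16 tasks on each socket, rather than filtering for nodes that have at least 16 cores per socket.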

All the best,
Chris
Comment 1 Dominik Bartkiewicz 2018-03-29 10:05:43 MDT
Hi

Could you send me your current slurm.conf and your sinteractive script?

Dominik
Comment 2 Christopher Samuel 2018-03-29 22:51:17 MDT
Created attachment 6507 [details]
slurm.conf

Hi Dominik,

On Friday, 30 March 2018 3:05:43 AM AEDT you wrote:

> Could you send me current slurm.conf and your sinteractive script.

sinteractive is just: 

#!/bin/bash
exec srun $* --pty -u ${SHELL} -i -l
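
As an aside, an equivalent sketch that also preserves argument quoting would be:

#!/bin/bash
# Same wrapper, but "$@" keeps arguments containing spaces intact
exec srun "$@" --pty -u "${SHELL}" -i -l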

I've attached our slurm.conf

All the best,
Chris
Comment 3 Dominik Bartkiewicz 2018-03-30 04:38:09 MDT
Hi

Thanks, I can reproduce this now.
This behavior is an effect of the interaction between '--cores-per-socket' and MaxCPUsPerNode.
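
A minimal sketch of the interaction, with illustrative partition/node names and assuming a node with 2 sockets x 18 cores whose partition caps usable CPUs below that:

# Partition limits non-GPU jobs to 32 of the node's 36 cores
PartitionName=skylake Nodes=skylake[001-100] MaxCPUsPerNode=32
# With that cap in place, asking for more CPUs than one socket holds while
# also specifying --cores-per-socket is rejected:
$ srun -p skylake -c 19 --cores-per-socket=16 hostname
srun: error: Unable to allocate resources: Requested node configuration is not available
# The same request on a partition without MaxCPUsPerNode is accepted.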

Dominik
Comment 4 Christopher Samuel 2018-03-30 06:50:28 MDT
Hi Dominik,

On Friday, 30 March 2018 9:38:09 PM AEDT bugs@schedmd.com wrote:

> Thanks, I can reproduce this now.
> This behavior is effect of interaction '--cores-per-socket' with
> MaxCPUsPerNode.

Ah yes, I hadn't thought to test it on the other partitions we have, and yes, that works there.

So definitely a bug then?

cheers!
Chris
Comment 5 Dominik Bartkiewicz 2018-03-30 07:51:00 MDT
Hi

I think this is a bug.
I don't know yet how to fix this with the current MaxCPUsPerNode implementation.
From my observation this is not working on 16.05 either; are you observing the same thing?

Dominik
Comment 8 Christopher Samuel 2018-03-30 16:15:23 MDT
Hiya,

On Saturday, 31 March 2018 12:51:00 AM AEDT you wrote:

> I think this is a bug.

Yeah, looks like it to me too.

> I don’t know yet how to fix this with current MaxCPUsPerNode implementation.
> From my observation this is not working on 16.05 too, are you observing the
> same thing?

This system was brought up with 17.11.0; the other systems I mentioned are just ones
I have access to as a user, and they don't have MaxCPUsPerNode set on any partitions.

All the best,
Chris
Comment 18 Dominik Bartkiewicz 2018-05-09 08:51:53 MDT
Hi

This patch solves the issue: https://github.com/SchedMD/slurm/commit/6de8c831ae9c388
It will be in 17.11.6.
Let me know if this works on your hardware exactly as you expected.

Dominik
Comment 19 Christopher Samuel 2018-05-09 18:09:16 MDT
On 10/05/18 00:51, bugs@schedmd.com wrote:

> Hi

Hi Dominik,

> This patch solved this issue 
> https://github.com/SchedMD/slurm/commit/6de8c831ae9c388. It will be
> in 17.11.6 Let me know if this works on your hardware exactly as you
> expected.

Thanks so much for this. This is my last day in the office before I head
out on leave for 3 weeks, so I'll try to get it done today (I have some
meetings too).

If I can't I'll let you know in June when I'm back in Australia.

All the best!
Chris
Comment 20 Christopher Samuel 2018-05-09 20:21:43 MDT
On 10/05/18 10:08, Christopher Samuel wrote:

> Thanks so much for this, this is my last day in the office before I
> head out on leave for 3 weeks so I'll try and get it done today (have
> some meetings too).

Tested and looking good, thank you!

[csamuel@farnarkle1 pt2pt]$ srun -c 32 --cores-per-socket=16 hostname
srun: job 126213 queued and waiting for resources

It no longer gives me an error, and instead blocks waiting for a node with that
configuration to become available.

If I submit that to our debug partition it picks one of our KNL nodes, which is
available (our default is the Skylake node partition).

[csamuel@farnarkle1 pt2pt]$ srun -p debug -c 32 --cores-per-socket=16 hostname
srun: job 126214 queued and waiting for resources
srun: job 126214 has been allocated resources
gina4

Very much appreciated.

All the best,
Chris
Comment 21 Dominik Bartkiewicz 2018-05-10 03:39:42 MDT
Hi

That's great.
We're gonna go ahead and mark this as resolved/fixed.
Enjoy your vacation.

Dominik