Ticket 3079 - ntasks-per-node error
Summary: ntasks-per-node error
Status: RESOLVED TIMEDOUT
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 16.05.3
Hardware: Linux
Importance: --- 3 - Medium Impact
Assignee: Alejandro Sanchez
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2016-09-13 08:55 MDT by Martins Innus
Modified: 2020-03-25 16:38 MDT

See Also:
Site: University of Buffalo (SUNY)
Alineos Sites: ---
Atos/Eviden Sites: ---
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Attachments
batch script (1.08 KB, application/x-shellscript) - 2016-09-14 08:47 MDT, Martins Innus
script2 (1.15 KB, application/x-shellscript) - 2016-09-14 11:42 MDT, Martins Innus
output2 (2.09 KB, text/plain) - 2016-09-14 11:44 MDT, Martins Innus
error2 (257 bytes, text/plain) - 2016-09-14 11:45 MDT, Martins Innus

Description Martins Innus 2016-09-13 08:55:13 MDT
Hi, we are seeing this error with the following batch script:

srun: Warning: can't honor --ntasks-per-node set to 252 which doesn't match the requested tasks 252 with the number of requested nodes 21.  Ignoring --ntasks-per-node.
 
#SBATCH --nodes=21
#SBATCH --ntasks-per-node=12
#SBATCH --constraint=CPU-E5645
#SBATCH --mem=48000

The nodes with that constraint do have 12 cores.

Searching around, I think this is the same issue as reported here:

https://bugs.schedmd.com/show_bug.cgi?format=multiple&id=3032

and

https://groups.google.com/forum/#!msg/slurm-devel/zeuBOXcPJUM/qZ5wCPBYCAAJ

I saw the following note in the 16.05.4 release notes, but it looks to be a slightly different problem, so I wanted to check if that would fix this issue before we updated:

####
-- Correct documented configurations where --ntasks-per-core and 
--ntasks-per-socket are supported. 
####

Thanks for any insight.

Martins
Comment 1 Alejandro Sanchez 2016-09-14 05:39:55 MDT
Martins, could you please show the whole batch script, including any srun requests inside it? Since it's an srun error, that will make it easier to reproduce. In any case, we are able to reproduce something similar with a simpler request:

$ salloc --ntasks-per-node=8 -n 8
salloc: Granted job allocation 20004
srun: Warning: can't honor --ntasks-per-node set to 8 which doesn't match the requested tasks 1 with the number of requested nodes 1.  Ignoring --ntasks-per-node.

$ scontrol show config | grep Salloc
SallocDefaultCommand    = srun -n1 -N1 --mem-per-cpu=0 --pty --preserve-env --gres=craynetwork:0 --mpi=none $SHELL

So there's definitely an issue going on there.
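
For context: the implicit srun launched by SallocDefaultCommand runs with -n1 -N1 while it inherits --ntasks-per-node=8 from the allocation, which appears to be what trips the warning. The same pattern can be sketched without SallocDefaultCommand (values here are illustrative):

$ salloc -N1 --ntasks-per-node=8
$ srun -n1 -N1 hostname    # a 1-task, 1-node step inside an allocation that set
                           # --ntasks-per-node=8; expected to print the same warning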
Comment 5 Martins Innus 2016-09-14 08:47:17 MDT
Created attachment 3495 [details]
batch script
Comment 6 Martins Innus 2016-09-14 08:48:34 MDT
OK, attached.  This is a much simplified version of the script from the original report, but it still shows the same problem.  It just runs an MPI hello world.
Comment 7 Alejandro Sanchez 2016-09-14 08:59:34 MDT
I see in your script you have:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=12

and then

NPROCS=`expr $SLURM_NTASKS_PER_NODE \* $SLURM_NNODES`
srun --ntasks-per-node=$NPROCS ./helloworld

Why are you overriding --ntasks-per-node from 12 to 12*2? Maybe I'm wrong, but it sounds a bit strange to me.
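
If the intent of that line was simply to launch the full set of tasks, a sketch of a corrected version (keeping the original variable name) would pass the total through --ntasks instead:

NPROCS=`expr $SLURM_NTASKS_PER_NODE \* $SLURM_NNODES`
srun --ntasks=$NPROCS ./helloworld

A bare "srun ./helloworld" should behave the same here, since a job step defaults to the allocation's task count (see comment 31 below).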
Comment 14 Martins Innus 2016-09-14 11:42:57 MDT
Created attachment 3499 [details]
script2
Comment 15 Martins Innus 2016-09-14 11:44:47 MDT
Created attachment 3500 [details]
output2
Comment 16 Martins Innus 2016-09-14 11:45:06 MDT
Created attachment 3501 [details]
error2
Comment 17 Martins Innus 2016-09-14 11:55:30 MDT
OK, sorry.  In attempting to simplify the script, there was an error.

I uploaded a new script and the corresponding output and error.

The easiest way to reproduce the error is to have an srun invocation in the job script that uses fewer cores than the overall job script requests.

We think we are having this same error with other combinations of nodes/cores, but this was the easiest to simplify.

Let me know if it is actually an error in the job script and we can work with the user to fix it.

Thanks

Martins
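
The attachments themselves are not reproduced here, but a minimal sketch of the reproduction recipe described above (node and task counts are illustrative) might look like:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=12

# Uses the whole allocation: no warning expected
srun ./helloworld

# Requests fewer tasks than the allocation provides while
# --ntasks-per-node=12 is inherited from the environment; this is the
# kind of step that produced the "can't honor --ntasks-per-node" warning
srun --ntasks=2 ./helloworld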
Comment 18 Alejandro Sanchez 2016-09-14 11:57:52 MDT
(In reply to Martins Innus from comment #17)
> OK, sorry.  In attempting to simplify the script, there was an error.
> 
> I uploaded a new script and the corresponding output and error.
> 
> The easiest way to reproduce the error is to have an srun invocation in the
> job script that uses fewer cores than the overall job script requests.

Yes, we also managed to reproduce it locally this way.

> We think we are having this same error with other combinations of
> nodes/cores, but this was the easiest to simplify.
> 
> Let me know if it is actually an error in the job script and we can work
> with the user to fix it.
> 

We have a patch ready for this; once it is pushed we'll get back to you.
Comment 31 Alejandro Sanchez 2016-09-15 11:58:22 MDT
Martins, the following commit silences the warning you see when the number of tasks is less than the given/inherited number of tasks per node:

https://github.com/SchedMD/slurm/commit/daacf5afee9

Anyhow, as the documentation states, --ntasks-per-node is "Meant to be used with the --ntasks option". Slurm has to figure out how many tasks can run in an allocation based on what the allocation requests, and it does this from whatever it is given. Slurm always wants to fill an allocation, so ntasks is ALWAYS inherited from the environment when you are inside one. So any time you are in an allocation, a job step will ALWAYS default to the allocation's task count. If you expect a certain number of tasks, you should ask for it. The options you specified only tell Slurm how to lay out tasks, not how many to run. Slurm will default to filling the allocated resources unless told otherwise.
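
As a sketch of what this means in practice (counts are illustrative), inside a --nodes=2 --ntasks-per-node=12 allocation:

# The step inherits the allocation's 24 tasks; --ntasks-per-node=6 only
# describes layout, and 6 per node on 2 nodes cannot hold 24 tasks, so
# the warning is printed and the option is ignored
srun --ntasks-per-node=6 ./a.out

# To actually run fewer tasks, request the count explicitly
srun --ntasks=12 --ntasks-per-node=6 ./a.out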
Comment 32 Martins Innus 2016-09-15 12:12:30 MDT
Alejandro,
  OK.  So if we have a multi-step job with multiple sruns that require different --ntasks-per-node values, we need to use --ntasks for the srun?  Like this:


#SBATCH --nodes=2
#SBATCH --ntasks-per-node=12
#SBATCH --ntasks=24

# This inherits
srun ./foo.exe

# This needs all new params
srun --nodes=2 --ntasks=12 --ntasks-per-node=6 ./bar.exe

# End sbatch


Thanks for the clarification.

Martins
Comment 33 Alejandro Sanchez 2016-09-16 01:47:55 MDT
(In reply to Martins Innus from comment #32)
> #SBATCH --nodes=2
> #SBATCH --ntasks-per-node=12
> #SBATCH --ntasks=24
> 
> # This inherits
> srun ./foo.exe

Slurm will default to filling the allocated resources unless told otherwise, so this first srun will launch 24 tasks across the 2 nodes.

> 
> # This needs all new params
> srun --nodes=2 --ntasks=12 --ntasks-per-node=6 ./bar.exe

If you want to consume less than what is allocated to the job, you have to tell Slurm explicitly, exactly as you do in this second srun.

> # End sbatch
> 
> 
> Thanks for the clarification.
> 
> Martins

No problem. If you don't have any more questions, let me know so we can close the ticket. Thanks.
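
A quick way to sanity-check a step's layout is to count tasks per node (hostnames below are hypothetical):

$ srun --nodes=2 --ntasks=12 --ntasks-per-node=6 hostname | sort | uniq -c
      6 node01
      6 node02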
Comment 34 Martins Innus 2016-09-16 05:37:59 MDT
OK thanks!  I will ask the researcher to resubmit his job and confirm it works.
Comment 35 Alejandro Sanchez 2016-10-04 03:24:12 MDT
Hi Martins, any progress with this? Thanks.
Comment 36 Alejandro Sanchez 2016-10-31 07:41:00 MDT
Marking as resolved/timed out. Please reopen if any issue is encountered once you have feedback from the customer.