Ticket 5897 - srun: Warning: can't honor --ntasks-per-node
Summary: srun: Warning: can't honor --ntasks-per-node
Status: RESOLVED INFOGIVEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 17.11.8
Hardware: Linux Linux
Severity: 4 - Minor Issue
Assignee: Director of Support
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-10-22 01:53 MDT by Thomas Klar
Modified: 2019-05-24 13:45 MDT
CC List: 4 users

See Also:
Site: Atos/Eviden Sites
Alineos Sites: ---
Atos/Eviden Sites: SURFsara
Confidential Site: ---
Coreweave sites: ---
Cray Sites: ---
DS9 clusters: ---
HPCnow Sites: ---
HPE Sites: ---
IBM Sites: ---
NOAA SIte: ---
OCF Sites: ---
Recursion Pharma Sites: ---
SFW Sites: ---
SNIC sites: ---
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---
Emory-Cloud Sites: ---


Description Thomas Klar 2018-10-22 01:53:39 MDT
When submitting a job with:

#SBATCH -n 720
#SBATCH --exclusive
#SBATCH --ntasks-per-node=24

This warning is printed:

srun: Warning: can't honor --ntasks-per-node set to 24 which doesn't match the requested tasks 30 with the number of requested nodes 30. Ignoring --ntasks-per-node.

This warning looks erroneous to me. First of all, it is simply not true: the job ends up running 24 tasks per node, which matches the node's core count.

When checking the source code, slurm/src/srun/libsrun/opt.c, I find:

	if (opt.ntasks > opt.ntasks_per_node)
		info("Warning: can't honor --ntasks-per-node "
		     "set to %u which doesn't match the "
		     "requested tasks %u with the number of "
		     "requested nodes %u. Ignoring "
		     "--ntasks-per-node.", opt.ntasks_per_node,
		     opt.ntasks, opt.min_nodes);
	opt.ntasks_per_node = NO_VAL;

As the output shows, opt.ntasks is 30, while opt.ntasks_per_node is 24. How the opt.ntasks variable gets set is less clear, but it apparently ends up holding the number of nodes:


	if (((opt.distribution & SLURM_DIST_STATE_BASE) ==
	     SLURM_DIST_ARBITRARY) && !opt.ntasks_set) {
		opt.ntasks = hostlist_count(hl);
		opt.ntasks_set = true;
	}


This effectively turns the condition into “if there are more nodes than tasks per node”, which is of course nonsensical and does not correspond to the warning text.

This confirms that the warning has no bearing on the number of tasks actually started per node. So this bug is nothing more than an erroneously printed warning; it does not affect functionality.
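To make the arithmetic concrete, here is a small shell sketch (illustrative only, not the actual Slurm code) using the values from this job:

	# Values from the job in question
	ntasks=720; min_nodes=30; ntasks_per_node=24

	# The check the warning text describes: do the requested tasks,
	# nodes and tasks-per-node actually contradict each other?
	if [ $(( ntasks_per_node * min_nodes )) -ne "$ntasks" ]; then
	    echo "inconsistent request: a warning would be justified"
	else
	    echo "consistent request (24 * 30 == 720): no warning expected"
	fi

	# What appears to happen instead: opt.ntasks ends up holding the node
	# count (30), so the test degenerates into "nodes > tasks-per-node".
	ntasks=30
	if [ "$ntasks" -gt "$ntasks_per_node" ]; then
	    echo "30 > 24: warning is printed even though the request is consistent"
	fi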
Comment 1 Michael Hinton 2018-10-22 17:59:19 MDT
Hi Thomas,

Are those the only arguments being passed to sbatch? Do you have the environment variable `SLURM_HOSTFILE` defined? Could you add `-v` the next time this is run, so we can see what the input parameters are?

From https://slurm.schedmd.com/sbatch.html under `--ntasks-per-node`: “If used with the --ntasks option, the --ntasks option will take precedence and the --ntasks-per-node will be treated as a maximum count of tasks per node. Meant to be used with the --nodes option.”

Also, under `-N, --nodes`: “If -N is not specified, the default behavior is to allocate enough nodes to satisfy the requirements of the -n and -c options”

Is there a reason you are specifying --ntasks-per-node=24 in the first place? It seems unnecessary, because each node has a maximum of 24 cores anyway.
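For comparison, a hypothetical sketch of the two ways the quoted documentation suggests expressing this request (30 nodes with 24 cores each assumed, as in this ticket; these are alternative script headers, not one script):

	# Alternative A (as in the original report): let -n drive the allocation;
	# --ntasks-per-node then only acts as a per-node maximum.
	#SBATCH -n 720
	#SBATCH --exclusive

	# Alternative B (the combination --ntasks-per-node is "meant to be used with"):
	#SBATCH --nodes=30
	#SBATCH --ntasks-per-node=24
	#SBATCH --exclusive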

> As the output shows, opt.ntasks is 30, while the opt.ntasks_per_node is 24.
> How the opt.ntasks variable is set, is not this clear, but it is apparently
> set to the number of nodes:
> 
> 	if (((opt.distribution & SLURM_DIST_STATE_BASE) ==
> 	    SLURM_DIST_ARBITRARY) && !opt.ntasks_set) {
> 		opt.ntasks = hostlist_count(hl);
> 		opt.ntasks_set = true;
> 	}
opt.ntasks can't be set here, or else you would never have seen the warning message, which is in the corresponding else branch. Plus, this code only runs if you specified `-m arbitrary`, which seems unlikely. I think it's taking a different code path.

These are just my first thoughts; I’ll keep looking into it.

-Michael
Comment 2 Thomas Klar 2018-10-23 01:50:12 MDT
Hello Michael,

There is no hostfile and no -N specified. The other options are the job name, time limit, and partition, none of which should have any bearing on this issue.

Even if the --ntasks-per-node option were superfluous in this scenario, the warning would still be wrong. For one thing, 30*30 is not 720; for another, the number of tasks eventually started per node is still 24.

I'm not sure where opt.ntasks gets set, but it does end up holding the number of nodes.

Thanks for looking into this.

Thomas
Comment 4 Michael Hinton 2018-10-23 18:16:02 MDT
I can't see how opt.ntasks can be set to 30 if it's already set to 720 via `-n 720`. If it isn't already set, then it can only be set this way if SBATCH_DISTRIBUTION=arbitrary or `-m arbitrary` is specified and either SLURM_HOSTFILE or `-w` is given with a valid nodelist. Very strange.
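For context, a hypothetical example of the code path described above (node names invented); per the opt.c fragment quoted in the description, opt.ntasks is only derived from a hostfile when the task count has not already been set and the distribution is arbitrary:

	# Hypothetical illustration of the arbitrary-distribution path.
	printf 'node01\nnode02\nnode03\n' > hosts.txt   # invented node names
	export SLURM_HOSTFILE=$PWD/hosts.txt
	# With -m arbitrary and no -n, the task count is taken from the
	# hostfile (3 entries here), via hostlist_count() in the quoted code.
	srun -m arbitrary hostname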

Could you please show the whole set of commands / the entire sbatch file being run that produces this error? Something isn't adding up. Thanks.
Comment 5 Michael Hinton 2018-10-24 09:42:15 MDT
Obviously, I can see that the warning is wrong in this instance, but I want to dig deeper to find the underlying cause and to reproduce the issue. Simply hiding the warning message because it doesn't make sense doesn't really help anybody.
Comment 6 S Senator 2018-12-19 11:46:24 MST
We have a similar use case exhibiting this behavior.

Specifically, when the sbatch script specifies:
#SBATCH --tasks-per-node=36
#SBATCH -N 4

and the individual sruns read:
 srun --ntasks-per-node=1 --overcommit --cpu_bind=none ...

Previously, we would get one task per node; presently, we see 36 tasks per node.
Our present work-around is to change all of these scripts to:
  srun --ntasks=<# hosts#> --overcommit --distribution=cyclic --cpu_bind=none ...

This change was introduced between slurm versions 17.02 and 17.11.
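For reference, a hedged sketch of the before/after pattern described in this comment (the application step name is invented):

	#SBATCH --tasks-per-node=36
	#SBATCH -N 4

	# Through 17.02 this launched one task per node (4 tasks total):
	srun --ntasks-per-node=1 --overcommit --cpu_bind=none ./pre_step

	# Work-around under 17.11: state the task count (= number of hosts, 4 here)
	# and the layout explicitly:
	srun --ntasks=4 --overcommit --distribution=cyclic --cpu_bind=none ./pre_step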
Comment 8 Michael Hinton 2019-01-25 17:31:20 MST
Hey Thomas, sorry for taking so long on this.

What you describe seems like a bug, but I haven't been able to reproduce it. Could you give me some more information about your system, like the slurm.conf and the output of `scontrol show nodes <relevant_node>`?

Does S Senator's comment help you?

Thanks,
Michael
Comment 9 Michael Hinton 2019-03-07 10:17:36 MST
Feel free to reopen if you want to continue looking into this.

Thanks!
Michael