Ticket 17571 - srun option --cpu-bind=v differs from =verbose
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: User Commands
Version: 23.02.3
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Benjamin Witham
Reported: 2023-08-30 11:53 MDT by Nils Kanning
Modified: 2023-09-21 16:09 MDT
Site: DLR
Version Fixed: 23.02.6


Description Nils Kanning 2023-08-30 11:53:21 MDT
Hi,

On our SMT-enabled system, the srun options --cpu-bind=v and --cpu-bind=verbose result in different CPU bindings:

$ srun -v --cpu-bind=verbose --tasks-per-node=1 --cpus-per-task=4 --threads-per-core=1 hostname 
[...]
srun: CpuBindType=verbose,threads
cpu-bind=MASK - n0944, task  0  0 [344427]: mask 0xf set

$ srun -v --cpu-bind=v --tasks-per-node=1 --cpus-per-task=4 --threads-per-core=1 hostname 
[...]
srun: CpuBindType=verbose
cpu-bind=MASK - n0944, task  0  0 [344509]: mask 0xf0000000000000000000000000000000f set
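
The masks in the two outputs above are hexadecimal bitmasks, where (as we understand the srun verbose output) bit N stands for logical CPU N. The first mask sets only the four lowest bits; the second sets the four lowest bits plus four bits at a much higher index, which on an SMT-enabled node would be consistent with both hardware threads of each core being included, although the CPU numbering of the node is not stated here. A minimal sketch for decoding such a mask follows; `mask_to_cpus` is a hypothetical helper written for this illustration and is not part of Slurm.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a Slurm function): decode a hex CPU mask
 * such as "0xf" into a list of set bit indices.
 * Returns the number of CPUs written to cpus[], or -1 if the string
 * contains a non-hex character or more than max_cpus bits are set.
 */
int mask_to_cpus(const char *mask, int *cpus, int max_cpus)
{
	int count = 0;
	size_t len;

	/* Skip an optional "0x" / "0X" prefix. */
	if (!strncmp(mask, "0x", 2) || !strncmp(mask, "0X", 2))
		mask += 2;
	len = strlen(mask);

	/* Walk the digits starting from the least significant (rightmost). */
	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char) mask[len - 1 - i];
		int digit;

		if (!isxdigit(c))
			return -1;
		digit = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;

		/* Each hex digit covers four consecutive CPU indices. */
		for (int bit = 0; bit < 4; bit++) {
			if (!(digit & (1 << bit)))
				continue;
			if (count == max_cpus)
				return -1;
			cpus[count++] = (int) (i * 4) + bit;
		}
	}
	return count;
}
```

Calling `mask_to_cpus()` on each of the two masks with a sufficiently large `cpus` array lists the logical CPUs that the task was bound to in each case.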

The expected behavior for both options is that of --cpu-bind=verbose. We were able to reproduce the problem on two clusters running Slurm 23.02. Another system with version 21.08 behaves as expected.

We suspect that the issue might be related to commit https://github.com/SchedMD/slurm/commit/40a3bf3 which introduced the following change in slurm_opt.c: 

-	   (opt->srun_opt->cpu_bind_type == CPU_BIND_VERBOSE)) {
+		   !xstrcmp(opt->srun_opt->cpu_bind, "verbose")) {

This would be consistent with our observation that choosing, e.g., --cpu-bind=VeRbOsE leads to the same result as --cpu-bind=v.
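
To illustrate the suspicion, here is a simplified model of the two checks. It is only a sketch and not Slurm's actual code: `SKETCH_BIND_VERBOSE`, `sketch_parse_bind`, `only_verbose_by_flag` and `only_verbose_by_string` are hypothetical names, and the assumption that the option parser accepts "v" and "verbose" in any letter case is inferred from the behavior we observed.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* Hypothetical flag value for this sketch; not Slurm's real constant. */
#define SKETCH_BIND_VERBOSE 0x1u

/*
 * Hypothetical parser: ASSUMES that "v" and "verbose" are accepted in
 * any letter case and map to the same flag. Anything else yields 0.
 */
unsigned sketch_parse_bind(const char *arg)
{
	if (!strcasecmp(arg, "v") || !strcasecmp(arg, "verbose"))
		return SKETCH_BIND_VERBOSE;
	return 0;
}

/* Check on the parsed flags: independent of how the user spelled it. */
bool only_verbose_by_flag(const char *arg)
{
	return sketch_parse_bind(arg) == SKETCH_BIND_VERBOSE;
}

/* Check on the raw option string: matches the exact spelling only. */
bool only_verbose_by_string(const char *arg)
{
	return strcmp(arg, "verbose") == 0;
}
```

In this model, `only_verbose_by_flag()` is true for "v", "verbose" and "VeRbOsE", while `only_verbose_by_string()` is true for "verbose" only. If the code that adds the threads binding for --threads-per-core depends on such a string comparison, that would explain why only the exact spelling "verbose" yields CpuBindType=verbose,threads.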

Best regards,
Nils
Comment 1 Benjamin Witham 2023-09-01 15:28:41 MDT
Hello Nils, 

I can reproduce this, and I'm looking into it. I'll keep you updated.

Thank you, 

Benjamin Witham
Comment 4 Benjamin Witham 2023-09-19 15:37:35 MDT
Hello Nils,

This issue was caused by a regression introduced in commit 40a3bf3712, and it has been fixed in commit 76716bd80f. The fix will be included in the 23.02.6 release. I'll close this ticket now unless you have any other questions.