Bug 7191

Summary: Bring cpu_bind back.
Product: Slurm
Reporter: Levi Morrison <levi_morrison>
Component: User Commands
Assignee: Tim Wickberg <tim>
Status: RESOLVED FIXED
QA Contact: Brian Christiansen <brian>
Severity: 3 - Medium Impact
Priority: ---
CC: griznog, hpc-admin, kaizaad, kilian, ryan_cox, sts
Version: 19.05.x
Hardware: Linux
OS: Linux
See Also: https://bugs.schedmd.com/show_bug.cgi?id=7311
Site: BYU - Brigham Young University
Version Fixed: 19.05.1
Attachments: Add --cpu_bind as an alias for --cpu-bind.

Description Levi Morrison 2019-06-06 11:50:15 MDT
As mentioned on the mailing list, removing the --cpu_bind option breaks essentially all OpenMPI releases on Slurm 19.05:

https://groups.google.com/d/msg/slurm-users/A0mKkveOT08/eM6bruJiBAAJ
Comment 1 Kilian Cavalotti 2019-06-06 11:54:19 MDT
Yes, please.
Comment 2 John Hanks 2019-06-06 11:58:37 MDT
+1 to revert this change.
Comment 3 Kaizaad 2019-06-06 13:13:37 MDT
+1
-k
Comment 6 hpc-admin 2019-06-07 02:41:02 MDT
This will impact our site as well. What was the rationale behind this decision?
Comment 7 Tim Wickberg 2019-06-07 06:02:39 MDT
Created attachment 10536
Add --cpu_bind as an alias for --cpu-bind.

The attached patch adds --cpu_bind back in as an alias of --cpu-bind for salloc/sbatch/srun.

Some variant of this patch - albeit with a warning message added in to note that --cpu-bind is the correct spelling - will be in 19.05.1 when released, and supported through the 19.05 release cycle.

As a few people have noted on the mailing list, we do encourage sites to use srun - ideally with PMI2 or PMIx - instead of mpiexec/mpirun, but we will work with OpenMPI to get this updated in their wrapper scripting.

- Tim
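
Editor's illustration (not the attached patch): the alias-plus-warning behavior described in comment 7 can be sketched with getopt_long by registering both spellings and routing them to the same handler. The option IDs, the function name parse_cpu_bind, and the warning text below are hypothetical and are not taken from the Slurm source.

```c
#include <getopt.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option IDs for this sketch; not Slurm's actual values. */
enum { OPT_CPU_BIND = 0x100, OPT_CPU_BIND_OLD };

static const struct option long_opts[] = {
    {"cpu-bind", required_argument, NULL, OPT_CPU_BIND},     /* preferred spelling */
    {"cpu_bind", required_argument, NULL, OPT_CPU_BIND_OLD}, /* legacy alias */
    {NULL, 0, NULL, 0}
};

/*
 * Scan argv for either spelling and return the binding value (or NULL if
 * neither option is present). *used_alias is set to 1 when the legacy
 * spelling was seen, and a warning is printed to stderr in that case.
 */
static const char *parse_cpu_bind(int argc, char **argv, int *used_alias)
{
    const char *value = NULL;
    int c;

    *used_alias = 0;
    optind = 1; /* restart scanning so the function can be called repeatedly */
    opterr = 0; /* stay quiet about options this sketch does not know */

    while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
        switch (c) {
        case OPT_CPU_BIND_OLD:
            *used_alias = 1;
            fprintf(stderr,
                    "warning: --cpu_bind is an alias; --cpu-bind is the correct spelling\n");
            /* fall through */
        case OPT_CPU_BIND:
            value = optarg; /* both spellings share one code path */
            break;
        default:
            break; /* ignore unrelated options */
        }
    }
    return value;
}
```

With this arrangement, a launcher that still passes --cpu_bind=cores gets the same value as one passing --cpu-bind=cores, plus a warning on stderr.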
Comment 8 Levi Morrison 2019-06-07 08:07:27 MDT
As noted on the mailing list, this is already fixed in OpenMPI's master branch. Hopefully it can get backported and new releases made.

Thanks for the patch -- going to test this out now.
Comment 12 Tim Wickberg 2019-07-09 16:05:04 MDT
I've added --cpu_bind back in as an alias. This will be included in 19.05.1 when released, and is functionally identical to the patch attached here.

commit d5cf91857e2c14ebd7c70c2eb2d71af43945a297
Author:     Tim Wickberg <tim@schedmd.com>
AuthorDate: Tue Jul 9 13:08:53 2019 -0600

    srun - restore --cpu_bind as an alias for --cpu-bind.
    
    Needed to maintain compatibility with OpenMPI's mpirun/mpiexec
    launch commands.
    
    Bug 7191.