With Slurm 20.11:

> $ salloc srun grep Cpus_allowed_list /proc/self/status
> Cpus_allowed_list:   0,40
> $ srun grep Cpus_allowed_list /proc/self/status
> Cpus_allowed_list:   0-79

In the first case, the srun gets the new default '--exclusive' behavior, as discussed in bug#10383. But in the second case, it doesn't.

I wonder if this is because the '--exclusive' flag is overloaded - it means something different for sbatch/salloc than it does for srun.
This is probably only relevant for partitions with OverSubscribe=EXCLUSIVE.
Hi Luke,

(In reply to Luke Yeager from comment #0)
> With Slurm 20.11:
> > $ salloc srun grep Cpus_allowed_list /proc/self/status
> > Cpus_allowed_list:   0,40
> > $ srun grep Cpus_allowed_list /proc/self/status
> > Cpus_allowed_list:   0-79
> In the first case, the srun is getting the new default behavior of
> '--exclusive', as discussed in bug#10383. But in the second case, it isn't.
>
> I wonder if this is because the '--exclusive' flag is overloaded - it means
> something different for sbatch/salloc than it does for srun.

Regarding the second case: it's because --exclusive/OverSubscribe=EXCLUSIVE mean something different when srun itself is *making the allocation* vs. when srun is running *within an existing allocation* created by salloc/sbatch.

When srun is by itself on the command line, it makes an implicit allocation, and only one step is subsequently created (the number of steps matches the number of `srun` invocations). So it makes sense that the sole step gets everything in the allocation by default, because there is no other step for it to share with. (In effect, --whole is implied here.) If the step weren't given everything in the allocation, how would you indicate what the step *should* get?

So I think this isn't a bug, and it is behaving how we want it to behave.

Thanks,
-Michael
(In reply to Michael Hinton from comment #2)
> Regarding the second case: It's because --exclusive/OverSubscribe=exclusive
> mean something different when srun itself is *making the allocation* vs.
> when srun is running *within an existing allocation* created by
> salloc/sbatch.
Yes, I understand this. I just think the choice of nomenclature is unfortunate.

> So I think this isn't a bug, and is behaving how we want it to behave.
I would assign a pretty high astonishment factor to this discrepancy. I understood pretty quickly what was going on because I've been in the weeds with this exclusive/whole/overlap stuff this week, and because I immediately looked at how many cores I had access to. But there are bound to be some users who can't figure out why their application runs slower with 'salloc srun' than with 'srun'.

There will also be users confused about why this hangs (because it didn't hang for them on 20.02):

> (login_node) $ srun --pty bash
> (within_allocation) $ srun hostname

It's going to take them a while to figure out that they need the --overlap flag for the second srun. (I realize this isn't related to the current bug report, but I'm trying to drive the point home that the changes being discussed in bug#10383 have some pretty far-reaching and surprising implications.)

If all this is indeed intended behavior, then I guess I'll close this as INFOGIVEN.
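For anyone who hits the same hang on 20.11: the nested srun is waiting for resources held by the step running the shell, and --overlap lets it share them. A sketch of the workaround (output elided; exact behavior depends on site configuration):

> (login_node) $ srun --pty bash
> (within_allocation) $ srun --overlap hostname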
(In reply to Luke Yeager from comment #3)
> I would assign a pretty high astonishment factor to this discrepancy. I
> understood pretty quickly what was going on because I've been in the weeds
> with this exclusive/whole/overlap stuff this week, and because I immediately
> looked at how many cores I had access to. But there are bound to be some
> users who can't figure out why their application runs slower with 'salloc
> srun' vs. with 'srun'.
>
> There will also be users confused why this hangs (because it didn't hang for
> them on 20.02):
>
> > (login_node) $ srun --pty bash
> > (within_allocation) $ srun hostname
> It's going to take them a while to figure out they need the --overlap flag
> for the second srun.

Here is what we recommend, and I think it is what you are looking for. Set this in your slurm.conf (new in 20.11):

LaunchParameters=use_interactive_step

Then educate your users to use `salloc` instead of `srun --pty bash` or `salloc srun --pty bash`.

What use_interactive_step does is make salloc create an "interactive step" for the pty shell on a node in the allocation. This step is analogous to the batch step used by sbatch, so you can think of it as an interactive batch script. Just like a batch step, the interactive step has access to the entire allocation but does NOT block regular srun steps (i.e., it behaves as if it has an implicit --overlap).
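With that setting in place, the interactive workflow becomes a plain salloc, and nested sruns no longer need --overlap because the shell lives in the interactive step rather than a regular step. A sketch (job ID and output are illustrative placeholders):

> (login_node) $ salloc
> salloc: Granted job allocation ...
> (within_allocation) $ srun hostname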
I think this is similar to what people expect:

(login_node) $ salloc --exclusive
salloc: Granted job allocation 438
(within_allocation) $ grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list:   0-5
(within_allocation) $ srun grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list:   0
(within_allocation) $ srun --whole grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list:   0-5

See https://slurm.schedmd.com/faq.html#prompt and https://slurm.schedmd.com/slurm.conf.html#OPT_use_interactive_step. Note that LaunchParameters=use_interactive_step replaces DefaultSallocCommand.

Hopefully that helps,
-Michael
That doesn't really address my overall concerns about the new behavior - it just weakens the trivial example used to open this bug. Nonetheless, that's a neat new option that I had missed - thanks for sharing! I like making salloc behave more like sbatch.