sbatch --time=30 -N 16384 -p plustre -o mpi-hw.16384.1.o%j --immediate mpi-hw.16384.1
Submitted batch job 2253

I had filled up the plustre queue beforehand, so the queue looked like this:

JOBID PARTITION NAME           USER    ST TIME NODES MIDPLANELIST(REASON)
2252  plustre   mpi-hw.16384.1 swltest PD 0:00 16K   (Resources)
2253  plustre   mpi-hw.16384.1 swltest PD 0:00 16K   (Priority)

The previous job (2252) is the Resources job, so 2253 could not run. Despite --immediate, it was queued as pending.
This appears to be a bug on all systems and has nothing to do with the bluegene plugin. I have set the options correctly to reflect that.
At first glance this looked like an easy issue to resolve, but after further investigation it is not as straightforward as you would think. Would it be sufficient to remove the immediate option from sbatch and its documentation rather than actually fix it? Let us know. It appears this has never worked.
If the immediate option is broken and would take effort to repair, I'm OK with eliminating it from salloc/sbatch/srun and purging all references to it from the man pages and HTML pages. We thought of it as a solution for a user who wanted to submit a large job and run it if it could start, but not add it to the queue if it would start to starve out smaller jobs.
The immediate option is only broken with sbatch, which is handled differently than salloc or srun. We are only proposing to remove the option from sbatch. We don't even think it makes sense to have it there. The user could use srun or salloc with the immediate option to get the same effect; the immediate option works just fine for those. We will remove the option from sbatch in 2.4. Please inform the user to use srun or salloc to accomplish the immediate request.
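For reference, here is a minimal sketch of the salloc-based workaround, reusing the node count, partition, and time limit from the original report. The function name run_immediate and the DRY_RUN switch are hypothetical conveniences, not part of Slurm; the only Slurm behavior relied on is the --immediate option of salloc. Running the original batch script as the salloc command is also an assumption: any #SBATCH directives inside it would not be read this way.

```shell
#!/bin/sh
# Hypothetical helper: request an allocation only if it can start right away.
# With --immediate, salloc gives up instead of leaving the request pending
# when the resources are not available.
run_immediate() {
    nodes=$1; partition=$2; minutes=$3; shift 3
    # DRY_RUN=1 (hypothetical switch) prints the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo salloc --immediate -N "$nodes" -p "$partition" --time="$minutes" "$@"
        return 0
    fi
    if salloc --immediate -N "$nodes" -p "$partition" --time="$minutes" "$@"; then
        echo "allocation granted and command finished"
    else
        # A non-zero status can also come from the command itself.
        echo "allocation not granted immediately, or the command failed" >&2
        return 1
    fi
}
```

Usage, printing the command that would be run for the job in the report:

DRY_RUN=1 run_immediate 16384 plustre 30 ./mpi-hw.16384.1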
I have added some support for this option, but also noted in the sbatch man page that the option has limited support. See commit: https://github.com/SchedMD/slurm/commit/e063642d555d938b07a2c4ac9fcde9a4c66dd1fb