Created attachment 7228: Main slurm.conf

We recently upgraded Slurm to version 17.11.7 to address the security issues. Since this upgrade, however, any attempt to allocate more memory per CPU than the default raises an error:

    $> srun -p interactive -N 1 --mem-per-cpu=8G --pty bash
    srun: error: Unable to allocate resources: Requested partition configuration not available now

The same error also appears in the logs of the slurmctld daemon:

    [2018-07-04T12:03:43.539] _slurm_rpc_allocate_resources: Requested partition configuration not available now

Note that using '--mem' seems to work. I have attached the main configuration files.

The problem is probably linked to the fact that this new release seems to enforce the maximum amount of memory per CPU:

    $> scontrol show config | grep -i mempercpu
    DefMemPerCPU = 4096
    MaxMemPerCPU = 4196

Any advice on how to correct this problem?
Hi Sebastien,

This is definitely a duplicate of bug 5240.

Historically, when a job requested more memory than the configured MaxMemPer* limit, Slurm made automatic adjustments to try to make the job request fit the limits, including "increasing cpus_per_task and decreasing mem_per_cpu by factor of X based upon mem_per_cpu limits" or "Setting job's pn_min_cpus to Y due to memory limit".

I (and some other people) don't like modifying what the user requested. If the requested memory exceeds the limit, I prefer that the job be rejected (based upon the EnforcePartLimits value at submit time) or left pending with reason MaxMemPerLimit. The problem is that this change in behavior should have been added to the master branch and documented, instead of being checked in to 17.11.7, where I unfortunately and incorrectly decided to land commit bf4cb0b1b01f3e165bf.

In bug 5240 comment 24 we decided to revert that change in commit d52d8f4f0ce1a5b86bb0691630da0dc3dace1683, and we added this commit on top of the revert to fix a problem with multi-partition requests: f07f53fc138b22485e7c26903968fa470cc9d98f. Both will be included in 17.11.8 and onwards, but they can also be applied at your earliest convenience. Appending ".patch" to the GitHub commit URL generates a patch-formatted document that can be applied if needed.

Please let me know if you have further questions. Thanks.

*** This ticket has been marked as a duplicate of ticket 5240 ***
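For illustration, the historical adjustment quoted above ("increasing cpus_per_task and decreasing mem_per_cpu by factor of X") can be sketched roughly as follows. This is a minimal Python sketch, not Slurm's actual code: the function name `adjust_request`, its parameters, and the ceiling-division rule used to pick the factor are assumptions made for the example. It uses the values from the original report (8G per CPU requested, MaxMemPerCPU = 4196).

```python
def adjust_request(mem_per_cpu_mb, cpus_per_task, max_mem_per_cpu_mb):
    """Hypothetical model of the automatic adjustment (not Slurm source code).

    If the per-CPU memory request exceeds the limit, scale cpus_per_task up
    by an integer factor and scale mem_per_cpu down by the same factor, so
    that the total memory per task is preserved (rounded up).
    """
    if mem_per_cpu_mb <= max_mem_per_cpu_mb:
        # Request already fits the limit: nothing to adjust.
        return cpus_per_task, mem_per_cpu_mb

    # Smallest integer factor that brings mem_per_cpu under the limit
    # (ceiling division, written with integers only).
    factor = -(-mem_per_cpu_mb // max_mem_per_cpu_mb)

    new_cpus_per_task = cpus_per_task * factor
    new_mem_per_cpu = -(-mem_per_cpu_mb // factor)  # ceiling division again
    return new_cpus_per_task, new_mem_per_cpu


# Values from the report: --mem-per-cpu=8G (8192 MB), MaxMemPerCPU = 4196 MB.
print(adjust_request(8192, 1, 4196))  # prints (2, 4096)
```

Under this model the 8G request would silently become 2 CPUs at 4096 MB each, which is the kind of change to the user's request that the 17.11.7 commit replaced with a rejection.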
Dear Alejandro,

Many thanks for the explanation.

May I still suggest adapting the error message in this context, as 'Requested partition configuration not available now' does not seem fully appropriate in this case.
(In reply to Sebastien Varrette from comment #2)
> Dear Alejandro,
>
> Many thanks for the explanation.
>
> May I still suggest adapting the error message in this context, as 'Requested
> partition configuration not available now' does not seem fully appropriate
> in this case.

With the two commits I suggested before, which are included from 17.11.8 onwards, the error message should be more specific:

    alex@ibiza:~/t$ sbatch --mem-per-cpu=860 --wrap "sleep 9999"
    sbatch: error: Batch job submission failed: Memory required by task is not available
    alex@ibiza:~/t$