We have two kinds of nodes: snb = 16 cores and hsw = 24 cores. After the Slurm update from 14.11 to 15.08.8 (92ac0dcdbb78df968962acc5d006e1f3aeb6eb37), altering a job's constraints puts the job into a BadConstraints state. Slurm sets NumNodes so that the job can run only on nodes that have 24 cores.

Job submission:

    srun -C "hsw|snb" --mem-per-cpu=12000 -n 320 -p parallel --pty $SHELL

    NumNodes=21 NumCPUs=320 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=320,mem=3840000,node=1
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*

Changing the constraint:

    scontrol update JobId=8500942 Features=snb

Job state afterwards:

    scontrol show job 8500942

    JobId=8500942 JobName=bash
    JobState=PENDING Reason=BadConstraints Dependency=(null)
    Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0 DerivedExitCode=0:0
    RunTime=00:00:00 TimeLimit=00:05:00 TimeMin=N/A
    SubmitTime=2016-02-23T14:34:44 EligibleTime=2016-02-23T14:34:44
    StartTime=2016-02-23T14:35:15 EndTime=2016-02-23T14:35:15
    PreemptTime=None SuspendTime=None SecsPreSuspend=0
    Partition=parallel AllocNode:Sid=taito-login3:59995
    ReqNodeList=(null) ExcNodeList=(null)
    NodeList=(null) SchedNodeList=c[609-615,620-624,627-635,637-638]
    NumNodes=21 NumCPUs=320 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=320,mem=3840000,node=1
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
    MinCPUsNode=320 MinMemoryCPU=12000M MinTmpDiskNode=0
    Features=snb Gres=(null) Reservation=(null)
    Shared=OK Contiguous=0 Licenses=(null) Network=(null)
    Command=/bin/bash
    WorkDir=/homeappl/home/ttervo
    Power= SICP=0
*** Ticket 2478 has been marked as a duplicate of this ticket. ***
We're working on this. It's a regression in Slurm 15.08.6 and later that has shown up in a few different bugs this week. Any update to a job that was submitted without a -N (node count) incorrectly sets MinCPUsNode to the job's total CPU count. In your example it was set to 320, and I assume you don't have any nodes with that many CPUs available.
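To spot other pending jobs hit by the same symptom, something along the following lines might help. It is only a rough sketch, not part of Slurm: the helper names `parse_job_fields` and `min_cpus_exceeds_node` are made up for illustration, the sample text is a shortened copy of the `scontrol show job` output from this ticket, and the largest-node core count (24, the hsw nodes) has to be supplied by hand. It assumes the fields of interest contain no spaces in their values.

```python
def parse_job_fields(scontrol_output: str) -> dict:
    """Split 'scontrol show job' text into a key -> value dict.

    Assumption: tokens are whitespace-separated Key=Value pairs. Values
    that contain spaces (for example a Command with arguments) would be
    truncated, which does not matter for the fields checked here.
    """
    fields = {}
    for token in scontrol_output.split():
        # Split on the first '=' only, so TRES=cpu=320,... keeps its value.
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def min_cpus_exceeds_node(scontrol_output: str, largest_node_cores: int) -> bool:
    """Return True if MinCPUsNode is larger than any node could satisfy."""
    fields = parse_job_fields(scontrol_output)
    return int(fields["MinCPUsNode"]) > largest_node_cores


# Shortened copy of the job record from this ticket.
SAMPLE = (
    "JobId=8500942 JobName=bash JobState=PENDING Reason=BadConstraints "
    "NumNodes=21 NumCPUs=320 CPUs/Task=1 TRES=cpu=320,mem=3840000,node=1 "
    "MinCPUsNode=320 MinMemoryCPU=12000M Features=snb"
)

if __name__ == "__main__":
    # 24 is the core count of the largest node type (hsw) in this cluster.
    print(min_cpus_exceeds_node(SAMPLE, 24))
```

In practice the text would come from running `scontrol show job <jobid>` on the cluster; the check only flags the symptom and does not change the job.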
Hey Tommi, this is fixed in commits bd9fa8300b1 and de28c13a159d. Please reopen if they don't fix the problem for you.