Ticket 5283 - Behaviour of --mem-per-cpu with MaxMemPerCPU
Summary: Behaviour of --mem-per-cpu with MaxMemPerCPU
Status: RESOLVED DUPLICATE of ticket 5240
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 17.11.7
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Director of Support
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-06-08 06:04 MDT by Ciaron Linstead
Modified: 2018-06-08 06:08 MDT

See Also:
Site: PIK


Attachments
slurm.conf (3.44 KB, text/plain)
2018-06-08 06:04 MDT, Ciaron Linstead

Description Ciaron Linstead 2018-06-08 06:04:11 MDT
Created attachment 7052
slurm.conf

On 17.11.2 (and earlier), a typical use case for us was to set "#SBATCH --mem-per-cpu=32000" when submitting single-task jobs to get an allocation of 32GB RAM per task. (Our nodes have 16 CPU cores and 64GB RAM.)

Since upgrading to 17.11.7, setting --mem-per-cpu to anything over 3500 results in the job waiting with reason "(MaxMemPerLimit)".
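
For reference, a minimal batch script of the kind that now stays pending (the job name and partition match the job shown below; the program line is just a placeholder):

#!/bin/bash
#SBATCH --job-name=getting-started
#SBATCH --partition=standard
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=32000    # exceeds MaxMemPerCPU=3500, so the job pends with MaxMemPerLimit

srun ./my_program              # placeholder for the actual workload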

Excerpt from "scontrol show config"

DefMemPerCPU            = 3500
MaxMemPerCPU            = 3500
MemLimitEnforce         = Yes
SelectTypeParameters    = CR_CPU_MEMORY

Def- and MaxMemPerCPU are both set to roughly 1/16th of the node memory to avoid overbooking memory on the nodes.
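
For concreteness: the nodes have RealMemory=60000 and 16 cores (see the node listing further down), so 60000 / 16 = 3750 MB per core, and we set the limits slightly below that, which in slurm.conf corresponds to lines like:

DefMemPerCPU=3500
MaxMemPerCPU=3500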

The slurm.conf documentation (under "MaxMemPerCPU") states

"NOTE: If a job specifies a memory per CPU limit that exceeds this system limit, that job's count of CPUs per task will automatically be increased..."

My questions:

1) My reading of the documentation is that MaxMemPerCPU can be overridden, as we were doing before. Is this no longer the behaviour since 17.11.7? (Our workaround is to use --mem instead of --mem-per-cpu; a sketch follows below, after question 2.)

2) If we set MaxMemPerCPU=(node real memory), could the node memory be overbooked?
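
For completeness, the --mem workaround from question 1 looks like this (again, the program line is a placeholder):

#!/bin/bash
#SBATCH --job-name=getting-started
#SBATCH --partition=standard
#SBATCH --ntasks=1
#SBATCH --mem=32000    # total memory for the job (per node) instead of per CPU; this variant still schedules

srun ./my_program      # placeholder for the actual workload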


"scontrol show job" for a waiting job:

JobId=10017944 JobName=getting-started
   UserId=linstead(405) GroupId=users(100) MCS_label=N/A
   Priority=20946 Nice=0 Account=its QOS=short
   JobState=PENDING Reason=MaxMemPerLimit Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2018-06-08T13:49:35 EligibleTime=2018-06-08T13:49:35
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2018-06-08T13:49:35
   Partition=standard AllocNode:Sid=login01:42004
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=3501M,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=3501M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/linstead/cluster-examples/getting-started/slurm.sh
   WorkDir=/home/linstead/cluster-examples/getting-started
   StdErr=/home/linstead/cluster-examples/getting-started/myjob-10017944.out
   StdIn=/dev/null
   StdOut=/home/linstead/cluster-examples/getting-started/myjob-10017944.out
   Power=

Typical node configuration:

NodeName=cs-e14c01b01 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=13.24
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=cs-e14c01b01 NodeHostName=cs-e14c01b01 Version=17.11
   OS=Linux 4.4.103-92.56-default #1 SMP Wed Dec 27 16:24:31 UTC 2017 (2fd2155) 
   RealMemory=60000 AllocMem=56000 FreeMem=28551 Sockets=2 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=standard 
   BootTime=2018-01-17T10:58:46 SlurmdStartTime=2018-06-07T09:40:49
   CfgTRES=cpu=16,mem=60000M,billing=16
   AllocTRES=cpu=16,mem=56000M
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
Comment 1 Alejandro Sanchez 2018-06-08 06:08:51 MDT
We're working on restoring the previous behavior in bug 5240. I'm marking this as a duplicate of that one.

*** This ticket has been marked as a duplicate of ticket 5240 ***