Ticket 5283 - Behaviour of --mem-per-cpu with MaxMemPerCPU
Summary: Behaviour of --mem-per-cpu with MaxMemPerCPU
Status: RESOLVED DUPLICATE of ticket 5240
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 17.11.7
Hardware: Linux
Severity: 4 - Minor Issue
Assignee: Director of Support
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2018-06-08 06:04 MDT by Ciaron Linstead
Modified: 2018-06-08 06:08 MDT

See Also:
Site: PIK


Attachments
slurm.conf (3.44 KB, text/plain)
2018-06-08 06:04 MDT, Ciaron Linstead

Description Ciaron Linstead 2018-06-08 06:04:11 MDT
Created attachment 7052
slurm.conf

On 17.11.2 (and earlier), a typical use case for us was to set "#SBATCH --mem-per-cpu=32000" when submitting single-task jobs to get an allocation of 32GB RAM per task. (Our nodes have 16 CPU cores and 64GB RAM.)

Since upgrading to 17.11.7, setting --mem-per-cpu to anything over 3500 results in the job waiting with reason "(MaxMemPerLimit)".
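
For reference, a minimal batch script of the kind that now stays pending (the job name and partition match the job shown below; the program line is just a placeholder):

#!/bin/bash
#SBATCH --job-name=getting-started
#SBATCH --partition=standard
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=32000    # exceeds MaxMemPerCPU=3500, so the job pends with MaxMemPerLimit

srun ./my_program              # placeholder for the actual workload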

Excerpt from "scontrol show config"

DefMemPerCPU            = 3500
MaxMemPerCPU            = 3500
MemLimitEnforce         = Yes
SelectTypeParameters    = CR_CPU_MEMORY

Def- and MaxMemPerCPU are both set to roughly 1/16th of the node memory to avoid overbooking memory on the nodes.
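
For concreteness: the nodes have RealMemory=60000 and 16 cores (see the node listing further down), so 60000 / 16 = 3750 MB per core, and we set the limits slightly below that, which in slurm.conf corresponds to lines like:

DefMemPerCPU=3500
MaxMemPerCPU=3500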

The slurm.conf documentation (under "MaxMemPerCPU") states

"NOTE: If a job specifies a memory per CPU limit that exceeds this system limit, that job's count of CPUs per task will automatically be increased..."

My questions:

1) My reading of the documentation is that MaxMemPerCPU can be overridden, as we were doing before. Is this no longer the behaviour since 17.11.7? (Our workaround is to use --mem instead of --mem-per-cpu; a sketch follows below, after question 2.)

2) If we set MaxMemPerCPU=(node real memory), could the node memory be overbooked?
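
For completeness, the --mem workaround from question 1 looks like this (again, the program line is a placeholder):

#!/bin/bash
#SBATCH --job-name=getting-started
#SBATCH --partition=standard
#SBATCH --ntasks=1
#SBATCH --mem=32000    # total memory for the job (per node) instead of per CPU; this variant still schedules

srun ./my_program      # placeholder for the actual workload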


"scontrol show job" for a waiting job:

JobId=10017944 JobName=getting-started
   UserId=linstead(405) GroupId=users(100) MCS_label=N/A
   Priority=20946 Nice=0 Account=its QOS=short
   JobState=PENDING Reason=MaxMemPerLimit Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2018-06-08T13:49:35 EligibleTime=2018-06-08T13:49:35
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2018-06-08T13:49:35
   Partition=standard AllocNode:Sid=login01:42004
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=3501M,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryCPU=3501M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/linstead/cluster-examples/getting-started/slurm.sh
   WorkDir=/home/linstead/cluster-examples/getting-started
   StdErr=/home/linstead/cluster-examples/getting-started/myjob-10017944.out
   StdIn=/dev/null
   StdOut=/home/linstead/cluster-examples/getting-started/myjob-10017944.out
   Power=

Typical node configuration:

NodeName=cs-e14c01b01 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=16 CPUErr=0 CPUTot=16 CPULoad=13.24
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=cs-e14c01b01 NodeHostName=cs-e14c01b01 Version=17.11
   OS=Linux 4.4.103-92.56-default #1 SMP Wed Dec 27 16:24:31 UTC 2017 (2fd2155) 
   RealMemory=60000 AllocMem=56000 FreeMem=28551 Sockets=2 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=standard 
   BootTime=2018-01-17T10:58:46 SlurmdStartTime=2018-06-07T09:40:49
   CfgTRES=cpu=16,mem=60000M,billing=16
   AllocTRES=cpu=16,mem=56000M
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
Comment 1 Alejandro Sanchez 2018-06-08 06:08:51 MDT
We're working on restoring the previous behavior in bug 5240. I'm marking this as a duplicate of that one.

*** This ticket has been marked as a duplicate of ticket 5240 ***