Summary: | DefMemPerGPU is ignored by scheduler and is only used after the job has started | |
---|---|---|---|
Product: | Slurm | Reporter: | Martijn Kruiten <martijn.kruiten> |
Component: | Scheduling | Assignee: | Jacob Jenson <jacob> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | 6 - No support contract | | |
Priority: | --- | CC: | bas.vandervlies, cinek, jess, kaylea.nelson, martijn.kruiten, miguel.gila |
Version: | 20.11.4 | | |
Hardware: | Linux | | |
OS: | Linux | | |
Site: | -Other- | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA Site: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Linux Distro: | --- |
Machine Name: | | CLE Version: | |
Version Fixed: | 20.11.4 | Target Release: | --- |
DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Martijn Kruiten 2020-02-18 03:52:30 MST
Still present in Slurm 20.11.4 (and most probably 20.11.5). In short: if you submit a job with a DefMemPerGPU or --mem-per-gpu value, the scheduler will never consider it for an oversubscribed node where another job is currently running, even though it would fit. It acts as if 100% of the available node memory were requested. Only after the job has started is the allocated memory taken into account, permitting other jobs to run on that node (unless they also use DefMemPerGPU or --mem-per-gpu, of course).

When we tested this issue on our system (20.02.6), we noticed that cgroups also appear to be set to the DefMemPerCPU value (times the number of CPUs) instead of being based on the --mem-per-gpu flag. This turned out to be caused by our job_submit.lua script, which was using the older job_desc.gres instead of job_desc.tres_per_{job,task,socket,node}.
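The last point above can be illustrated with a minimal sketch of a job_submit/lua plugin. This is not the reporter's actual script; it only shows the shape of the fix they describe, i.e. detecting a GPU request through the newer job_desc.tres_per_* fields rather than the deprecated job_desc.gres. Field and function names follow the job_submit/lua plugin API; the logging message and the placeholder comment are assumptions to be adapted per site.

```lua
-- Sketch only: detect GPU requests via the newer TRES fields instead
-- of the deprecated job_desc.gres, as described in the report above.
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- A GPU request may arrive via any of these fields, depending on
    -- whether --gpus, --gpus-per-node, --gpus-per-task, or
    -- --gpus-per-socket was used.
    local gpu_request = job_desc.tres_per_job
                     or job_desc.tres_per_node
                     or job_desc.tres_per_task
                     or job_desc.tres_per_socket
    if gpu_request ~= nil and string.find(gpu_request, "gpu") then
        slurm.log_info("job requests GPUs via TRES: %s", gpu_request)
        -- Site-specific handling (e.g. setting a default memory
        -- request) would go here.
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```

A script that only inspects job_desc.gres would miss jobs submitted with the newer GPU options, which is how the DefMemPerCPU-based cgroup limit ended up being applied instead of one derived from --mem-per-gpu.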