Summary: | Magnetic reservation does not attract jobs if a job on the magnetic reservation node is in 'CG' state | ||
---|---|---|---|
Product: | Slurm | Reporter: | Bas van der Vlies <bas.vandervlies> |
Component: | Scheduling | Assignee: | Dominik Bartkiewicz <bart> |
Status: | RESOLVED INFOGIVEN | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | jaap.dijkshoorn |
Version: | 20.02.7 | ||
Hardware: | Linux | ||
OS: | Linux | ||
See Also: | https://bugs.schedmd.com/show_bug.cgi?id=12350 | ||
Site: | SURF | Alineos Sites: | --- |
Description
Bas van der Vlies
2021-08-27 06:40:15 MDT
This is triggered even with only one job on a node. As soon as that job enters the 'CG' (completing) state, other jobs are scheduled on another node. When we add '--reservation=magentic' on the command line, the job goes to the 'PD' state:

* 2739 shared submit.s bas PD 0:00 1 (Resources)

and waits until the job in the 'CG' state has finished. Then we can submit jobs again, and the node accepts them until another job enters the 'CG' state.

The question is: is 'CG' a blocking state for scheduling jobs on a node, and is there an option to override it?

Thanks

Bas

I read in the documentation pages that this is the expected behaviour and that there are options for it: CompleteWait and reduce_completing_frag. We have to reduce the time spent in the slurmd Epilog script. This issue can be closed.

Bas
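
For anyone landing on this report later, the two options named above might appear in slurm.conf roughly as follows. This is only a sketch: the CompleteWait value is an illustrative placeholder, not a setting taken from this site, and whether either option suits a given cluster depends on how long the Epilog runs and on how much fragmentation is acceptable.

```
# slurm.conf fragment (sketch; the value is an assumption, not from this report)

# Seconds to hold back scheduling of additional jobs while any job is COMPLETING.
CompleteWait=32

# Limit that hold-back to the partition(s) of the completing job
# rather than applying it to every partition.
SchedulerParameters=reduce_completing_frag
```

Neither option shortens the time a job spends in 'CG'; that still depends on the Epilog finishing quickly on the node.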