I had to report another issue for magnetic reservations, but it originated from:
* bug 12350, comment 6

And I got a response:
* bug 12350, comment 11

To summarize: I create a magnetic reservation on a 16-core node. The way to trigger the problem is not the test from the response, but the test I describe here:
* terminal 1: watch -n1 "squeue -u <username> | sort"
* terminal 2: submit 4 jobs --> these jobs are scheduled on the magnetic reservation node.
* terminal 1: check whether a job is in the "CG" state
* terminal 2: quickly submit another 4 jobs --> these jobs are scheduled on other nodes because of the "CG" state.

If there are no jobs in the "CG" state, I can just submit the jobs and they are scheduled on the reservation:
* terminal 2: submit 4 jobs
* wait 1 sec
* terminal 2: submit 4 jobs again
* All these jobs end up on the magnetic reservation node.

Regards,
Bas
This is even triggered with only 1 job on a node. As soon as that job is in the CG state, other jobs are scheduled on another node.
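The reproduction described above can be sketched as a small shell helper. This is only an illustration under assumptions: the function names `has_completing` and `submit_batch` and the job script `job.sh` are hypothetical, and the state column is assumed to come from `squeue -h -u "$USER" -o "%t"` (no header, compact state code only).

```shell
#!/bin/sh
# Sketch only: helper names and job.sh are hypothetical, not part of Slurm.

# Succeeds (exit 0) if any line on stdin has the compact job state "CG".
# Intended input: the output of `squeue -h -u "$USER" -o "%t"`.
has_completing() {
    awk '$1 == "CG" { found = 1 } END { exit !found }'
}

# Submit N copies of a job script with sbatch.
submit_batch() {
    n=$1
    script=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        sbatch "$script"
        i=$((i + 1))
    done
}

# Possible usage on a test cluster (not executed here):
#   submit_batch 4 job.sh
#   until squeue -h -u "$USER" -o "%t" | has_completing; do sleep 1; done
#   submit_batch 4 job.sh   # submitted while a job is completing
```

Submitting the second batch only once a job is seen in the "CG" state removes the timing guesswork from the manual two-terminal test.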
When we add '--reservation=magnetic' on the command line, the job goes to the 'PD' state:
* 2739 shared submit.s bas PD 0:00 1 (Resources)

and waits until the job in the 'CG' state has finished. Then we can submit jobs again, and the node accepts them until another job enters the 'CG' state.

The question is: is the 'CG' state a blocking state for scheduling jobs on a node, and is there an option to override it?

Thanks,
Bas
I read in the pages that this is the expected behaviour and that there are options for it: CompleteWait and reduce_completing_frag. We have to reduce the time spent in the slurmd Epilog script.

This issue can be closed.

Bas
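For reference, the two options mentioned above are set in slurm.conf. The fragment below is a sketch, not a recommendation: the value 10 is an arbitrary illustrative assumption, and any existing SchedulerParameters entries at a site would need to be kept in the same comma-separated list.

```
# slurm.conf fragment (illustrative values, assumed for this example)
# Seconds to hold off scheduling while a job is in the COMPLETING (CG) state;
# 0 means pending jobs are started as soon as possible.
CompleteWait=10
# Changes how scheduling is held back while jobs are completing,
# see the slurm.conf man page for the exact semantics.
SchedulerParameters=reduce_completing_frag
```

Shortening the Epilog script, as noted above, reduces the time jobs spend in the CG state in the first place.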