Ticket 11044

Summary: submit to multiple partitions with GRES specified with patch
Product: Slurm
Reporter: Bas van der Vlies <bas.vandervlies>
Component: Scheduling
Assignee: Tim Wickberg <tim>
Status: OPEN
Severity: C - Contributions
CC: dennis.stam, martijn.kruiten
Version: 20.11.4   
Hardware: Linux   
OS: Linux   
Site: SURF
Attachments:
 * multipartition gres fix
 * Also add patch for 20.02 version

Description Bas van der Vlies 2021-03-09 07:08:43 MST
Created attachment 18305 [details]
multipartition gres fix

At our site we have many partitions for the different CPU and GPU types. To make this easy for users, we have written a job_submit.lua script that submits a job to all CPU partitions or all GPU partitions (a sketch follows below). But this fails when we make use of a GRES specification.
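For illustration, a minimal sketch of such a routing script, assuming hypothetical partition names (our real script and partition list are more elaborate):

```lua
-- Minimal job_submit.lua sketch (hypothetical partition names).
-- When the user does not pick a partition, fan the job out over
-- every CPU partition; slurmctld then considers each in turn.
local cpu_partitions = "cpu_e5_2650_v1,cpu_e5_2650_v2"

function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.partition == nil then
        job_desc.partition = cpu_partitions
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```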

In this cluster we do not have GPUs, so I defined a GRES type:
 * cpu_type

We have defined two non-consumable, count-only GRES types for it:
 * e5_2650_v1
 * e5_2650_v2

And two partitions:
 * cpu_e5_2650_v1 --> 1 node
 * cpu_e5_2650_v2 --> 1 node
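
Putting this together, the setup looks roughly like the following excerpts (node names are hypothetical, and `no_consume` plus `Flags=CountOnly` is one way to express a non-consumable, count-only GRES):

```
# slurm.conf (excerpt, hypothetical node names)
GresTypes=cpu_type
NodeName=node1 Gres=cpu_type:e5_2650_v1:no_consume:1
NodeName=node2 Gres=cpu_type:e5_2650_v2:no_consume:1
PartitionName=cpu_e5_2650_v1 Nodes=node1
PartitionName=cpu_e5_2650_v2 Nodes=node2

# gres.conf (excerpt)
NodeName=node1 Name=cpu_type Type=e5_2650_v1 Count=1 Flags=CountOnly
NodeName=node2 Name=cpu_type Type=e5_2650_v2 Count=1 Flags=CountOnly
```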

The last partition checked is `cpu_e5_2650_v2`. This is important for this example.

Now we submit a job that requires the GRES `e5_2650_v1`:
 * srun --exclusive --gres=cpu_type:e5_2650_v1 --pty /bin/bash
 * a second job with the same GRES type fails with:
```
srun: error: Unable to allocate resources: Requested node configuration is not available
``` 

When we use the other GRES type `e5_2650_v2`, the second job is queued, as I would also expect for the example above. So the error code of the last partition checked determines the error code reported to the user.

When we use the GRES type `e5_2650_v1`, the last partition `cpu_e5_2650_v2` returns `ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE = 2014`, and that is what is returned to the user. But in the first partition the job could run; all nodes there are merely busy (`ESLURM_NODES_BUSY = 2016`), so the job would eventually start. We should return that state once we have examined all partitions.
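
In other words, after trying every partition, a retryable "busy" result should win over a permanent "configuration unavailable" one. A toy illustration of that selection rule, in Lua purely for brevity (this is not the attached patch, which changes Slurm's C code, just the rule it aims for):

```lua
-- Illustration only: choose which error to report after all
-- partitions have been examined. A partition where the job could
-- run once nodes free up (ESLURM_NODES_BUSY) takes precedence over
-- partitions where it can never run.
local ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE = 2014
local ESLURM_NODES_BUSY = 2016

local function pick_error(partition_errors)
    for _, rc in ipairs(partition_errors) do
        if rc == ESLURM_NODES_BUSY then
            return ESLURM_NODES_BUSY
        end
    end
    -- No partition can ever run the job: keep the last error seen.
    return partition_errors[#partition_errors]
end

-- cpu_e5_2650_v1 is busy, cpu_e5_2650_v2 cannot satisfy the request:
print(pick_error({ ESLURM_NODES_BUSY,
                   ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE }))  --> 2016
```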

The attached patch implements this behaviour. I do not know if this is the right approach.
Comment 1 Bas van der Vlies 2021-03-24 11:39:22 MDT
Created attachment 18629 [details]
Also add patch for 20.02 version

This is the multipartition fix for Slurm version 20.02. We are also using this version.
Comment 2 Bas van der Vlies 2021-09-10 04:57:12 MDT
Is there an update on this issue? Will it be addressed, or is there a fix in a newer Slurm version?