Summary: | Floating Partitions and Pending Jobs | | |
---|---|---|---|
Product: | Slurm | Reporter: | Stephen Fralich <sjf4> |
Component: | Scheduling | Assignee: | Unassigned Developer <dev-unassigned> |
Status: | OPEN --- | QA Contact: | |
Severity: | 5 - Enhancement | | |
Priority: | --- | CC: | ahough, mattjay, sts |
Version: | 17.02.4 | | |
Hardware: | Linux | | |
OS: | Linux | | |
Site: | University of Washington | | |
Description
Stephen Fralich
2017-06-08 11:52:27 MDT
I can classify this as an enhancement request if you'd like; but as I'd indicated this is not a trivial issue, but rather a side-effect of the architecture of Slurm's preemption and prioritization model.

Unfortunately, I don't have a great workaround for the combination of preemption and floating partitions - as the floating partitions "cast a shadow" across all nodes in the lower priority partition, any pending job in the floating partition would cause this issue.

The only suggestions I can make at the moment are to allow the floating partitions to only "float a little bit" over a reduced node count, or look into creating a short-wall-time partition with the same PriorityTier as the condo partitions, but with a reduced PriorityJobFactor to ensure the owner jobs take precedence.

- Tim

Yes, an enhancement for sure, but it'll never happen if we don't ask, so this is us asking. I didn't know what the proper avenue was for making such a request.

Yeah, I'm probably going to have one floating partition where small node customers' nodes will all be pooled and then separate fixed partitions for large node customers or customers that queue a lot of work. I'll have to read up on PriorityJobFactor and then I'll have to think it over. We'll see how it goes.

Thanks for the reply and suggestions.

(In reply to Stephen Fralich from comment #2)
> Yes, an enhancement for sure, but it'll never happen if we don't ask, so
> this is us asking. I didn't know what the proper avenue was for making such
> a request.

Tagging appropriately.

> Yeah, I'm probably going to have one floating partition where small node
> customers' nodes will all be pooled and then separate fixed partitions for
> large node customers or customers that queue a lot of work. I'll have to
> read up on PriorityJobFactor and then I'll have to think it over. We'll see
> how it goes.
>
> Thanks for the reply and suggestions.

Certainly happy to help.
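
For readers following the thread, the second workaround suggested above (a short-wall-time partition at the same PriorityTier as the condo partitions, but with a lower PriorityJobFactor) might look roughly like the slurm.conf fragment below. This is only a sketch and is not taken from the site's actual configuration: the partition names (`ckpt`, `condo`, `short`), the node range, the QOS name `condo_limit`, and every numeric value are hypothetical. It also assumes the multifactor priority plugin with a non-zero partition weight, since PriorityJobFactor has no effect otherwise.

```
# Hypothetical sketch; all names and values are assumptions, not the site's config.

# Partition-priority preemption: higher PriorityTier preempts lower.
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

# PriorityJobFactor only contributes when the multifactor plugin is in use
# and the partition weight is non-zero.
PriorityType=priority/multifactor
PriorityWeightPartition=1000

# Lower-tier partition whose jobs may be preempted.
PartitionName=ckpt  Nodes=n[001-100] PriorityTier=1  PreemptMode=REQUEUE

# Floating condo partition: spans all nodes, with its footprint capped by a
# partition QOS (assumed to be created separately, for example with a
# GrpTRES node limit set through sacctmgr).
PartitionName=condo Nodes=n[001-100] PriorityTier=10 PriorityJobFactor=100 QOS=condo_limit

# Short-wall-time partition at the same tier as the condo partition, but with
# a reduced PriorityJobFactor so that owner jobs sort ahead of it.
PartitionName=short Nodes=n[001-100] PriorityTier=10 PriorityJobFactor=1 MaxTime=04:00:00
```

Whether this ordering is enough to keep owner jobs ahead in practice depends on the other priority weights configured at the site, so the values would need tuning against real workloads.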