Ticket 10195 - Power saving and node weight
Summary: Power saving and node weight
Status: RESOLVED DUPLICATE of ticket 9734
Alias: None
Product: Slurm
Classification: Unclassified
Component: slurmrestd
Version: 20.02.5
Hardware: Linux Linux
Importance: --- 4 - Minor Issue
Assignee: Director of Support
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2020-11-11 02:38 MST by OCF Support
Modified: 2020-11-11 10:12 MST

See Also:
Site: OCF
OCF Sites: Schlumberger
Linux Distro: ---
Machine Name:
CLE Version:
Version Fixed:
Target Release: ---
DevPrio: ---


Description OCF Support 2020-11-11 02:38:28 MST
The Slurm power saving feature has been turned on so that idle nodes are powered off when not in use. We have noticed that new jobs are scheduled to nodes that have recently been powered on, rather than to the nodes that should be preferred according to their node "Weight". It looks like bugs have already been filed in the Slurm bug tracker (9548 and 9734), currently marked as "under consideration".
Comment 1 OCF Support 2020-11-11 02:39:44 MST
Before the customer started using the power saving feature, all other things being equal, jobs would be scheduled to nodes[001-051] in preference to gnodes[001-016], because nodes[001-051] have the lower Weight.

After turning on the power saving feature, the Weight appears to be ignored. Everything else works as expected (e.g. specific feature or resource requests).
Comment 2 Nate Rini 2020-11-11 10:12:00 MST
I'm going to mark this as a duplicate:

(In reply to Broderick Gardner from bug#9734 comment #1)
> Yes, as it is currently designed, nodes in a powered-down state are in a
> lower tier altogether, making weights useless for cloud. I'm looking into
> what the scope for an enhancement would be in this regard.

Please take a look at bug#9734 for continuing updates.

*** This ticket has been marked as a duplicate of ticket 9734 ***
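
For readers following the duplicate, the tiering described in the quoted reply can be pictured with a small Python model. This is only an illustrative sketch, not Slurm's scheduler code: the `Node` class, the two ordering functions, and the Weight values (1 and 10) are all hypothetical, and the only behaviour taken from this ticket is that powered-down nodes rank below powered-up ones regardless of Weight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    weight: int          # hypothetical values; a lower Weight is preferred
    powered_down: bool   # True if power saving has switched the node off


def expected_order(nodes):
    """Order the reporter expected: lowest Weight first, power state ignored."""
    return sorted(nodes, key=lambda n: (n.weight, n.name))


def tiered_order(nodes):
    """Order as described in the quoted reply: powered-down nodes sit in a
    lower tier, so Weight only decides the order within a tier."""
    return sorted(nodes, key=lambda n: (n.powered_down, n.weight, n.name))


# Toy cluster: the low-Weight nodes are powered off, a high-Weight gnode is up.
nodes = [
    Node("node001", weight=1, powered_down=True),
    Node("node002", weight=1, powered_down=True),
    Node("gnode001", weight=10, powered_down=False),
]

print([n.name for n in expected_order(nodes)])  # ['node001', 'node002', 'gnode001']
print([n.name for n in tiered_order(nodes)])    # ['gnode001', 'node001', 'node002']
```

In this model the powered-up gnode wins even though its Weight is higher, which matches the behaviour reported above: Weight still matters, but only among nodes in the same power state.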