Ticket 1048

Summary: AccountingStorageEnforce=safe ignores UsageFactor
Product: Slurm Reporter: Tom Payerle <payerle>
Component: Contributions Assignee: Jacob Jenson <jacob>
Status: RESOLVED TIMEDOUT QA Contact:
Severity: 6 - No support contract    
Priority: --- CC: jacob, kevin, payerle
Version: 14.03.6   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---
Attachments: Patch to fix UsageFactor

Description Tom Payerle 2014-08-18 08:41:01 MDT
It looks like when AccountingStorageEnforce=safe and slurmctld checks whether
sufficient time remains in GrpCPULimits, it does not account for UsageFactor.

In particular, we have a preemptible scavenger partition with a corresponding QoS which has UsageFactor=0.  GrpCPUMins is not set on the QoS, but there is a GrpCPUMins on the account being "charged".
 
It looks like a job submitted to this partition/QoS with num_tasks * time >
the account's GrpCPULimits - usage will pend with AssociationJobLimit.

This seems to be what the user is complaining about, and I see similar symptoms in
a test case.

If I look at acct_policy.c, starting around line 1666, the job_cpu_time_limit
(which looks to be just number of CPUs * time requested) and cpu_run_mins are summed
and compared to GrpCPULimits - usage, without checking whether there is a UsageFactor
on the QoS.
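
A minimal standalone sketch of the comparison as described above, for illustration only. It is not the code from acct_policy.c: the function name and parameters are hypothetical, the limit is assumed to be the account's GrpCPUMins, and the sketch only models the behavior as read from the source, where the job's CPU time limit and cpu_run_mins are summed and compared against what is left of the limit with no UsageFactor involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check as described in this ticket; not Slurm source.
 * Returns true if the job would be held (AssociationJobLimit) in this model.
 */
bool job_pends_unscaled(uint64_t grp_cpu_mins, uint64_t usage_mins,
                        uint64_t cpu_run_mins, uint32_t cpus,
                        uint32_t time_limit_mins)
{
    assert(cpus > 0); /* a job requests at least one CPU */

    /* What is left of the limit, clamped at zero. */
    uint64_t remaining =
        usage_mins >= grp_cpu_mins ? 0 : grp_cpu_mins - usage_mins;

    /* Number of CPUs * time requested, in CPU-minutes. */
    uint64_t job_cpu_time_limit = (uint64_t)cpus * time_limit_mins;

    /* No UsageFactor anywhere in the comparison. */
    return job_cpu_time_limit + cpu_run_mins > remaining;
}
```

With a 3000 minute limit and no prior usage, a 16-CPU, 200-minute job (3200 CPU-minutes) is held in this model regardless of the QoS UsageFactor, because the factor never enters the comparison.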

I think it should be simple enough to add that check, and multiply job_cpu_time
by UsageFactor if present.  I am not familiar enough with cpu_run_mins to know
if that already has UsageFactor included, and certainly not enough to add it if
it were missing.
Comment 1 Tom Payerle 2014-08-20 00:43:25 MDT
OK, based on some more observed behavior, I believe cpu_run_mins is also ignoring the UsageFactor.

The account in question has a 3000 minute limit, and almost no usage.  A user submitted jobs around or over 3000 core-minutes to the scavenger partition/QoS, and they were pending with AssociationJobLimit even though UsageFactor for this QoS is 0.  I manually upped the GrpCPUMins for the account (by a factor of 10 IIRC) temporarily, and the scavenger jobs started.  Even after 12+ hours, sshare is still showing almost no usage on the account (as expected, since UsageFactor=0 so the jobs should not charge against the account).

Now another user on the same account is attempting to run much smaller jobs against a "normal" partition/QoS which does charge the account.  These too are pending with AssociationJobLimit.  I believe this is because the AccountingStorageEnforce=safe check, which verifies that there are sufficient funds to complete the already running jobs plus the job we wish to start, is failing: the jobs in the scavenger partition are not getting the UsageFactor of 0 applied, and without that there are insufficient funds in the account to allow them to finish.  (The only jobs currently associated with the account in question are the scavenger partition jobs and the new user's small jobs, and only the scavenger partition jobs are in the running state.)
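
The suspected effect can be sketched with a small standalone model. This illustrates the hypothesis in this comment, not confirmed Slurm behavior: the struct, the function names, and the numbers in the usage note are all hypothetical, and the model simply assumes that the "safe" check adds the CPU-minutes still owed to running jobs to the new job's request and compares the total with the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for a running job; not a Slurm structure. */
struct running_job {
    unsigned cpus;        /* CPUs allocated to the job */
    unsigned mins_left;   /* minutes remaining until its time limit */
    double usage_factor;  /* UsageFactor of the job's QoS */
};

/* CPU-minutes still committed to running jobs, with or without UsageFactor. */
double committed_cpu_mins(const struct running_job *jobs, size_t n,
                          bool apply_factor)
{
    assert(jobs != NULL || n == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        double cpu_mins = (double)jobs[i].cpus * (double)jobs[i].mins_left;
        total += apply_factor ? cpu_mins * jobs[i].usage_factor : cpu_mins;
    }
    return total;
}

/* Assumed form of the "safe" check: everything must fit under the limit. */
bool safe_check_passes(double limit_mins, double usage_mins,
                       double committed_mins, double new_job_cpu_mins)
{
    return usage_mins + committed_mins + new_job_cpu_mins <= limit_mins;
}
```

As an illustration with made-up numbers: two running scavenger jobs with 16 CPUs and 1000 minutes left each commit 32000 CPU-minutes when UsageFactor is ignored, so a 60 CPU-minute job is blocked under a 30000 minute limit. With UsageFactor=0 applied they commit nothing, and the same small job passes.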
Comment 2 Danny Auble 2014-08-20 08:21:10 MDT
Your simple fix should work for GrpCPUMins, but GrpCPURunMins is a much more complicated issue since it would need to deal with UsageFactors that could change while jobs are running.
Comment 3 Jacob Jenson 2014-08-22 08:25:21 MDT
If you would like further assistance with this request please let me know and we can discuss if a support contract or custom development work would help you meet your objectives and resolve this issue.
Comment 4 Kevin Hildebrand 2014-09-23 01:38:20 MDT
Created attachment 1253
Patch to fix UsageFactor

I've developed a patch to account for the usage factor when starting a job.
From what I can tell, all of the tracked information in the qos and assoc structures is already correct and includes the usage factor.  The only place where there is an error is in the comparisons in acct_policy.c which do not take the usage factor into account.

Regarding the usage factor changing while jobs are running: is that really an issue here?  The accounting will still be correct and will accrue the proper usage if the factor changes mid-job.  The only thing we're concerned with here is whether or not a job is eligible to start, and the present usage factor will be used for that calculation.

My patch will skip the CPU time limit checks if the usage factor is zero, otherwise it will properly account for the usage factor.
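
The behavior described above might look roughly like the following sketch. The attached patch itself is not reproduced here, so this is only an assumed shape of the logic: the function name and parameters are hypothetical, and scaling the job's requested CPU-minutes by the usage factor is one plausible reading of "account for the usage factor".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical start-eligibility check; not the attached patch.
 * Returns true if the CPU-minutes limit should keep the job from starting.
 */
bool cpu_mins_limit_blocks_start(uint64_t grp_cpu_mins, uint64_t usage_mins,
                                 uint64_t job_cpu_time_limit,
                                 double usage_factor)
{
    assert(usage_factor >= 0.0);

    /* A usage factor of zero means the job is never charged: skip the check. */
    if (usage_factor == 0.0)
        return false;

    /* What is left of the limit, clamped at zero. */
    double remaining = usage_mins >= grp_cpu_mins
                           ? 0.0
                           : (double)(grp_cpu_mins - usage_mins);

    /* Assumption: the job is charged its requested time times the factor. */
    double charged = (double)job_cpu_time_limit * usage_factor;

    return charged > remaining;
}
```

Under these assumptions a 3200 CPU-minute job against a 3000 minute limit starts when the factor is 0 or 0.5, and is held when the factor is 1.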

The patch attached is for version 14.03.8.

Thanks,
Kevin
Comment 8 Jacob Jenson 2017-12-13 10:30:42 MST
Tom, 

The version of Slurm referenced in this ticket is very old, so we are timing out this ticket.

Jacob