Bug 1863 - RFE adjust job array task limit dynamically
Summary: RFE adjust job array task limit dynamically
Status: RESOLVED FIXED
Alias: None
Product: Slurm
Classification: Unclassified
Component: Scheduling
Version: 14.11.8
Hardware: Linux Linux
Importance: --- 5 - Enhancement
Assignee: Moe Jette
Reported: 2015-08-13 00:23 MDT by John Hanks
Modified: 2015-09-26 18:50 MDT
Site: KAUST
Version Fixed: 15.08.1

Description John Hanks 2015-08-13 00:23:15 MDT
Hello,

On a number of occasions we've had a task come up which was quickly solved by dumping a list of commands or filenames or whatever into a text file, then submitting an array job in which each task pulls out a single line based on its task ID and processes whatever is in that line. Submitting with % to limit the number of running tasks makes it pretty easy to throttle something that might otherwise have overwhelmed a fileserver or network connection. However, using % locks us into that limit for the life of the job. If there is a lull in usage of the limited resource by other jobs, it'd be nice to be able to increase or decrease the % value to control the number of running tasks and make the most of that resource.
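
For illustration, the workflow described above might look roughly like the batch script below. This is only a sketch: the file name commands.txt, the TASK_LIST variable, the pick_line helper, and the array range 1-100%5 are assumptions chosen for the example, not details from this report.

```shell
#!/bin/bash
#SBATCH --array=1-100%5   # assumed range: 100 tasks, at most 5 running at once

# Hypothetical input file with one command per line.
TASK_LIST="${TASK_LIST:-commands.txt}"

# Print line number $2 of file $1.
pick_line() {
    sed -n "${2}p" "$1"
}

# Only do work when Slurm has set the array task ID for this task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    cmd=$(pick_line "$TASK_LIST" "$SLURM_ARRAY_TASK_ID")
    echo "Task ${SLURM_ARRAY_TASK_ID}: ${cmd}"
    bash -c "$cmd"
fi
```

The %5 suffix is the limit that, as described, stays fixed for the life of the job once it has been submitted.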

My requested enhancement, then, is that ArrayTaskID be split into ArrayTaskID and ArrayTaskThrottle, with ArrayTaskThrottle being dynamically updatable with scontrol update.

Thanks,

jbh
Comment 1 Moe Jette 2015-08-13 02:44:15 MDT
That should be simple to do, but it's too late to get into version 15.08, at least the initial release.
Comment 2 Moe Jette 2015-09-25 04:38:41 MDT
As expected, this was very simple to fix.

Code/document update here:
https://github.com/SchedMD/slurm/commit/56b0ff1c5a85b78530ec8bd5172fb851ef2ca1aa

Test added here:
https://github.com/SchedMD/slurm/commit/84f8662f7d2a7ea7fd08f6d5f1d08b38251ff84e

Here's how it works:
Set/change limit:
scontrol update jobid=1933 arraytaskthrottle=1

Clear limit:
scontrol update jobid=1933 arraytaskthrottle=0
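
As a rough sketch of how the commands above could be wrapped for repeated use, the helper below only prints the scontrol command so it can be reviewed before being run. The function name throttle_cmd and the limit of 20 are hypothetical and chosen for the example; only the scontrol syntax comes from this comment.

```shell
#!/bin/bash
# Hypothetical helper: print the scontrol command for job $1 with limit $2.
# A limit of 0 clears the throttle, as described above.
throttle_cmd() {
    case "$2" in
        # Reject an empty limit or one containing non-digit characters.
        ''|*[!0-9]*) echo "limit must be a non-negative integer" >&2; return 1 ;;
    esac
    printf 'scontrol update jobid=%s arraytaskthrottle=%s\n' "$1" "$2"
}

# Example: show the command that would raise job 1933 to 20 running tasks.
throttle_cmd 1933 20
```

Printing rather than executing keeps the sketch safe to try on a machine without Slurm; the printed line can be run by hand once it looks right.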
Comment 3 John Hanks 2015-09-26 18:50:38 MDT
Moe,

This works brilliantly, thanks for putting it in there.

jbh