Bug 3895

Summary: jobacct_gather/cgroup scales usage by tasks
Product: Slurm
Reporter: Martins Innus <minnus>
Component: Accounting
Assignee: Dominik Bartkiewicz <bart>
Status: RESOLVED DUPLICATE
QA Contact:
Severity: 4 - Minor Issue
Priority: ---
CC: da
Version: 17.02.3
Hardware: Linux
OS: Linux
Site: University of Buffalo (SUNY)

Description Martins Innus 2017-06-14 10:38:16 MDT
Hi,
  Using jobacct_gather/cgroup appears to scale the usage data reported by sacct by the number of tasks per node.  For the following set of CPU-bound test jobs, we'd expect usercpu to be roughly ntasks * walltime, but it ends up being roughly ntasks^2 * walltime:

jobid    nodes x ntasks-per-node  usercpu     walltime
6765457  1x1                      1:31:51     1:34:16
6764424  1x2                      3:17:53     51:34
6766943  1x8                      21:43:00    21:59
6763262  1x16                     2-03:54:32  13:44
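
As a quick sanity check on the numbers above, the ratio usercpu / walltime can be computed per job and compared against ntasks and ntasks^2. The sketch below is only an illustration: duration_to_seconds and cpu_to_wall_ratio are hypothetical helper names (not Slurm functions), and the parser assumes the time strings take the forms shown in the table (MM:SS, HH:MM:SS or D-HH:MM:SS).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert an sacct-style duration string
 * ("MM:SS", "HH:MM:SS" or "D-HH:MM:SS") to seconds.
 * Returns -1 if the string matches none of those forms. */
long duration_to_seconds(const char *s)
{
    long days = 0, a = 0, b = 0, c = 0;
    int n;

    assert(s != NULL);

    if (strchr(s, '-')) {
        /* Day-prefixed form: D-HH:MM:SS */
        if (sscanf(s, "%ld-%ld:%ld:%ld", &days, &a, &b, &c) != 4)
            return -1;
        return ((days * 24 + a) * 60 + b) * 60 + c;
    }

    n = sscanf(s, "%ld:%ld:%ld", &a, &b, &c);
    if (n == 3)                 /* HH:MM:SS */
        return (a * 60 + b) * 60 + c;
    if (n == 2)                 /* MM:SS */
        return a * 60 + b;
    return -1;
}

/* Hypothetical helper: accumulated user CPU time divided by wall time.
 * For a CPU-bound job this is expected to be close to ntasks.
 * Returns -1.0 if either string cannot be parsed or walltime is zero. */
double cpu_to_wall_ratio(const char *usercpu, const char *walltime)
{
    long cpu = duration_to_seconds(usercpu);
    long wall = duration_to_seconds(walltime);

    if (cpu < 0 || wall <= 0)
        return -1.0;
    return (double)cpu / (double)wall;
}
```

Calling cpu_to_wall_ratio on the four rows gives roughly 0.97, 3.8, 59 and 227 for 1, 2, 8 and 16 tasks, which tracks ntasks^2 (1, 4, 64, 256) much more closely than ntasks.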

This seems to be because the job accounting infrastructure expects accounting to be done per task (from slurmstepd/req.c):

        for (i = 0; i < job->node_tasks; i++) {
                temp_jobacct = jobacct_gather_stat_task(job->task[i]->pid);
                if (temp_jobacct) {
                        jobacctinfo_aggregate(jobacct, temp_jobacct);
                        jobacctinfo_destroy(temp_jobacct);
                        num_tasks++;
                }
        }

That loop ends up calling jobacct_gather_cgroup.c:_prec_extra which, as far as I can tell, returns accounting information for the whole step rather than for each task.  I think that is because, with cgroup accounting, all the task PIDs get lumped under a single step cgroup with no differentiation between tasks for accounting purposes.

The loop then adds up that step-wide figure once per task, which causes the extra multiplication by the number of tasks.
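
To make the suspected arithmetic concrete, here is a toy model of that aggregation. It is not Slurm code: aggregate_usage and the two stat helpers are hypothetical names, and the model simply assumes each CPU-bound task burns about one core for the whole walltime.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-task stat call that (as suspected above)
 * returns the CPU seconds of the whole step, whichever task is queried. */
static long step_wide_cpu_seconds(int ntasks, long wall_seconds)
{
    return (long)ntasks * wall_seconds;
}

/* Hypothetical stand-in for a stat call that returns only one task's share. */
static long per_task_cpu_seconds(long wall_seconds)
{
    return wall_seconds;
}

/* Mirrors the shape of the aggregation loop: one stat call per task,
 * with every result summed into the total. */
long aggregate_usage(int ntasks, long wall_seconds, int stat_is_step_wide)
{
    long total = 0;

    assert(ntasks > 0 && wall_seconds >= 0);

    for (int i = 0; i < ntasks; i++) {
        if (stat_is_step_wide)
            total += step_wide_cpu_seconds(ntasks, wall_seconds);
        else
            total += per_task_cpu_seconds(wall_seconds);
    }
    return total;
}
```

With 16 tasks and a walltime of 824 seconds (13:44), the per-task variant yields 13184 CPU seconds (ntasks * walltime), while the step-wide variant yields 210944 (ntasks^2 * walltime), which is the pattern seen in the 1x16 job.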

Thanks for any help in solving this.

Martins
Comment 1 Martins Innus 2017-06-14 12:05:00 MDT
And I should mention that I have applied attachment 4185 from:

https://bugs.schedmd.com/show_bug.cgi?id=3531

to get cgroup accounting working at all.  Before applying that patch, we saw the same "0" values as reported in that bug report for memory.
Comment 2 Martins Innus 2017-06-14 12:41:14 MDT
Yeah, that patch no longer seems like the right way to fix this.

Sorry for the confusion.  I'll do some more testing on a stock 17.02 and try to come up with a better bug report.
Comment 3 Dominik Bartkiewicz 2017-06-15 06:59:46 MDT
Hi

I will try to improve this patch or find another solution for bug 3531.

Dominik
Comment 4 Martins Innus 2017-06-15 07:05:41 MDT
OK, thanks!  I don’t have a complete handle on it yet.

But my best guess is a race condition when running all of:

JobAcctGatherType       = jobacct_gather/cgroup
ProctrackType           = proctrack/cgroup
TaskPlugin              = task/cgroup


With stock 17.02.3, when running those plugins with multiple tasks on a node, some PIDs get put in the task cgroup and some PIDs get put in the step cgroup.  I believe that is the root cause.
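
One way to check where a given PID landed is to look at the path listed for it in /proc/<pid>/cgroup and see whether it stops at the step level or continues down to a task level. The sketch below only classifies such a path string. It assumes a hierarchy of the form .../step_<id>/task_<id>, which is an assumption about the layout and not something confirmed in this report, and classify_cgroup_path is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>

enum cgroup_level { CG_OTHER = 0, CG_STEP, CG_TASK };

/* Hypothetical helper: decide whether a cgroup path ends at the step level
 * or descends into a per-task cgroup. Assumes directory names of the form
 * "step_<id>" and "task_<id>" (an assumption about the hierarchy layout). */
enum cgroup_level classify_cgroup_path(const char *path)
{
    const char *step;

    assert(path != NULL);

    step = strstr(path, "/step_");
    if (!step)
        return CG_OTHER;        /* not inside a step cgroup at all */
    if (strstr(step, "/task_"))
        return CG_TASK;         /* a task directory below the step */
    return CG_STEP;             /* stops at the step directory */
}
```

If the diagnosis above is right, the PIDs of a single multi-task step would give a mix of CG_STEP and CG_TASK results instead of all landing at the same level.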

Martins



Comment 8 Danny Auble 2017-07-19 15:35:01 MDT
This was solved with a different patch in 3531.

*** This bug has been marked as a duplicate of bug 3531 ***
Comment 9 Martins Innus 2017-07-19 16:20:46 MDT
Great, thanks Danny!
