Hi,

Using jobacct_gather/cgroup appears to scale the usage data reported by sacct by the number of tasks per node. For the following set of CPU-bound test jobs we would expect UserCPU to be roughly ntasks * walltime, but it ends up being roughly ntasks^2 * walltime:

    jobid    nodes x ntasks-per-node   usercpu      walltime
    6765457  1x1                       1:31:51      1:34:16
    6764424  1x2                       3:17:53      51:34
    6766943  1x8                       21:43:00     21:59
    6763262  1x16                      2-03:54:32   13:44

For example, the 1x2 job ran for 51:34 of walltime but reports 3:17:53 of UserCPU, which is close to 4x the walltime rather than the expected 2x.

This seems to be because the job accounting infrastructure expects accounting to be done per task (from slurmstepd/req.c):

    for (i = 0; i < job->node_tasks; i++) {
            temp_jobacct = jobacct_gather_stat_task(job->task[i]->pid);
            if (temp_jobacct) {
                    jobacctinfo_aggregate(jobacct, temp_jobacct);
                    jobacctinfo_destroy(temp_jobacct);
                    num_tasks++;
            }
    }

Each iteration ultimately calls jobacct_gather_cgroup.c:_prec_extra, which as far as I can tell returns accounting information for the whole step rather than for the individual task: with cgroup accounting all the task PIDs get lumped into a single step cgroup, with no differentiation between tasks for accounting purposes. Aggregating that step-wide figure once per task in the loop above is what causes the extra multiplication by the number of tasks.

Thanks for any help in solving this.

Martins
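To illustrate the failure mode, here is a minimal standalone sketch (not Slurm source; the cgroup path layout and the cpuacct_usage_ns helper are assumptions for illustration only) of what happens when a step-wide cpuacct counter is aggregated once per task:

    /* Minimal standalone sketch -- NOT Slurm source.  It reads a step-wide
     * cpuacct counter and adds it up once per task, the way the per-task
     * aggregation loop ends up doing when every task reports the same
     * step-level value.  The cgroup path below is an assumed example. */
    #include <stdio.h>
    #include <stdlib.h>

    /* Read total CPU time (nanoseconds) charged to the given cgroup. */
    static unsigned long long cpuacct_usage_ns(const char *cgroup_dir)
    {
        char path[4096];
        unsigned long long usage = 0;
        FILE *fp;

        snprintf(path, sizeof(path), "%s/cpuacct.usage", cgroup_dir);
        fp = fopen(path, "r");
        if (!fp)
            return 0;
        if (fscanf(fp, "%llu", &usage) != 1)
            usage = 0;
        fclose(fp);
        return usage;
    }

    int main(int argc, char **argv)
    {
        /* Assumed step cgroup path; adjust to your hierarchy. */
        const char *step_cg = (argc > 1) ? argv[1] :
            "/sys/fs/cgroup/cpuacct/slurm/uid_1000/job_42/step_0";
        int ntasks = (argc > 2) ? atoi(argv[2]) : 16;
        unsigned long long per_step = cpuacct_usage_ns(step_cg);
        unsigned long long aggregated = 0;

        /* Mirrors the per-task loop: the same step-wide number is added
         * once for every task, so the total is ntasks times too large. */
        for (int i = 0; i < ntasks; i++)
            aggregated += per_step;

        printf("step usage: %llu ns, aggregated over %d tasks: %llu ns\n",
               per_step, ntasks, aggregated);
        return 0;
    }

With 16 CPU-bound tasks the step already accumulates about ntasks * walltime of real CPU time, so adding that once per task gives the ntasks^2 * walltime pattern seen in the table above.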
I should also mention that I have applied attachment 4185 from https://bugs.schedmd.com/show_bug.cgi?id=3531 to get cgroup accounting working at all. Before applying that patch we saw the same "0" values for memory that are reported in that bug.
Yeah, that patch now doesn't seem like the right way to fix this. Sorry for the confusion. I'll do some more testing on a stock 17.02 and try to come up with a better bug report.
Hi

I will try to improve this patch or find another solution for bug 3531.

Dominik
OK, thanks! I don't have a complete handle on it yet, but my best guess is a race condition when running all of:

    JobAcctGatherType = jobacct_gather/cgroup
    ProctrackType     = proctrack/cgroup
    TaskPlugin        = task/cgroup

With stock 17.02.03, when running those plugins with multiple tasks on a node, some PIDs get put in the task cgroup and some get put in the step cgroup. I believe that is the root cause.

Martins
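One quick way to check where the PIDs actually land is to dump /proc/<pid>/cgroup for each task of a running step. A small standalone sketch (not Slurm code; the exact cgroup path naming depends on your hierarchy):

    /* Standalone sketch -- not Slurm code.  Prints the cgroup membership of
     * each PID given on the command line, so you can see whether a task's
     * PID ended up under a task-level cgroup or only under the step cgroup.
     * Usage: ./pid_cgroup <pid> [<pid> ...] */
    #include <stdio.h>

    static void print_cgroups(const char *pid)
    {
        char path[64], line[4096];
        FILE *fp;

        snprintf(path, sizeof(path), "/proc/%s/cgroup", pid);
        fp = fopen(path, "r");
        if (!fp) {
            perror(path);
            return;
        }
        printf("PID %s:\n", pid);
        /* Each line is hierarchy-id:controllers:cgroup-path.  With the
         * cgroup plugins the path should contain the job and step (and,
         * if per-task cgroups are created, a task component as well --
         * the exact naming is an assumption, check your own hierarchy). */
        while (fgets(line, sizeof(line), fp))
            printf("  %s", line);
        fclose(fp);
    }

    int main(int argc, char **argv)
    {
        for (int i = 1; i < argc; i++)
            print_cgroups(argv[i]);
        return 0;
    }

Running it against every task PID of a multi-task step should show whether some of them are attached only at the step level, as described above.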
This was solved with a different patch in 3531. *** This ticket has been marked as a duplicate of ticket 3531 ***
Great, thanks Danny!