Ticket 804

Summary: jobacct_gather/cgroup path for first step in batch job.
Product: Slurm Reporter: bart
Component: Accounting Assignee: Danny Auble <da>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: bart, da
Version: 14.03.3   
Hardware: Linux   
OS: Linux   
Site: -Other- Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA Site: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: ---
Machine Name: CLE Version:
Version Fixed: 14.03.4 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description bart 2014-05-12 23:09:23 MDT
The cgroup for the first step in a batch job has the path:

slurm/uid_<uid>/job_<job_id>/step_4294967294


Comment 1 bart 2014-05-12 23:55:31 MDT
Fix based on task/cgroup/task_cgroup_cpuset.c:


diff jobacct_gather_cgroup_cpuacct.c  jobacct_gather_cgroup_cpuacct.c.old
171,179c171
<               if (stepid == NO_VAL) {
<                       if (snprintf(jobstep_cgroup_path, PATH_MAX, "%s/step_batch",
<                            job_cgroup_path) >= PATH_MAX) {
<                               error("jobacct_gather/cgroup: unable to build job step "
<                               "%u cpuacct cg relative path : %m", stepid);
<                               return SLURM_ERROR;
<                       }
<               } else {
<                       if (snprintf(jobstep_cgroup_path, PATH_MAX, "%s/step_%u",
---
>               if (snprintf(jobstep_cgroup_path, PATH_MAX, "%s/step_%u",
181,184c173,175
<                               error("jobacct_gather/cgroup: unable to build job step "
<                               "%u cpuacct cg relative path : %m", stepid);
<                               return SLURM_ERROR;
<                       }
---
>                       error("jobacct_gather/cgroup: unable to build job step "
>                             "%u cpuacct cg relative path : %m", stepid);
>                       return SLURM_ERROR;
188d178
<
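
For reference, the step number 4294967294 in the reported path is 0xfffffffe, which is the value the patch tests for as NO_VAL. The following is only a minimal standalone sketch of the same branching, not the code in jobacct_gather_cgroup_cpuacct.c: the helper name build_step_cgroup_path and the macro SKETCH_NO_VAL are hypothetical, and the sentinel is defined locally on the assumption that it matches Slurm's NO_VAL.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: same value as Slurm's NO_VAL (0xfffffffe == 4294967294). */
#define SKETCH_NO_VAL 0xfffffffeU

/*
 * Hypothetical helper, not part of Slurm.
 * Writes "<job_cgroup_path>/step_batch" for the batch step and
 * "<job_cgroup_path>/step_<stepid>" for any other step.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int build_step_cgroup_path(char *buf, size_t len,
                           const char *job_cgroup_path, uint32_t stepid)
{
	int n;

	if (stepid == SKETCH_NO_VAL)
		n = snprintf(buf, len, "%s/step_batch", job_cgroup_path);
	else
		n = snprintf(buf, len, "%s/step_%u", job_cgroup_path,
			     (unsigned int) stepid);

	/* snprintf returns the untruncated length; >= len means truncation. */
	if (n < 0 || (size_t) n >= len)
		return -1;
	return 0;
}
```

Called with a job path such as slurm/uid_<uid>/job_<job_id> and stepid equal to the sentinel, the helper yields a path ending in step_batch instead of step_4294967294; any other step id keeps the numeric step_<n> form.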
Comment 2 Danny Auble 2014-05-13 05:09:14 MDT
Thanks, this is fixed in commit c57282943523bf7d786a2523c1743168913ff50a