Ticket 11373

Summary: sacct and slurmrestd gave different results
Product: Slurm
Reporter: Marco Induni <marco.induni>
Component: slurmdbd
Assignee: Marcin Stolarek <cinek>
Status: RESOLVED INFOGIVEN
Severity: 4 - Minor Issue
CC: nate, tim
Version: 20.11.4
Hardware: Linux
OS: Linux
Site: CSCS - Swiss National Supercomputing Centre

Description Marco Induni 2021-04-14 09:00:02 MDT
Dear support,
as reported in bug #11330, during some tests I discovered unexpected differences between the command

sacct -j [JOBID] 

and a REST API call

GET 'http://node:6821/slurmdb/v0.0.36/job/[JOBID]'

As I understand it, the main difference is that the REST API call adds some undocumented parameters (-D --whole-hetjob=yes), and this causes different output.

I think the two results should be the same, so a way to specify whether or not to use "-D --whole-hetjob=yes" needs to be added.



Thank you
Marco Induni
Comment 6 Marcin Stolarek 2021-04-30 01:59:59 MDT
Marco,

We had an internal discussion on that, which I'd like to summarise in a few clear points.

1) The responses you get from both `sacct -D --whole-hetjob=yes -j JOBID` and the corresponding slurmrestd endpoint are probably wrong. This may be due to a bug or database corruption. If you want us to work on that with you, please open a separate bug report (it's not related to slurmrestd).

2) JobId is not a unique identifier in Slurm accounting (JobId + Submit Time is). This is why `-D` is injected into the slurmrestd endpoint query, and we think it should remain like that. REST API users should be able to easily filter the result.

3) If you'd like us to implement additional boolean parameters like -D or --whole-hetjob in slurmdb/v../job/ endpoint this should be treated as a paid RFE. Please let us know if you want us to prepare an SOW for that.
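
To illustrate the client-side filtering mentioned in point 2), here is a minimal sketch, assuming the job records have already been extracted from the REST response into flat dictionaries. The field names `job_id` and `submit_time` and the sample data are hypothetical; the real slurmdb response nests these values differently, so the accessors would need adapting. It keeps only the most recently submitted record per JobId, which is roughly what `sacct -j` does when `-D` is not given.

```python
from typing import Any, Dict, Iterable, List


def latest_per_job_id(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep, for each job_id, only the record with the newest submit_time.

    Assumption: each record is a flat dict with hypothetical keys
    "job_id" (int) and "submit_time" (Unix timestamp).
    """
    newest: Dict[int, Dict[str, Any]] = {}
    for rec in records:
        current = newest.get(rec["job_id"])
        if current is None or rec["submit_time"] > current["submit_time"]:
            newest[rec["job_id"]] = rec
    # Sort by job_id so the output order is deterministic.
    return sorted(newest.values(), key=lambda r: r["job_id"])


# Made-up sample: job_id 100 appears twice because the JobId was reused.
records = [
    {"job_id": 100, "submit_time": 1600000000, "state": "COMPLETED"},
    {"job_id": 100, "submit_time": 1618000000, "state": "FAILED"},
    {"job_id": 101, "submit_time": 1618000500, "state": "COMPLETED"},
]

filtered = latest_per_job_id(records)
for rec in filtered:
    print(rec["job_id"], rec["submit_time"], rec["state"])
```

Keying on (JobId, Submit Time) instead of JobId alone would keep every distinct job, which matches the uniqueness rule in point 2).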


cheers,
Marcin
Comment 7 Marco Induni 2021-05-04 02:01:53 MDT
(In reply to Marcin Stolarek from comment #6)
Marcin,
 
> 1) The response you get for both `sacct -D --whole-hetjob=yes -j JOBID` and
> appropriate slurmrestd endpoint are probably wrong. It may be due to a bug
> or database corruption. If you want us to work on that with you please open
> an appropriate bug report (it's not related to slurmrestd).

Ok, I will open a ticket for this one

> 
> 2) JobId is not a unique identifier in Slurm accounting (JobId + Submit Time
> is). This is why `-D` is injected in slurmrestd endpoint query and we think
> it should remain like that. REST API users should be able to easily filter
> the result.

Fine for me

> 
> 3) If you'd like us to implement additional boolean parameters like -D or
> --whole-hetjob in slurmdb/v../job/ endpoint this should be treated as a paid
> RFE. Please let us know if you want us to prepare an SOW for that.
> 
> 
I think this will be resolved once point 1) is fixed.

Cheers,
Marco
Comment 8 Marcin Stolarek 2021-05-04 02:06:43 MDT
I'm closing the ticket then as information given.

Should you have any questions, please reopen.

cheers,
Marcin