Summary: | Error getting jobs with sacct from dbd - DBD_GET_JOBS_COND | ||
---|---|---|---|
Product: | Slurm | Reporter: | hpc-ops |
Component: | Accounting | Assignee: | Patrick Wigger <patrick> |
Status: | OPEN --- | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | ||
Version: | 24.05.5 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | Ghent | Alineos Sites: | --- |
Linux Distro: | --- | Machine Name: | |
CLE Version: | Version Fixed: | ||
Target Release: | --- | DevPrio: | --- |
Description
hpc-ops
2025-01-16 02:35:26 MST
Hi Andy,

To debug this further, could you please:

1. Enable detailed logging by setting DebugFlags=DB_QUERY,DB_JOB in slurmdbd.conf.
2. Run the problematic sacct command again.
3. While it is hanging, run mysql> SHOW PROCESSLIST; to inspect database activity.
4. Attach the slurmdbd log covering the duration of the sacct run, then reset DebugFlags to avoid excess logging.

Does this behavior also occur with shorter time intervals specified by starttime and endtime?

Additionally, could you check the output of "sacctmgr show runawayjobs"? This lists any completed jobs that are missing an end time in the database.

Best,
Patrick
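The checks requested above could be sketched as a small dry-run script. It only prints the commands, so nothing is executed against slurmdbd or the database. The function names, the example dates, and the sacct options shown (--allusers, --allocations) are assumptions for illustration, since the ticket does not show the original sacct invocation; substitute the options of the command that actually hangs. The mysql line assumes client credentials are already configured.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the diagnostic commands rather than executing them,
# so it is safe to run on a host without Slurm or MySQL installed.

# Build a sacct command restricted to one time window.
# The options are illustrative (assumed), not the reporter's real command.
sacct_window_cmd() {
    local start="$1" end="$2"
    printf 'sacct --allusers --allocations --starttime=%s --endtime=%s\n' "$start" "$end"
}

# Print one sacct command per consecutive pair of window boundaries.
print_window_cmds() {
    local prev="$1"
    shift
    local next
    for next in "$@"; do
        sacct_window_cmd "$prev" "$next"
        prev="$next"
    done
}

# Setting to enable in slurmdbd.conf while reproducing (from the steps above).
echo '# slurmdbd.conf: DebugFlags=DB_QUERY,DB_JOB'

# Hypothetical boundaries: three one-day windows.
print_window_cmds 2025-01-13 2025-01-14 2025-01-15 2025-01-16

# Commands to run alongside the test, as named in the steps above.
echo 'sacctmgr show runawayjobs'
echo "mysql -e 'SHOW PROCESSLIST;'"
```

Running the printed sacct commands one window at a time may help show whether the hang is tied to a particular date range or only appears on long intervals.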