Summary: | sacctmgr -i delete user where name=... takes 30+ minutes (never visibly completes) | ||
---|---|---|---|
Product: | Slurm | Reporter: | Adam <asa188> |
Component: | Database | Assignee: | Chad Vizino <chad> |
Status: | OPEN --- | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | kaizaad, nathan.wielenga |
Version: | 24.11.0 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | Simon Fraser University | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA SIte: | --- | NoveTech Sites: | --- |
Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Tzag Elita Sites: | --- |
Linux Distro: | CentOS | Machine Name: | |
CLE Version: | Version Fixed: | ||
Target Release: | --- | DevPrio: | --- |
Emory-Cloud Sites: | --- |
Description
Adam
2025-01-30 13:58:44 MST
A simple way I also tested it was with a new table containing only the fields needed.

create table test_table (
  id_assoc int(10) unsigned primary key,
  user tinytext,
  acct tinytext,
  lineage text,
  is_def tinyint(4),
  deleted tinyint(4),
  index(user,acct)
);

insert into test_table
  select (seq) id_assoc,
         concat('myname', (seq)) user,
         concat('myacct', (seq)) acct,
         concat('mylineage', (seq)) lineage,
         1 is_def,
         0 deleted
  from seq_1_to_40000;

Hi. That's good information, especially the test--thanks. I agree that the query is not constructed well. Another site has also identified this issue and I have an initial patch for it that I'm testing. I'll supply more once I've finished that step.

Hi,

I'm curious if any progress has been made on this. Might it make it into 24.11.3?

We have another site that upgraded to 24.11.x recently and they have 180,000 records in their assoc table, so they're unable to do deletions now as well.

Thanks,
Adam

(In reply to Adam from comment #4)
> I'm curious if any progress has been made on this. Might it make it into
> 24.11.3?
>
> We have another site that upgraded to 24.11.x recently and they have 180,000
> records in their assoc table, so they're unable to do deletions now as well.

Hi. I've been out of the office for a bit and am catching up. I have a patch together for this, but 24.11.3 needed to be pushed out quickly and I wasn't ready in time. The patch is a priority for me now and I will let you know when we have it committed.

(In reply to Chad Vizino from comment #5)
> (In reply to Adam from comment #4)
> > I'm curious if any progress has been made on this. Might it make it into
> > 24.11.3?
> >
> > We have another site that upgraded to 24.11.x recently and they have 180,000
> > records in their assoc table, so they're unable to do deletions now as well.
> Hi. I've been out of the office for a bit and am catching up. I have a patch
> together for this, but 24.11.3 needed to be pushed out quickly and I wasn't
> ready in time. The patch is a priority for me now and I will let you know when
> we have it committed.

Hi Chad,

Any progress on this? By the time we have any movement on these SQL queries, we'll be at 24.11.6, it seems.

This query is just one of a few. The startup time of SlurmDB is also very long due to other questionable queries during the startup process, but this specific query is of more concern.

Thanks,
Adam

(In reply to Adam from comment #6)
> Any progress on this? By the time we have any movement on these SQL queries,
> we'll be at 24.11.6, it seems.
>
> This query is just one of a few. The startup time of SlurmDB is also very
> long due to other questionable queries during the startup process, but this
> specific query is of more concern.

Hi Adam. Sorry this has been delayed. As you point out, there are other long queries--we are looking at one in particular (it's in _get_user_coords()) that is delaying startup for sites with larger assoc tables. That query also does a 2-way join on the assoc table, and we are eliminating the join so that it's much faster (the change is being internally reviewed).

As I mentioned earlier, I have a fix for the delete issue, but it's still not quite ready for internal review and may change a bit depending on how things go with this other one. So, hopefully it won't be too much longer until the delete issue can get some attention and possibly benefit from what we've learned. I'll give an update soon, once we are certain of the direction we are taking, since it is an important issue too, especially when the assoc table is larger.
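For anyone who wants to experiment with the "eliminate the 2-way join" idea without touching a live slurmdbd database, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for MariaDB and rebuilds a smaller copy of the test table from the description. The slow query itself is not quoted in this thread, so both lookups below are hypothetical stand-ins (a lineage-prefix self-join and a join-free equivalent), not the actual Slurm queries or the pending patch. The helper names are made up for illustration.

```python
import sqlite3


def build_test_table(conn, rows):
    """Create a table shaped like the reporter's test_table (SQLite types)."""
    conn.execute(
        "CREATE TABLE test_table ("
        " id_assoc INTEGER PRIMARY KEY,"
        " user TEXT, acct TEXT, lineage TEXT,"
        " is_def INTEGER, deleted INTEGER)"
    )
    conn.execute("CREATE INDEX user_acct_idx ON test_table (user, acct)")
    # Same naming pattern as the original insert ... from seq_1_to_40000.
    conn.executemany(
        "INSERT INTO test_table VALUES (?, ?, ?, ?, 1, 0)",
        (
            (i, f"myname{i}", f"myacct{i}", f"mylineage{i}")
            for i in range(1, rows + 1)
        ),
    )


def ids_with_join(conn, user):
    """Hypothetical 2-way self-join: rows whose lineage starts with a
    lineage belonging to the given user."""
    sql = (
        "SELECT DISTINCT t1.id_assoc FROM test_table AS t1 "
        "JOIN test_table AS t2 ON t1.lineage LIKE t2.lineage || '%' "
        "WHERE t2.user = ? AND t1.deleted = 0 AND t2.deleted = 0 "
        "ORDER BY t1.id_assoc"
    )
    return [row[0] for row in conn.execute(sql, (user,))]


def ids_without_join(conn, user):
    """Same result as ids_with_join, using two single-table queries."""
    lineages = [
        row[0]
        for row in conn.execute(
            "SELECT lineage FROM test_table WHERE user = ? AND deleted = 0",
            (user,),
        )
    ]
    ids = set()
    for lineage in lineages:
        ids.update(
            row[0]
            for row in conn.execute(
                "SELECT id_assoc FROM test_table "
                "WHERE lineage LIKE ? AND deleted = 0",
                (lineage + "%",),
            )
        )
    return sorted(ids)


conn = sqlite3.connect(":memory:")
build_test_table(conn, 2000)  # smaller than the 40000 rows in the report
joined = ids_with_join(conn, "myname1")
unjoined = ids_without_join(conn, "myname1")
print(len(joined), joined == unjoined)
```

The sketch only checks that the two forms return the same rows. Whether dropping the join is faster depends on the database engine and the plan it picks, so any timing comparison should be run on MariaDB against a table of realistic size.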