Ticket 21285 - having issues with new install of slurmdbd and slurmctld
Summary: having issues with new install of slurmdbd and slurmctld
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Accounting
Version: 24.05.3
Hardware: Linux Linux
: 6 - No support contract
Assignee: Jacob Jenson
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2024-10-28 08:33 MDT by Kelley McDonald
Modified: 2024-10-28 13:28 MDT

See Also:
Site: -Other-


Description Kelley McDonald 2024-10-28 08:33:33 MDT
Hi support.

I am getting the following records in slurmdbd.log:

Oct 28 07:24:07 panther.cchem.berkeley.edu slurmdbd[2132]: slurmdbd: error: mysql_real_connect failed: 1045 Access denied for user 'slurm'@'localhost' (using password: YES)
Oct 28 07:24:07 panther.cchem.berkeley.edu slurmdbd[2132]: slurmdbd: error: The database must be up when starting the MYSQL plugin.  Trying again in 5 seconds.
Oct 28 07:24:12 panther.cchem.berkeley.edu slurmdbd[2132]: slurmdbd: error: mysql_real_connect failed: 1045 Access denied for user 'slurm'@'localhost' (using password: YES)
Oct 28 07:24:12 panther.cchem.berkeley.edu slurmdbd[2132]: slurmdbd: error: The database must be up when starting the MYSQL plugin.  Trying again in 5 seconds.

Newbie here, trying to figure out what I haven't done to set up mariadb properly. slurmdbd seems to be running and mariadb is running, but when I try to start slurmctld, I get this:

Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: slurmdbd: Invalid message version=9728, type:DBD_NODE_STATE
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: no buffer given
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: slurmdbd: Invalid message version=9728, type:DBD_NODE_STATE
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: no buffer given
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: slurmdbd: Invalid message version=9728, type:DBD_NODE_STATE
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: no buffer given
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: Sending PersistInit msg: Connection refused
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: error: Sending PersistInit msg: Connection refused
Oct 28 07:25:04 panther.cchem.berkeley.edu slurmctld[7039]: slurmctld: fatal: Can not recover last_tres state, incompatible version, got 9728 need >= 9984 <= 10496, start with '-i' to ignore this.
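
For reference, the version numbers in the fatal message above (9728, 9984, 10496) can be decoded with a short sketch. It assumes that Slurm packs its protocol version as (major << 8) | minor, and that the major values 38 through 41 correspond to the 22.05 through 24.05 releases. Both the packing and the mapping table are assumptions taken from my reading of the Slurm headers, not from this ticket, and the function name is purely illustrative.

```python
# Sketch only: decode the state/protocol version numbers seen in the log.
# Assumption: versions are packed as (major << 8) | minor.
# Assumption: this major-number-to-release mapping (not stated in the ticket).
ASSUMED_RELEASES = {38: "22.05", 39: "23.02", 40: "23.11", 41: "24.05"}


def decode_version(version: int) -> tuple[int, int, str]:
    """Split a packed version into (major, minor, assumed release name)."""
    major = version >> 8      # high byte
    minor = version & 0xFF    # low byte
    return major, minor, ASSUMED_RELEASES.get(major, "unknown")


if __name__ == "__main__":
    # 9728 is what was found on disk; 9984..10496 is the accepted range.
    for packed in (9728, 9984, 10496):
        print(packed, decode_version(packed))
```

If those assumptions hold, the saved state was written by a 22.05-era slurmctld, which is one release older than the range a 24.05 controller accepts (23.02 through 24.05). That would explain why the controller refuses to recover last_tres unless it is started with '-i', as the message suggests.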