Hi,

I set the MySQL password for the slurm user to PASSWORD. When running sbatch, Munge uses the word PASSWORD as its communication socket!

To make things work, I set the MySQL slurm user's password to "/var/run/munge/munge.socket.2" and set AccountingStoragePass to the same value. Everything works fine now; MySQL and Munge are both happy.

Regards,
Marc

Config:

    AccountingStorageType=accounting_storage/mysql
    AccountingStorageHost=msfdev-msf.AAA.dmz
    AccountingStorageEnforce=associations,limits,qos
    AccountingStorageUser=slurm
    AccountingStoragePass=PASSWORD

Command:

    sbatch -o /home/MSF/slurm/%j.log -e /home/MSF/slurm/%j.log -M msf -A transcription /home/MSF/slurm/test.sh

Output:

    sbatch: error: Munge encode failed: Failed to access "PASSWORD": No such file or directory (retrying ...)
    sbatch: error: Munge encode failed: Failed to access "PASSWORD": No such file or directory (retrying ...)
    sbatch: error: Munge encode failed: Failed to access "PASSWORD": No such file or directory
    sbatch: error: authentication: Socket communication error
    sbatch: error: Batch job submission failed: Protocol authentication error
Cannot reproduce. David
I am able to reproduce this exact condition. To get sacct to function, I have to set StoragePass=/var/run/munge/munge.socket.2 in slurmdbd.conf and AccountingStoragePass=/var/run/munge/munge.socket.2 in slurm.conf, and set the user password in MariaDB to the same value.

If I instead set these two values in the conf files to an actual password that matches the user password in MariaDB, I get the following error when I run sacct:

    lockhart-login1:~ # sacct
    sacct: error: If munged is up, restart with --num-threads=10
    sacct: error: Munge encode failed: Failed to access "PASSWORD": No such file or directory
    sacct: error: slurm_send_node_msg: g_slurm_auth_create: REQUEST_PERSIST_INIT has authentication error: Invalid authentication credential
    sacct: error: slurm_persist_conn_open: failed to send persistent connection init message to localhost:6819
    sacct: error: Sending PersistInit msg: Protocol authentication error
    sacct: error: Problem talking to the database: Protocol authentication error
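
The error messages in this thread suggest that the client hands the AccountingStoragePass value to Munge as a socket path. Under that assumption, a quick sanity check on a slurm.conf could look like the minimal Python sketch below. The function names and return codes are hypothetical and not part of Slurm, and the parsing assumes plain KEY=VALUE lines with # comments.

```python
import os
import re


def accounting_storage_pass(conf_text):
    """Return the AccountingStoragePass value found in slurm.conf text, or None."""
    for line in conf_text.splitlines():
        # Drop trailing comments (assumes '#' never appears inside the value).
        line = line.split("#", 1)[0].strip()
        match = re.match(r"AccountingStoragePass\s*=\s*(.*)$", line, re.IGNORECASE)
        if match:
            return match.group(1).strip().strip('"')
    return None


def check_storage_pass(conf_text):
    """Classify the value as 'unset', 'not-a-path', 'missing', or 'ok'.

    Assumption taken from the errors in this thread: the value is
    treated as a Munge socket path, so it should be an existing path.
    """
    value = accounting_storage_pass(conf_text)
    if value is None:
        return "unset"
    if not value.startswith("/"):
        return "not-a-path"
    if not os.path.exists(value):
        return "missing"
    return "ok"


if __name__ == "__main__":
    sample = "AccountingStorageUser=slurm\nAccountingStoragePass=PASSWORD\n"
    print(check_storage_pass(sample))
```

With the configuration from the first comment, the check reports "not-a-path", which matches the "Failed to access" errors shown above.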