Hello, I'm continuing to set up the cori slurm instances. The normal NERSC hostname/networking convention is to have a hostname, e.g., cori06, returned by gethostname(), and then separate DNS entries/hostnames for each network interface, e.g., cori06-ib and cori06-244. This means that, to get the routes between the various slurm daemons correct, I am making heavy use of the Addr configuration options (ControlAddr, HostAddr, etc.). In this context, I noticed that DbdBackupHost exists, but DbdBackupAddr does not. For routing traffic from the inside of cori to the outside (where slurmdbd runs), DbdBackupAddr will be necessary unless I do something ugly to the configuration files. For other instances, however, not specifying it will cause slurm traffic to cross over interfaces I'd prefer it not to. Thus, my question: is DbdBackupAddr missing, or am I missing something? Thanks, Doug
You'll want to take a look at Bug 1921. From Comment 6: "A DbdBackupHostAddr is not needed. DbdBackupHost is only used to verify the host that the backup is running on. The backup uses DbdAddr to communicate with the primary dbd. The controller, sacct and sacctmgr use AccountingStorageHost and AccountingStorageBackupHost in the slurm.conf to know where the dbds are running." You may also want the DbdAddr fix from Comment 3: https://github.com/SchedMD/slurm/commit/5beb84db5af9ac73657d555ca672d711dc4eda60 Does this help? Thanks, Brian
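As a rough illustration of the split described in Comment 6, the two files might look something like the fragment below. This is only a sketch: the hostnames (dbd1, dbd2, and their -ib variants) are hypothetical and simply follow the per-interface naming convention from the original report, and pointing the AccountingStorage* options at interface-specific names is an assumption about how the routing would be steered, not something confirmed in this thread.

```
# slurmdbd.conf (on the slurmdbd hosts) -- all hostnames are hypothetical
DbdHost=dbd1          # name of the host running the primary slurmdbd
DbdAddr=dbd1-ib       # address the backup uses to reach the primary dbd
DbdBackupHost=dbd2    # only used to verify which host the backup runs on

# slurm.conf (on the cluster side) -- hostnames are hypothetical
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbd1-ib        # where slurmctld, sacct, sacctmgr find the primary dbd
AccountingStorageBackupHost=dbd2-ib  # where they find the backup dbd
```

Under these assumptions, the interface used for controller-to-dbd traffic is chosen on the slurm.conf side, which is why a separate DbdBackupAddr would not be needed.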
Yes, this does help. I guess I misunderstood what the DbdBackupHost was for. Thank you.