Created attachment 9900: Patch to fix reference to MAX_AGENT_CNT in sdiag man page. Hi there, When we were trying to chase down this morning's outage on Cori, I noticed that the sdiag manual page says: "Agent queue size [...] If this values is close to MAX_AGENT_CNT there could be some delays affecting jobs management." It appears that commit 53534f4907c0333696d2a04046c52a92a5e39c40 removed MAX_AGENT_CNT and replaced its use with MAX_SERVER_THREAD back in 2015. I'm guessing the gist of the paragraph is still correct, so I'll attach a patch to just swap the preprocessor macro name over. All the best, Chris
Hi Chris, Thank you for the patch. We will review it and let you know if anything else needs to be changed.
Hi Chris, While MAX_AGENT_CNT is indeed deprecated and the man page needs to be amended, I think your contribution is not correct. I've triggered the review process for a different patch. We'll keep you updated. Thanks.
(In reply to Alejandro Sanchez from comment #4)
> Hi Chris,
Hiya!
> While MAX_AGENT_CNT is indeed deprecated and the man page needs to be
> amended, I think your contribution is not correct. I've triggered the review
> process for a different patch. We'll keep you updated.
Not a problem! Thanks for this.
All the best, Chris
Hi Chris, The sdiag docs have been clarified in the following commit: https://github.com/SchedMD/slurm/commit/cbfb66807416df273f2fe7f43a23fb0511a8dae0 In 20.02 I've also exposed the number of agent threads to sdiag as a new stat. I'm closing this bug. Please reopen it if you have further questions. Thanks.
Thanks Alejandro!