Summary: | sreport reports inconsistent powered_down (suspended) nodes | ||
---|---|---|---|
Product: | Slurm | Reporter: | Ole.H.Nielsen <Ole.H.Nielsen> |
Component: | User Commands | Assignee: | Megan Dahl <megan> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | megan |
Version: | 23.02.4 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | DTU Physics | Alineos Sites: | --- |
Atos/Eviden Sites: | --- | Confidential Site: | --- |
Coreweave sites: | --- | Cray Sites: | --- |
DS9 clusters: | --- | HPCnow Sites: | --- |
HPE Sites: | --- | IBM Sites: | --- |
NOAA SIte: | --- | NoveTech Sites: | --- |
Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
Recursion Pharma Sites: | --- | SFW Sites: | --- |
SNIC sites: | --- | Linux Distro: | --- |
Machine Name: | CLE Version: | ||
Version Fixed: | 23.11.x | Target Release: | --- |
DevPrio: | --- | Emory-Cloud Sites: | --- |
Description
Ole.H.Nielsen@fysik.dtu.dk
2023-09-13 19:51:57 MDT
Hello Ole,

I will need some time to look into this, but at first glance sreport’s PLND Down field calculation does not include the time that non-CLOUD nodes are in the POWERED_DOWN state. I will report back when I have more information.

https://slurm.schedmd.com/sreport.html#OPT_cluster-Utilization

Thanks,
--Megan

Hello Ole,

After discussing this with Brian, I’ll go ahead and work on including non-cloud nodes in the calculation. I’ll keep you updated on the patch’s progress.

Regards,
--Megan

Hello Ole,

The time that all nodes are in the powered_down state will now be included in sreport’s PLND Down field instead of it only applying to cloud nodes. The change can be found in the following commit:

commit d021731cbf366859bb98bf3232463545916d6d40
Author: Megan Dahl <megan@schedmd.com>
Date: Mon Sep 25 10:52:16 2023 -0600

sreport PlannedDown field includes the time all nodes were POWERED_DOWN

PlannedDown used to only include the time nodes were POWERED_DOWN if they were cloud nodes. However, it is desirable to see statistics on all POWERED_DOWN nodes.

Bug 17689

This will be available in 23.11.

Regards,
--Megan

Hi Megan,

Thanks a lot for creating this patch:

(In reply to Megan Dahl from comment #5)
> The time that all nodes are in the powered_down state will now be included
> in sreport’s PLND Down field instead of it only applying to cloud nodes. The
> change can be found in the following commit:
> commit d021731cbf366859bb98bf3232463545916d6d40
> Author: Megan Dahl <megan@schedmd.com>
> Date: Mon Sep 25 10:52:16 2023 -0600
>
> sreport PlannedDown field includes the time all nodes were POWERED_DOWN
>
> PlannedDown used to only include the time nodes were POWERED_DOWN if
> they were cloud nodes. However, it is desirable to see statistics on all
> POWERED_DOWN nodes.
>
> Bug 17689
>
> This will be available in 23.11.

We probably won't be upgrading to 23.11 until after a few minor releases of 23.11. Is there any chance that the patch can make it into 23.02?

Thanks,
Ole

Hi Ole,

Unfortunately, since this is a functional change that is user visible, it cannot be added to 23.02. The purpose of this is to avoid breaking maintenance releases.

Regards,
--Megan
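
For readers who want a concrete picture of the behaviour change discussed in this thread, here is a minimal Python sketch. It is not Slurm's implementation: the `NodeEvent` record, the `planned_down_seconds` function, and the `cloud_only` switch are all hypothetical names invented for illustration, and the sketch models only the POWERED_DOWN contribution to the PLND Down figure, ignoring anything else sreport may count there.

```python
from dataclasses import dataclass


@dataclass
class NodeEvent:
    """Hypothetical record of one interval a node spent in a given state."""
    node: str
    state: str      # e.g. "POWERED_DOWN" or "DOWN"
    is_cloud: bool  # True for CLOUD nodes, False for regular nodes
    start: int      # interval start, in seconds
    end: int        # interval end, in seconds


def planned_down_seconds(events, cloud_only):
    """Sum the POWERED_DOWN time across events.

    cloud_only=True mimics the behaviour described for 23.02 (only cloud
    nodes counted); cloud_only=False mimics the behaviour described for
    23.11 (all nodes counted). Both are assumptions based on this thread.
    """
    total = 0
    for ev in events:
        if ev.state != "POWERED_DOWN":
            continue  # other states do not contribute in this sketch
        if cloud_only and not ev.is_cloud:
            continue  # pre-patch: non-cloud powered-down time is skipped
        total += ev.end - ev.start
    return total


# Made-up example intervals, not real accounting data.
events = [
    NodeEvent("cloud001", "POWERED_DOWN", True, 0, 3600),
    NodeEvent("node001", "POWERED_DOWN", False, 0, 7200),
    NodeEvent("node002", "DOWN", False, 0, 1800),
]

old_total = planned_down_seconds(events, cloud_only=True)
new_total = planned_down_seconds(events, cloud_only=False)
print(old_total, new_total)
```

With these made-up intervals, the pre-patch total counts only the cloud node's hour, while the post-patch total also counts the two hours that the regular node spent powered down. This difference is the inconsistency the bug title refers to.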