| Summary: | sinfo -t does not report the good nodes states | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Regine Gaudin <regine.gaudin> |
| Component: | User Commands | Assignee: | Ben Roberts <ben> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | ||
| Version: | 20.11.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CEA | Alineos Sites: | --- |
|
Description
Regine Gaudin
2022-05-02 10:01:00 MDT
Hi Regine,

This is the expected behavior, though I can understand the confusion. There is a base node state, and additional state flags can be added to a node. Taking DRAIN as an example: when a node is fully drained (meaning all jobs have completed), the node is in an IDLE state, but it also has the DRAIN flag to show that it shouldn't receive any more jobs. So when you request nodes in the IDLE state with sinfo, it will show all nodes that include the IDLE state, but they may have additional flags that affect how they are displayed. This is due to a change in 20.11 that allows you to filter by multiple states (bug 9723).

Here's an example that may help illustrate how that works. If I request sinfo to show me the IDLE nodes, I will get ones that appear as 'idle' and 'drain'.

    $ sinfo -pdebug -tidle
    PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
    debug*    up    infinite  1     drain node10
    debug*    up    infinite  17    idle  node[01-09,11-18]

I can request only nodes that include both the IDLE and DRAIN flags, which will have the same effect as when you requested just nodes with the DRAIN state.

    $ sinfo -pdebug -t'idle&drain'
    PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
    debug*    up    infinite  1     drain node10

You can also use the scontrol show node output to see how the state appears on the node.

    $ scontrol show nodes node10 | grep State
    State=IDLE+DRAIN ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A

The multiple states were present before 20.11 as well, but the lack of ability to filter by the state flag made the information returned by sinfo behave differently.

There was also additional functionality added in 21.08 where you can request the sinfo output in json or yaml format. This would allow you to script something that looks at the state information and returns just the nodes that are idle with no additional state flag. It sounds like you just upgraded to 20.11 though, so going to 21.08 may not be something you can do immediately.
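
As a rough sketch of the scripting idea above that does not depend on the 21.08 json/yaml output, the `State=BASE+FLAG` string that scontrol prints can be split into a base state and its flags. The function names and the abbreviated sample lines are hypothetical and only for illustration; real `scontrol -o show nodes` lines carry many more fields, and this sketch does not interpret any suffix characters that may follow a state name.

```python
import re


def parse_node_states(scontrol_text):
    """Map node name -> (base_state, [flags]) from `scontrol -o show nodes` text.

    Assumes one node per line, each containing NodeName=... and State=...
    """
    states = {}
    for line in scontrol_text.splitlines():
        name = re.search(r"\bNodeName=(\S+)", line)
        state = re.search(r"\bState=(\S+)", line)
        if not (name and state):
            continue  # skip lines that do not describe a node
        # "IDLE+DRAIN" -> base "IDLE", flags ["DRAIN"]
        base, *flags = state.group(1).split("+")
        states[name.group(1)] = (base, flags)
    return states


def purely_idle(states):
    """Return sorted node names whose base state is IDLE with no extra flags."""
    return sorted(
        node for node, (base, flags) in states.items()
        if base == "IDLE" and not flags
    )


# Hypothetical, heavily abbreviated sample input (not real cluster output).
SAMPLE = """\
NodeName=node09 State=IDLE ThreadsPerCore=2
NodeName=node10 State=IDLE+DRAIN ThreadsPerCore=2
NodeName=node99 State=ALLOCATED ThreadsPerCore=2
"""

if __name__ == "__main__":
    print(purely_idle(parse_node_states(SAMPLE)))
```

On a live system the text could be obtained with something like `subprocess.run(["scontrol", "-o", "show", "nodes"], capture_output=True, text=True).stdout` and passed to `parse_node_states`.
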
Let me know if you have any additional questions about this.

Thanks,
Ben

Hi Regine,

I wanted to follow up and make sure you don't have additional questions about this. Let me know if there is anything else I can do to help.

Thanks,
Ben

Hi Regine,

I believe the information I sent about the node states should have answered your questions. I haven't heard any follow-up questions, so I'll go ahead and close this ticket. If you do have follow-up questions, feel free to update the ticket and I'll respond.

Thanks,
Ben |