Ticket 7770

Summary: Request to change default short name of node idle state
Product: Slurm
Reporter: Paul Peltz <peltzpl>
Component: User Commands
Assignee: Unassigned Developer <dev-unassigned>
Status: OPEN ---
QA Contact:
Severity: 5 - Enhancement
Priority: ---
CC: albert.gil, kilian, maciej.cytowski, marshall, sts
Version: 20.02.x   
Hardware: Linux   
OS: Linux   
See Also: https://bugs.schedmd.com/show_bug.cgi?id=9869
Site: ORNL-OLCF

Description Paul Peltz 2019-09-18 16:27:59 MDT
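We'd like to see some improvements to the output of commands such as sinfo so that node state short names are obvious without having to look up what each special-character suffix means. For example, the first listing below is the current output and the second is the kind of output we have in mind: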

> sinfo
PARTITION AVAIL TIMELIMIT NODES STATE   NODELIST
batch     up     infinite     2 alloc   adev[8-9]
batch     up     infinite     5 idle*   adev[10-14]
batch     up     infinite     1 idle#   adev[15]


> sinfo
PARTITION AVAIL TIMELIMIT NODES STATE             NODELIST
batch     up     infinite     2 alloc             adev[8-9]
batch     up     infinite     5 idle_noresponse   adev[10-14]
batch     up     infinite     1 idle_rebooting    adev[15]

There is most likely a better way to present this, but I just wanted to show an example of how the states could be presented more clearly to users and admins. Another possibility is to add an additional column.

> sinfo
PARTITION AVAIL TIMELIMIT NODES STATE    SUBSTATE         NODELIST
batch     up     infinite     2 alloc    running          adev[8-9]
batch     up     infinite     5 idle     noresponse       adev[10-14]
batch     up     infinite     1 idle     rebooting        adev[15]
batch     up     infinite     1 idle     preallocated     adev[16]
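
Until something like this exists in sinfo itself, a site-side wrapper could derive the extra column from the suffix. The following is only a minimal Python sketch: the function name `split_state` and the table `EXAMPLE_SUBSTATES` are hypothetical, and the suffix names are taken from the example listings above rather than from the authoritative list in the sinfo man page, which varies between Slurm versions.

```python
# Hypothetical mapping: the names mirror the example listings in this ticket
# only. Check the sinfo man page for the suffixes your Slurm version defines.
EXAMPLE_SUBSTATES = {
    "*": "noresponse",
    "#": "rebooting",
}


def split_state(raw_state):
    """Split a compact state such as 'idle*' into (state, substate)."""
    if raw_state and raw_state[-1] in EXAMPLE_SUBSTATES:
        return raw_state[:-1], EXAMPLE_SUBSTATES[raw_state[-1]]
    # No known suffix: return the state unchanged with an empty substate.
    return raw_state, ""


for raw in ("alloc", "idle*", "idle#"):
    state, substate = split_state(raw)
    print(f"{state:<8} {substate}")
```

Unknown suffixes are deliberately left attached to the state, so the wrapper never silently drops information it does not understand.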

Thanks,

Paul
Comment 2 Kilian Cavalotti 2019-09-18 19:48:26 MDT
I'd like to support that suggestion, and I like the idea of a substate.

Jobs have a "reason" (defined as: "The reason a job is in its current state"), maybe nodes could have the same thing?

Cheers,
-- 
Kilian
Comment 4 Jason Booth 2019-10-04 13:46:07 MDT
*** Ticket 7860 has been marked as a duplicate of this ticket. ***
Comment 5 Tim Wickberg 2020-07-27 12:42:21 MDT
Changing the output format is difficult because many sites have scripted against these commands, sinfo especially, and rely on the output remaining stable.

The idea of splitting the "reason" (or "substate") off into its own separately parsable field does have merit, and I'd be willing to consider that if a site is interested in sponsoring it for the 21.08 release. It would also entail a decent amount of work to ensure that the field is tracked and managed efficiently internally.
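
For sites worried about script stability in the meantime, one common approach is to request explicit fields instead of parsing the default layout. The sketch below is illustrative only: it assumes the sinfo options `--noheader` and `--format` with the `%N` (node list) and `%t` (compact state) specifiers, uses `|` as an arbitrary delimiter, and the helper names `parse_sinfo_lines` and `query_nodes` are hypothetical.

```python
import subprocess

# Explicit fields and a fixed delimiter, so the script does not depend on
# the default column layout. The '|' delimiter is an arbitrary choice.
SINFO_CMD = ["sinfo", "--noheader", "--format=%N|%t"]


def parse_sinfo_lines(lines):
    """Turn 'nodelist|state' lines into a list of dicts."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last delimiter; the state field never contains '|'.
        nodelist, state = line.rsplit("|", 1)
        records.append({"nodelist": nodelist, "state": state})
    return records


def query_nodes():
    """Run sinfo and parse its output (requires a working Slurm install)."""
    result = subprocess.run(SINFO_CMD, capture_output=True, text=True, check=True)
    return parse_sinfo_lines(result.stdout.splitlines())


# Sample lines shaped like the example in this ticket, parsed without sinfo.
sample = ["adev[8-9]|alloc", "adev[10-14]|idle*", "adev[15]|idle#"]
print(parse_sinfo_lines(sample))
```

A script written this way still sees the suffix characters in the state field, so it does not remove the need for a separate substate field; it only narrows what the script depends on.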

- Tim