| Summary: | Seeing "Could not open job state file" message, warns "Jobs may be lost!" | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Will Dennis <wdennis> |
| Component: | slurmctld | Assignee: | Jason Booth <jbooth> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 20.11.5 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | NEC Labs | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | Ubuntu |
| Machine Name: | ma-slurm-ctlr | CLE Version: | |
| Version Fixed: | | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
|
Description
Will Dennis
2021-05-05 13:45:36 MDT
Will - if this is the first time the scheduler has started and shut down, then what you are seeing is normal. Slurm writes state information to the StateSaveLocation on shutdown. This includes information about jobs, partitions, nodes, associations, clustername, federation, database messages, config state, tres, qos, priority, reservations and triggers. Since this is the first time the cluster has started, the StateSaveLocation will not contain state information for the cluster until the first shutdown, at which point Slurm writes it out.

It is the first time, but since this message is new to me, I wanted to check it out. You may go ahead and close, thanks!
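
For anyone who hits the same message, here is a minimal sketch of a check for whether a StateSaveLocation directory has been populated yet. It is not part of Slurm. The helper name `missing_state_files` and the list of file names are assumptions made for illustration; only the existence of a job state file is implied by the message itself, and the other names may differ between Slurm versions. The directory path is whatever `StateSaveLocation` is set to in your `slurm.conf`.

```python
import tempfile
from pathlib import Path

# Assumed state file names, for illustration only; check your own
# StateSaveLocation to see what your Slurm version actually writes.
EXPECTED_STATE_FILES = ("job_state", "node_state", "part_state",
                        "resv_state", "trigger_state")


def missing_state_files(state_dir, expected=EXPECTED_STATE_FILES):
    """Return the names in `expected` that are not regular files in state_dir."""
    root = Path(state_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Simulate a brand-new cluster: an empty state directory.
    with tempfile.TemporaryDirectory() as tmp:
        print("before first shutdown:", missing_state_files(tmp))
        # Simulate the controller having written the job state file.
        (Path(tmp) / "job_state").touch()
        print("after job_state exists:", missing_state_files(tmp))
```

On a freshly installed controller every name would be expected to show up as missing, which matches the explanation above. If files are still missing after a clean shutdown and restart, that would be worth a closer look.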