| Summary: | sinfo --cluster returns "an unknown select plugin_id 108" error | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Brian F Gilmer <brian.gilmer> |
| Component: | Configuration | Assignee: | Dominik Bartkiewicz <bart> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 3 - Medium Impact | ||
| Priority: | --- | CC: | bart, da, fabrice.cantos |
| Version: | 17.02.9 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | CRAY | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | Other |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | Kupe | CLE Version: | |
| Version Fixed: | 17.11.0-rc4 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
Changing component from "Federation" to "Configuration". Someone should pursue this more on Thursday.

Thanks,

The customer has a workflow manager that runs jobs on both the XC and CS systems. They were relying on the --cluster feature.

Hi,

For now I have found only a workaround. Can you add other_cons_res to SelectTypeParameters on ec-login01? If you don't use select_alps, this shouldn't change anything.

Dominik

Just in case: if you use select_cray with select_linear on other machines, this workaround is wrong.

Hello,

I modified slurm.conf on kupe_mp (VM cluster). I added SelectTypeParameters=other_cons_res as a workaround for this problem.

Hi,

Commit https://github.com/SchedMD/slurm/commit/d3338956fe9 should fix this issue. It will be included in the 17.11 release. Could you confirm whether this solved the problem in your environment?

Dominik

Hi,

I'm going to go ahead and mark this as Resolved/Fixed; please feel free to re-open this if there's anything else we can help with.

Dominik
The site has an XC (kupe) and two VM clusters (kupe_mp and kupe_librarian). The slurmdbd is running outside of the XC mainframe. When trying to access sinfo from a 'login' host I get:

    [root@ec-login01 munge]# sinfo --cluster=kupe
    sinfo: error: Cluster 'kupe' has an unknown select plugin_id 108
    sinfo: error: 'kupe' can't be reached now, or it is an invalid entry for --cluster. Use 'sacctmgr list clusters' to see available clusters.
    [root@ec-login01 munge]# sacctmgr show cluster
       Cluster     ControlHost  ControlPort   RPC     Share GrpJobs       GrpTRES GrpSubmit MaxJobs       MaxTRES MaxSubmit     MaxWall                  QOS   Def QOS
    ---------- --------------- ------------ ----- --------- ------- ------------- --------- ------- ------------- --------- ----------- -------------------- ---------
          kupe 192.168.235.165         6817  7936         1                                                                                           normal
    kupe_libr+                            0     0         1                                                                                           normal
       kupe_mp   10.64.125.139         6817  7936         1                                                                                           normal

Select plugin ID 108 corresponds to SELECT_PLUGIN_CRAY_CONS_RES. Since that is not really a plugin in its own right, it is not found when looking through the available select plugins.
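The failure mode described above, and why the other_cons_res workaround from the comments helps, can be illustrated with a small model. This is only a sketch under stated assumptions and not Slurm's actual code: the helper names `locally_known_plugins` and `resolve_select_plugin` are hypothetical, and every numeric ID other than 108 (which comes from the error message in this report) is an assumed value used for illustration. The model assumes the client can only resolve the plugin IDs that its own configuration causes it to register.

```python
# Simplified model of a select-plugin ID lookup. NOT Slurm source code.
SELECT_PLUGIN_CRAY_CONS_RES = 108  # ID reported in the sinfo error above
ASSUMED_CRAY_DEFAULT_ID = 107      # assumption, for illustration only


def locally_known_plugins(select_type_parameters):
    """Hypothetical helper: map plugin IDs to names as seen by the local client.

    Models the idea that the Cray select plugin advertises a different ID
    depending on whether other_cons_res is set in the local slurm.conf.
    """
    known = {101: "select/cons_res", 102: "select/linear"}  # assumed IDs
    if "other_cons_res" in select_type_parameters:
        known[SELECT_PLUGIN_CRAY_CONS_RES] = "select/cray"
    else:
        known[ASSUMED_CRAY_DEFAULT_ID] = "select/cray"
    return known


def resolve_select_plugin(cluster, remote_plugin_id, local_params):
    """Hypothetical helper: resolve the ID a remote cluster reports."""
    known = locally_known_plugins(local_params)
    if remote_plugin_id not in known:
        return (f"error: Cluster '{cluster}' has an unknown select "
                f"plugin_id {remote_plugin_id}")
    return known[remote_plugin_id]


# Login host without the workaround: ID 108 cannot be resolved.
print(resolve_select_plugin("kupe", 108, []))
# Login host with SelectTypeParameters=other_cons_res: ID 108 resolves.
print(resolve_select_plugin("kupe", 108, ["other_cons_res"]))
```

In this model the lookup fails because ID 108 is never registered as a standalone entry, which matches the observation that SELECT_PLUGIN_CRAY_CONS_RES is "not really a plugin". The same model also suggests why the workaround would be wrong on a host that needs select_cray with select_linear: setting other_cons_res changes the ID the local plugin answers to.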