| Summary: | Mixing host operating systems in cluster | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | sarah.summers |
| Component: | Other | Assignee: | Marcin Stolarek <cinek> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | cinek |
| Version: | 23.02.3 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | STFC UKR | Alineos Sites: | --- |
Description
sarah.summers
2023-09-27 05:27:15 MDT

Sarah,

>Is it possible to have hosts with different OS in the cluster or will this be problematic.

This shouldn't be an issue, but the slurmd/slurmstepd binaries have to be built on the same OS they run on. On a newer OS we link against different libraries, in different locations, and depending on the features those libraries provide, the code that actually gets compiled may differ.

Just to give you one example: CentOS 7 and Rocky 9 are likely using different major versions of hwloc, which results in different code being used on the Slurm side[1]. For instance, the library changed this function prototype:

>#if HWLOC_API_VERSION >= 0x00020000
>	return hwloc_topology_export_xml(topology, hwloc_xml, 0);
>#else
>	return hwloc_topology_export_xml(topology, hwloc_xml);
>#endif

cheers,
Marcin

[1] https://github.com/SchedMD/slurm/blob/slurm-23-02-5-1/src/slurmd/common/xcpuinfo.c#L161-L165

Hi Marcin,

Thanks for the information.

Just to clarify, for a host with CentOS installed the Slurm rpms which are installed must have been built on an equivalent CentOS host; and for a Rocky 9 host the Slurm rpms installed must have been built on a Rocky 9 host. Is that correct?

As you only mentioned slurmd/slurmstepd am I correct in thinking that it is OK for the Slurm controller and Slurm database nodes to be running CentOS 7 with some compute running Rocky 9?

Thanks.

Kind regards,
Sarah

>Just to clarify, for a host with CentOS installed the Slurm rpms which are installed must have been built on an equivalent CentOS host; and for a Rocky 9 host the Slurm rpms installed must have been built on a Rocky 9 host. Is that correct?

Yes.

>As you only mentioned slurmd/slurmstepd am I correct in thinking that it is OK for the Slurm controller and Slurm database nodes to be running CentOS 7 with some compute running Rocky 9?

Yes - it should be fine, as long as the binaries are built on the same OS where they are used.

Let me know if you have any further questions.

cheers,
Marcin

Hi Marcin,

Thanks for all of the information, it is very helpful. Please feel free to close the ticket.

Kind regards,
Sarah
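
The rule agreed in this thread (install Slurm packages only on hosts running the same OS they were built on) could be enforced with a small pre-install check. The following is a minimal sketch, not a Slurm tool: it assumes the site saves a copy of the build host's `/etc/os-release` alongside each set of RPMs, and the function names `parse_os_release`, `os_key`, and `built_for_this_host` are hypothetical. The two sample strings are abbreviated, illustrative os-release contents rather than full files, and matching on distribution ID plus major version is an assumed definition of "same OS".

```python
"""Hypothetical pre-install check; none of these helpers ship with Slurm."""


def parse_os_release(text):
    """Parse os-release(5) style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"').strip("'")
    return fields


def os_key(text):
    """Return (distribution ID, major version) for os-release content."""
    fields = parse_os_release(text)
    distro = fields.get("ID", "").lower()
    major = fields.get("VERSION_ID", "").split(".")[0]
    return distro, major


def built_for_this_host(build_os_release, host_os_release):
    """True when build host and target host share distro and major version.

    Assumption: the site recorded the build host's os-release next to the
    RPMs; Slurm itself is not consulted here.
    """
    return os_key(build_os_release) == os_key(host_os_release)


# Abbreviated sample contents, for illustration only.
CENTOS7 = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID="centos"\n'
ROCKY9 = 'NAME="Rocky Linux"\nVERSION_ID="9.2"\nID="rocky"\n'

if __name__ == "__main__":
    print(built_for_this_host(ROCKY9, ROCKY9))
    print(built_for_this_host(CENTOS7, ROCKY9))
```

On a real node the second argument would be the text of the local `/etc/os-release`. A mixed cluster such as the one described here (CentOS 7 controller and database hosts, Rocky 9 compute nodes) would then keep one package set per OS and run the check before installing.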