Ticket 11099

Summary: RPM build of 20.11.5 fails
Product: Slurm Reporter: Ward Poelmans <ward.poelmans>
Component: Build System and Packaging Assignee: Tim McMullan <mcmullan>
Status: RESOLVED FIXED QA Contact:
Severity: 4 - Minor Issue    
Priority: --- CC: nate
Version: 20.11.4   
Hardware: Linux   
OS: Linux   
Site: VUB Alineos Sites: ---
Atos/Eviden Sites: --- Confidential Site: ---
Coreweave sites: --- Cray Sites: ---
DS9 clusters: --- HPCnow Sites: ---
HPE Sites: --- IBM Sites: ---
NOAA SIte: --- OCF Sites: ---
Recursion Pharma Sites: --- SFW Sites: ---
SNIC sites: --- Linux Distro: CentOS
Machine Name: CLE Version:
Version Fixed: 20.11.5 Target Release: ---
DevPrio: --- Emory-Cloud Sites: ---

Description Ward Poelmans 2021-03-16 04:10:37 MDT
The slurm-20.11 branch has some issues with the RPM spec file:

- prolog.example and job_submit.lua.example are not installed but they are part of the %files section. Fix:

@@ -411,6 +411,8 @@ install -D -m644 etc/slurmrestd.service  %{buildroot}/%{_unitdir}/slurmrestd.ser
 
 install -D -m644 etc/cgroup.conf.example %{buildroot}/%{_sysconfdir}/cgroup.conf.example
 install -D -m644 etc/slurm.conf.example %{buildroot}/%{_sysconfdir}/slurm.conf.example
+install -D -m644 etc/prolog.example %{buildroot}/%{_sysconfdir}/prolog.example
+install -D -m644 etc/job_submit.lua.example %{buildroot}/%{_sysconfdir}/job_submit.lua.example
 install -D -m600 etc/slurmdbd.conf.example %{buildroot}/%{_sysconfdir}/slurmdbd.conf.example
 install -D -m644 etc/cli_filter.lua.example %{buildroot}/%{_sysconfdir}/cli_filter.lua.example
 install -D -m755 contribs/sjstat %{buildroot}/%{_bindir}/sjstat

- The new tmp plugin has a config file 'namespace.conf'. The build tries to install a man page for it at /usr/share/man/man5/namespace.conf.5.gz, but that fails because pam (on CentOS 7.9 at least) has already installed that file. The config file will need a new name.
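
One way to spot this kind of clash before installing is to compare the file list of the freshly built package with the files already owned by installed packages. The sketch below assumes both lists have been saved as plain text, one path per line (for example from `rpm -qpl` on the new RPM and `rpm -ql pam`). The function name `conflicting_paths` and the file names in the usage comment are made up for illustration.

```shell
#!/bin/sh
# conflicting_paths LIST_A LIST_B
# Print every path that appears in both lists (one path per line in each file).
# Hypothetical helper, not part of Slurm or rpm.
conflicting_paths() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # comm -12 keeps only the lines common to both sorted inputs
    comm -12 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}

# Example (assumed file names):
#   rpm -qpl slurm-*.rpm > new.txt
#   rpm -ql pam          > pam.txt
#   conflicting_paths new.txt pam.txt
```

Any path printed is one that both packages would try to own, which is the situation that makes the install fail here.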


Both errors suggest that building and installing Slurm as an RPM is not part of the automated testing. Could that be added to avoid this?
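
A cheap check in this direction, well short of a full `rpmbuild` run, would be to confirm that every path a `%files` section expects is actually present under the build root after the install step. A minimal sketch, assuming the expected paths are passed as arguments relative to the build root; `check_missing` is a hypothetical helper, not something from the Slurm tree.

```shell
#!/bin/sh
# check_missing BUILDROOT PATH...
# Report each PATH (relative to BUILDROOT) that does not exist.
# Returns 0 if everything is present, 1 if anything is missing.
# Hypothetical helper for illustration only.
check_missing() {
    buildroot=$1
    shift
    missing=0
    for f in "$@"; do
        if [ ! -e "$buildroot/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example (assumed build root and paths):
#   check_missing "$BUILDROOT" etc/slurm/prolog.example etc/slurm/job_submit.lua.example
```

A non-zero exit status makes this easy to use as a failing step in a CI job.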
Comment 8 Tim McMullan 2021-03-16 13:49:29 MDT
Hi Ward,

Thank you for reporting this! We have pushed fixes for both issues, and they will be included in the 20.11.5 release. It is worth noting that as of these changes, "namespace.conf" is now "job_container.conf", so if you were testing this without using RPMs you will need to rename that configuration file.

I think it makes sense to test building and installing from the spec file internally, and we are working on adding tests to catch situations like this faster!

Thanks again, and please let us know if there are any other issues!
--Tim
Comment 9 Ward Poelmans 2021-03-17 01:57:14 MDT
Thanks for the quick fix.

If you're interested, I can give you a GitHub Actions workflow file that will run these tests for you.