| Summary: | EpilogSlurmctld sudo error | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Alex Mamach <mamacha> |
| Component: | Other | Assignee: | Jason Booth <jbooth> |
| Status: | RESOLVED INFOGIVEN | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 22.05.6 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Memorial Sloan Kettering Cancer Center | ||
| Attachments: | slurm.conf | ||
Description
Alex Mamach
2023-06-22 14:52:49 MDT
Jason Booth
Please attach your slurm.conf.

Slurm has a SlurmUser and a SlurmdUser:

> SlurmUser = slurm(1000)
> SlurmdUser = root(0)

The epilog should run as the SlurmdUser, which should be root on all systems except those with exotic configurations.

https://slurm.schedmd.com/prolog_epilog.html

You might want to add some debugging to the epilog that runs "whoami" to see which user it runs as, and then confirm whether that user has sudo NOPASSWD access. By default, root should be able to do this without a password prompt from the command.

Alex Mamach
Created attachment 30922: slurm.conf

Hi Jason,

Thanks for your response! Please find our slurm.conf attached. I tried logging "whoami" in the epilog, and it looks like it is running as user slurm. I tried changing PrologSlurmctld and EpilogSlurmctld to simply "Prolog" and "Epilog", but I then see the Prolog script fail on start and the nodes drain. If there are any Slurm-side config changes that would help here, please let me know.

Jason Booth
Thank you for uploading that information.

> Interestingly this doesn't seem to happen with the sudo commands in the prolog..

Can you confirm whether the user slurm is part of the sudoers, or has an entry in the sudoers file?

> slurm ALL=(ALL) NOPASSWD:/path/to/command1

Alex Mamach
Hi Jason,

Thanks for your help, I was able to solve the issue. It turns out I just needed to change from PrologSlurmctld/EpilogSlurmctld to Prolog/Epilog; the errors were due to a PATH difference in the binaries being called by slurm vs. root.

Thanks!
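For anyone debugging a similar case, the "whoami" check suggested above can be extended to record the PATH as well, since the PATH difference turned out to be the cause here. According to the Slurm prolog/epilog guide linked in the thread, PrologSlurmctld and EpilogSlurmctld run on the controller as SlurmUser, while Prolog and Epilog are run by slurmd on the compute node as SlurmdUser (normally root), which would be consistent with the "whoami" result reported. The following is only a minimal sketch: the log location `/tmp/epilog_debug.log` and the helper names `log_epilog_context` and `check_command` are hypothetical and are not taken from the reporter's scripts.

```shell
#!/bin/bash
# Hypothetical debugging epilog: record the user and PATH the script runs with.
# LOG_FILE is an assumed location; use any file the epilog user can write to.
LOG_FILE="${LOG_FILE:-/tmp/epilog_debug.log}"

log_epilog_context() {
    {
        # id -un prints the effective user name, the same answer "whoami" gives.
        echo "user=$(id -un)"
        echo "path=${PATH}"
        # SLURM_JOB_ID is set by Slurm for epilog scripts; it is empty when run by hand.
        echo "job=${SLURM_JOB_ID:-unset}"
    } >> "${LOG_FILE}"
}

# Log whether a command can be resolved through the current PATH.
check_command() {
    local cmd="$1"
    if command -v "${cmd}" > /dev/null 2>&1; then
        echo "found ${cmd} at $(command -v "${cmd}")" >> "${LOG_FILE}"
        return 0
    fi
    echo "missing ${cmd} in PATH" >> "${LOG_FILE}"
    return 1
}

log_epilog_context
# Do not fail the epilog just because the check fails; this is only diagnostics.
check_command sudo || true
```

If the log shows a binary as missing for one user but not the other, calling it by absolute path, or setting PATH explicitly at the top of the script, is one way to make the script behave the same regardless of which daemon launches it.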