Summary: | sbatch error with SPANK plugin: Plugin file not found | ||
---|---|---|---|
Product: | Slurm | Reporter: | Ole.H.Nielsen <Ole.H.Nielsen> |
Component: | Configuration | Assignee: | Oriol Vilarrubi <jvilarru> |
Status: | RESOLVED FIXED | QA Contact: | Ben Roberts <ben> |
Severity: | 4 - Minor Issue | ||
Priority: | --- | CC: | marshall |
Version: | 21.08.8 | ||
Hardware: | Linux | ||
OS: | Linux | ||
Site: | DTU Physics | Alineos Sites: | --- |
Version Fixed: | 23.02 | Target Release: | --- |
Description
Ole.H.Nielsen@fysik.dtu.dk
2022-07-06 05:51:50 MDT
Hello Ole,

In the slurmctld node the SPANK libraries are not needed. They are needed only on the compute nodes (slurmd and slurmstepd daemons) and on the machines that will execute the various submission commands such as srun, sbatch, etc.

This information is found here: https://slurm.schedmd.com/spank.html#SECTION_SPANK-PLUGINS. In the local and allocator contexts it can be seen that the plugin is loaded by srun, sbatch, salloc, etc. (those would be the login nodes), and in the remote, slurmd and job_script contexts it states that slurmstepd and slurmd load this plugin, even though for slurmd it is not specifically said (this would be the compute nodes).

But we will consider adding a note stating that the required SPANK plugins need to be present on the compute nodes as well as on the nodes where the user commands will be executed, in order to make things clearer.

Also, maybe this sentence (in the CONFIGURATION section [https://slurm.schedmd.com/spank.html#SECTION_CONFIGURATION]) did not make clear that a missing required SPANK plugin will also make the user commands fail:

> If a SPANK plugin is required, then failure of any of the plugin's functions will cause slurmd to terminate the job

We will also try to rephrase this sentence to make it clearer that the user commands will also be affected in case of a missing SPANK library.

Regards.

Hi Oriol,

Thanks for the info:

(In reply to Oriol Vilarrubi from comment #2)
> In the slurmctld node the SPANK libraries are not needed, only on the
> compute nodes (slurmd and slurmstepd daemons) and in the machines that will
> execute the various submission commands as srun, sbatch, etc...
>
> This information is found here
> https://slurm.schedmd.com/spank.html#SECTION_SPANK-PLUGINS. In the local and
> allocator context it can be seen that it is loaded by srun, sbatch, salloc
> etc... (those would be the login nodes) and in remote ,slurmd and job_script
> it states that slurmstepd and slurmd load this plugin, even though in slurmd
> it is not specifically said(this would be the compute nodes).
>
> But we will consider adding a note stating that the required SPANK plugins
> need to be present on the compute nodes as well as in the nodes where the
> user commands will be executed, in order to make things clearer.

Thanks, precise and complete documentation will be much appreciated.

> Also maybe this sentence (in CONFIGURATION section
> [https://slurm.schedmd.com/spank.html#SECTION_CONFIGURATION]) was not clear
> that it will also make the user commands fail if a required SPANK plugin is
> not found:
> > If a SPANK plugin is required, then failure of any of the plugin's functions will cause slurmd to terminate the job
> We will also try to rephrase this sentence to make it clearer that the user
> commands will also be affected in case of a missing SPANK library.

Yes, this could also do with a bit of clarification so that sites don't make the same mistake that I did.

Will you update me when you have decided on improved documentation?

Best regards,
Ole

Hi Ole,

Yes, I will discuss this internally and I'll come back to you.

Regards.

Hello Ole,

We've modified the documentation to include a note about where the SPANK plugins are needed. We've also rephrased the sentence we talked about in Comment 2 to also make a reference to the job allocation commands. You can see it in commit bfad62d1 [1].

I'm closing this bug as fixed; do not hesitate to reopen it if needed.

Regards.

[1] https://github.com/SchedMD/slurm/commit/bfad62d1
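
For sites that want to catch this situation before users hit it, the following is a minimal sketch (not part of Slurm) of a check that could be run on each login and compute node. It assumes the usual `plugstack.conf` line layout of `required|optional  plugin  [arguments]` with `#` comments, and that a plugin given without an absolute path is looked up in the plugin directory. The function name `missing_required_plugins` and the default `plugin_dir` value are hypothetical, chosen only for illustration; `include` lines are not followed.

```python
from pathlib import Path


def missing_required_plugins(plugstack_text, plugin_dir="/usr/lib64/slurm"):
    """Return the paths of 'required' SPANK plugins that are absent on this host.

    plugstack_text: contents of a plugstack.conf file, passed in as a string.
    plugin_dir: ASSUMED plugin directory used for non-absolute plugin names;
                adjust it to the PluginDir value of your site.
    """
    missing = []
    for raw in plugstack_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Only 'required' entries are checked; 'optional' and 'include'
        # lines are skipped in this sketch.
        if len(fields) < 2 or fields[0] != "required":
            continue
        plugin = Path(fields[1])
        if not plugin.is_absolute():
            plugin = Path(plugin_dir) / plugin
        if not plugin.is_file():
            missing.append(str(plugin))
    return missing
```

A possible use is to read the site's `plugstack.conf` into a string and call `missing_required_plugins(text, plugin_dir=...)` on every node type; a non-empty result on a login node would correspond to the "Plugin file not found" error from sbatch discussed in this bug.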