| Summary: | Kubernetes/Docker Questions | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Hermann Schwärzler <hermann.schwaerzler> |
| Component: | Other | Assignee: | Nate Rini <nate> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | CC: | nate |
| Version: | 20.11.3 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| See Also: | https://bugs.schedmd.com/show_bug.cgi?id=13301 | ||
| Site: | Innsbruck | Alineos Sites: | --- |
| Version Fixed: | 21.08pre1 | Target Release: | --- |
(In reply to Hermann Schwärzler from comment #0)
> We are currently trying to create a small HPC Mockup with a virtual Slurm
> cluster running on Kubernetes as container manager. The idea is to give our
> users the possibility to get to know the basics, play around, develop or
> allow teaching before moving to a live cluster.

We use the following docker-compose cluster for our training courses but anyone can run it:
> https://gitlab.com/SchedMD/training/docker-scale-out

Please make sure to follow the instructions in README if you do try it.

> For the first tests we started with a rudimentary cluster in docker. This
> worked so far but in order to start slurmrestd we needed to deactivate
> security for the docker run ("--security-opt seccomp=unconfined") as
> otherwise we get the following error:
> “slurmrestd fatal: Unable to unshare System V namespace: Operation not
> permitted”.
>
> As a result of these first tests we have two questions:
>
> * We were wondering why this namespace operation is required by the REST
> daemon? Maybe you can shed some light onto this?

Slurm doesn't use SysV namespace so we fork into a private namespace following the principle of least privilege. It is not required but would require a minor code change to deactivate:
> diff --git a/src/slurmrestd/slurmrestd.c b/src/slurmrestd/slurmrestd.c
> index 6a0f24d..f529ccb 100644
> --- a/src/slurmrestd/slurmrestd.c
> +++ b/src/slurmrestd/slurmrestd.c
> @@ -283,6 +283,8 @@ static void _parse_commandline(int argc, char **argv)
>   */
>  static void _lock_down(void)
>  {
> +       return;
> +
>         if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
>                 fatal("Unable to disable new privileges: %m");
>         if (unshare(CLONE_SYSVSEM))

What is your kernel version? I have not seen anyone have this issue yet.

> * Do you have a best practice for deploying a Slurm cluster on Kubernetes or
> even a HELM chart that you can share and we can build on?

Not currently.
There are some projects on GitHub with Helm charts for Slurm but they are not directly supported by SchedMD. I would expect Slurm to be able to run under k8s without issue as long as none of the cgroup plugins are configured.

Hermann,

There have been no more questions in this ticket for a while. We are going to close this ticket but please reply if you have any more questions.

Thanks,
--Nate

Hey Nate,

sorry for the late reply! Thanks for your help and the example on gitlab! It was/is very helpful for testing and debugging.

Just "for the record" :-)
In your gitlab example you do not run into the "Unable to unshare System V namespace" problem because you are using "seccomp=unconfined" in your docker-compose.
- If you just deactivate this setting for the rest container (line 1814 in the version I cloned from gitlab) you can still use it without any difficulties.
- If you deactivate it for every container you can still use it without any difficulties.
- If you start a single container from the "scaleout" image you get the same error as I do.

I tried this with:
- Fedora 33, kernel 5.10.16-200.fc33.x86_64
- CentOS 8, kernel 4.18.0-240.10.1.el8_3.x86_64

Kind regards,
Hermann

Hermann,

We have added a new env option to slurmrestd:
> https://github.com/SchedMD/slurm/commit/3b7d082d11416c1b48ca13f5d3310d71d1ea150f

In Slurm-21.08, it will be possible to pass this environment variable to disable the sysv unshare:
> env SLURMRESTD_SECURITY=disable_unshare_sysv slurmrestd

I'm going to close this ticket but please respond if you have any questions.

Thanks,
--Nate
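The environment variable from the last comment can also be set programmatically, for example from a container entrypoint script. The following is a minimal sketch and not SchedMD's method: the helper names `slurmrestd_env` and `launch_slurmrestd` are hypothetical, only `SLURMRESTD_SECURITY=disable_unshare_sysv` comes from this ticket, and any slurmrestd arguments are left to the caller.

```python
import os
import subprocess


def slurmrestd_env(base=None):
    """Return a copy of the environment with the sysv unshare disabled.

    Only the variable name and value come from the ticket; this helper is a
    hypothetical illustration. Any existing SLURMRESTD_SECURITY value is
    overwritten.
    """
    env = dict(os.environ if base is None else base)
    env["SLURMRESTD_SECURITY"] = "disable_unshare_sysv"
    return env


def launch_slurmrestd(args=(), binary="slurmrestd"):
    """Start slurmrestd with the adjusted environment.

    Assumes a slurmrestd from Slurm 21.08 or later is on PATH; `args` is
    passed through unchanged.
    """
    return subprocess.Popen([binary, *args], env=slurmrestd_env())
```

The input mapping is copied before the variable is added, so the caller's environment stays untouched.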
Hi everybody!

We are currently trying to create a small HPC Mockup with a virtual Slurm cluster running on Kubernetes as container manager. The idea is to give our users the possibility to get to know the basics, play around, develop or allow teaching before moving to a live cluster.

For the first tests we started with a rudimentary cluster in docker. This worked so far but in order to start slurmrestd we needed to deactivate security for the docker run ("--security-opt seccomp=unconfined") as otherwise we get the following error:
“slurmrestd fatal: Unable to unshare System V namespace: Operation not permitted”.

As a result of these first tests we have two questions:

* We were wondering why this namespace operation is required by the REST daemon? Maybe you can shed some light onto this?
* Do you have a best practice for deploying a Slurm cluster on Kubernetes or even a HELM chart that you can share and we can build on?

Kind regards,
Hermann
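To check whether a given container permits the call behind this error, one could run a small probe inside it. This sketch is an illustration and not part of Slurm: the function name `can_unshare_sysvsem` is hypothetical, the flag value is taken from the Linux headers, and the probe issues only the `unshare(CLONE_SYSVSEM)` call that slurmrestd makes in `_lock_down()`. It runs the call in a forked child so the calling process stays unchanged.

```python
import ctypes
import errno
import os

CLONE_SYSVSEM = 0x00040000  # flag value from <linux/sched.h>


def can_unshare_sysvsem():
    """Try unshare(CLONE_SYSVSEM) in a forked child process.

    Returns (True, None) if the call succeeds, otherwise (False, reason),
    where reason is an errno name such as "EPERM" or a short description.
    """
    pid = os.fork()
    if pid == 0:
        # Child: attempt the call and report the outcome via the exit code.
        code = 255
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_SYSVSEM) == 0:
                code = 0
            else:
                code = min(ctypes.get_errno(), 254) or 255
        except Exception:
            code = 255  # e.g. no unshare() symbol on this platform
        finally:
            os._exit(code)

    # Parent: translate the child's exit status into a result.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return False, "killed by signal %d" % os.WTERMSIG(status)
    code = os.WEXITSTATUS(status)
    if code == 0:
        return True, None
    if code == 255:
        return False, "unshare() unavailable or failed without errno"
    return False, errno.errorcode.get(code, "errno %d" % code)


if __name__ == "__main__":
    ok, reason = can_unshare_sysvsem()
    print("unshare(CLONE_SYSVSEM) permitted" if ok else "denied: %s" % reason)
```

An `EPERM` result corresponds to the "Operation not permitted" text in the fatal message, so running the probe with and without "--security-opt seccomp=unconfined" should show whether the seccomp setting is what blocks the call.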