| Summary: | reservation for group root | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Yann <yann.sagon> |
| Component: | reservations | Assignee: | Carlos Tripiana Montes <tripiana> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | 4 - Minor Issue | ||
| Priority: | --- | ||
| Version: | 22.05.8 | ||
| Hardware: | Linux | ||
| OS: | Linux | ||
| Site: | Université de Genève | Alineos Sites: | --- |
| Atos/Eviden Sites: | --- | Confidential Site: | --- |
| Coreweave sites: | --- | Cray Sites: | --- |
| DS9 clusters: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Linux Distro: | --- |
| Machine Name: | CLE Version: | ||
| Version Fixed: | 23.02.3, 23.11.0rc1 | Target Release: | --- |
| DevPrio: | --- | Emory-Cloud Sites: | --- |
|
Description
Yann
2023-03-27 05:57:54 MDT
Hi, I have tested and got it to work if I set user=root when creating the reservation instead of group. What command are you using to create the reservation? Thanks

Hi, yes, I know it works with user=root, but we want to use groups=root,hpc_admin because we don't want to enumerate all the users in this group. Is there a problem with doing so?

Can you please provide the exact command you used to create the reservation? I am getting an error on my end when trying to create one with group=root. Thanks

(baobab)-[root@admin1 ~]$ NODE=cpu001
(baobab)-[root@admin1 ~]$ scontrol create \
> Reservation="installation_${NODE}" \
> StartTime=now \
> Duration=0-06:00:00 \
> Groups=root,hpc_admin \
> Flags=MAINT,IGNORE_JOBS,OVERLAP \
> Nodes=${NODE}
Reservation created: installation_cpu001
Maybe a typo on your end: it is groupS, not group.

Thanks, let me run some tests and get back to you shortly. Thanks

Hi, Thanks for your patience. After running a few tests, we believe the behavior you are experiencing is a bug, and I am looking into a resolution at the moment. I will leave this ticket open until we have a solution and will keep you posted on updates.

Hello,
Thanks again for your patience. I have been working on this problem since we last spoke and discovered that the following command works:
$ scontrol create reservation account=root starttime=now duration=infinite nodes=z1
Reservation created: root_19
and attempting to run a job as user root under this reservation works as well:
root@benny-ThinkPad-T14-Gen-3:/home/benny# sbatch --reservation=root_19 --wrap="sleep 2m"
Submitted batch job 320
root@benny-ThinkPad-T14-Gen-3:/home/benny# squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
320 debug wrap root R 0:02 1 z1
Would this be a possible solution for you?
To view your current account associations, you can run the following command:
$ sacctmgr show assoc
Please let me know if this can work.
Thanks
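Since the workaround relies on account=root rather than Groups=, it can help to check which users are actually associated with that account. The following is only a sketch: `users_in_account` is a hypothetical helper name, and it assumes pipe-delimited `account|user` lines such as those produced by `sacctmgr -nP show assoc format=account,user`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): list the users associated with
# one account. Reads "account|user" lines on stdin, for example from:
#   sacctmgr -nP show assoc format=account,user
users_in_account() {
    # $1 = account name. Account-level rows have an empty user field,
    # so they are skipped. Output is sorted and de-duplicated.
    awk -F'|' -v acct="$1" '$1 == acct && $2 != "" { print $2 }' | sort -u
}
```

On a cluster this would be used as `sacctmgr -nP show assoc format=account,user | users_in_account root`. Any user missing from that list would presumably not be able to use a reservation created with account=root.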
Hello, I'm currently on vacation until 17th of April 2023. If needed, you can contact my colleagues from the HPC team at hpc@unige.ch or open a ticket on dw.unige.ch. Best wishes, Yann Sagon

Hi Benny, this workaround should indeed work, thanks. You can close the issue. Best, Yann

Hi, Thanks for letting me know. I will leave the ticket open for now because we are trying to resolve the group=root bug through this ticket. I'm glad you got it to work, and once the original problem is fixed, we will inform you in this ticket. Regards

Hi Yann, Even though the workaround did the trick for you, we wanted to get this fixed for the long term. We put the fix in the 23.02 branch, so it will become available when 23.02.3 is released.

Commits:
* 662011a49c (HEAD -> slurm-23.02, origin/slurm-23.02) Merge branch 'bug16371' into slurm-23.02
|\
| * 1fffbd40a0 Add NEWS
| * 338f5035b7 slurmctld/groups - reverse order for UID array
| * 368b3895f9 slurmctld/groups - fix root ID (0) being ignored
| * 94c927373d slurmctld/groups - remove non-implemented declaration from header
|/

We landed one more commit regarding this bug, only on the master branch (future 23.11). It enables root/SlurmUser to submit jobs to any reservation, even if they're not explicitly allowed.

Commit:
89dae4fdc0 (HEAD -> master, origin/master, origin/HEAD) Allow SlurmUser/root to use reservations without specific permissions

So, all in all, we're now closing this bug, but as resolved/fixed rather than infogiven. Have a good day, Carlos.

Many thanks! |
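To decide whether a given installation still needs the account=root workaround, the installed version can be compared against 23.02.3, the first release listed under "Version Fixed". This is a minimal sketch: `has_group_fix` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): succeed if the given Slurm
# version is at least 23.02.3, the release that carries the Groups=root fix.
has_group_fix() {
    min="23.02.3"
    # With version sort, the smaller of the two versions comes first;
    # the fix is present when that smaller one is the minimum itself.
    [ "$(printf '%s\n' "$min" "$1" | sort -V | head -n1)" = "$min" ]
}
```

For example, `has_group_fix 22.05.8` fails (the version this ticket was filed against), while `has_group_fix 23.02.3` succeeds. The version string would normally be taken from the second field of `scontrol --version` output, assuming it prints a line of the form `slurm 23.02.3`.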