| Summary: | Add PGID task plugin for FreeBSD | ||
|---|---|---|---|
| Product: | Slurm | Reporter: | Rikka Göring <rikka.goering> |
| Component: | Other | Assignee: | Tim McMullan <mcmullan> |
| Status: | OPEN --- | QA Contact: | |
| Severity: | C - Contributions | ||
| Priority: | --- | CC: | mcmullan |
| Version: | 23.11.7 | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Site: | -Other- | Slinky Site: | --- |
| Alineos Sites: | --- | Atos/Eviden Sites: | --- |
| Confidential Site: | --- | Coreweave sites: | --- |
| Cray Sites: | --- | DS9 clusters: | --- |
| Google sites: | --- | HPCnow Sites: | --- |
| HPE Sites: | --- | IBM Sites: | --- |
| NOAA SIte: | --- | NoveTech Sites: | --- |
| Nvidia HWinf-CS Sites: | --- | OCF Sites: | --- |
| Recursion Pharma Sites: | --- | SFW Sites: | --- |
| SNIC sites: | --- | Tzag Elita Sites: | --- |
| Linux Distro: | --- | Machine Name: | |
| CLE Version: | Version Fixed: | ||
| Target Release: | --- | DevPrio: | --- |
| Emory-Cloud Sites: | --- | ||
| Attachments: | Adds a "pgid" task plugin for FreeBSD that manages job steps using process groups. Provides an alternative to Linux cgroups for resource tracking. | ||
| | 23659 25.11 v2 | ||
Following up on this submission. If more detail, logs, or updates against current master would be useful, I’m happy to provide them. Thanks for looking into this.

Created attachment 43174
23659 25.11 v2
I spent some time looking at this and in the process started to clean it up a bit so I had a better shot of seeing what it is doing for FreeBSD.
I'm still looking into it a bit, but will keep you up to date!
Thanks!
--Tim
Thanks, Tim - much appreciated!
I’ve pulled attachment 43174 (your v2 of the patch) and will validate it on FreeBSD 14.3 across a few environments (bare metal, jail, and poudriere jail) with Slurm 25.11-rc.
If I hit anything FreeBSD-specific (e.g., setpgid vs setpgrp, killpg, jail process visibility, kinfo_proc quirks), I’ll report with logs and a minimal reproducer. Otherwise I’ll post results and any small nits.
Thanks again for picking this up!
Thanks a lot again for the updated patch!
It may take me a few days before I can give proper testing feedback. In order to reliably test this version, I first need to finish upgrading the FreeBSD port sysutils/slurm-wlm, which I’m currently in the middle of.
Sorry for the delay - I’ll get back with results as soon as I have the port upgrade in place and can test the patch under real conditions.

(In reply to Rikka Göring from comment #4)
No problem, thanks for the update and for testing things!
I mentioned in the previous ticket that in 24.05+ you should be able to disable cgroups without any patches by just adding an option in cgroup.conf. Since that comment I made what should be another improvement for 25.11+ (https://github.com/SchedMD/slurm/commit/cbcc850de7) where you shouldn't have to make a cgroup.conf file at all and it will simply disable the plugin on startup.
I hope this helps with future updates to the FreeBSD port!
Thanks again!
--Tim

(In reply to Tim McMullan from comment #5)
Thanks for pointing that out. That’s really helpful context!
In fact, your earlier note about being able to disable cgroups in 24.05+ was the reason I decided to prioritize upgrading the port rather than polishing the current version further. It made more sense to move forward and benefit from those upstream improvements, rather than spending extra time maintaining local workarounds.
I’ll keep the new 25.11+ improvement in mind as well - that’ll simplify things even more for FreeBSD users going forward.
Once I’ve finished updating the port to 25.05.3 I’ll circle back to testing the PGID patch and share the results here.

I’ve tested the v2 patch locally - both the build and basic functional checks are green on my side. Everything integrates cleanly into the 25.05.3 port, and the plugin behaves as expected.
I can see and really appreciate the cleanup you did - the refactoring makes it much more consistent with Slurm’s current plugin style.
Looks good to me! Thanks again for polishing this up!

Hi again!
I've been looking at this and the code trying to see what it may be doing differently to just running without it and I think I see some reasonable attempts at setting the pgid in the code already.
I have to admit I'm not as familiar with this kind of process control, but I was hoping that you would be able to expand on what this is doing/providing for the FreeBSD port?
Thanks for your help and understanding on this!
--Tim

That’s a good point - there is already some PGID-related handling in Slurm’s core task/step logic, and it’s true that basic process-group creation and signal propagation happen even without a dedicated plugin. In fact, those existing PGID mechanisms in the main code are exactly what made me choose PGID as the basis for a FreeBSD task plugin to replace cgroups.
The difference between what’s already implemented in Slurm and what this plugin provides mostly comes down to reliability, visibility, and explicit control of that containment on FreeBSD.
Here’s what the existing code already does from what I can see:
- In launch_task() and related functions, Slurm ensures each step leader becomes a process-group leader, and signals like SIGINT or SIGTERM are fanned out to that PGID with killpg().
- Since 24.05+, and even more so in 25.11, Slurm can operate without cgroups entirely; the generic task code falls back to this lightweight PGID approach automatically.
What the pgid task plugin adds on top of that:
1. Explicit containment per step – The plugin clearly creates or joins a per-step PGID during launch and records it in the step state, rather than relying on the implicit behavior in generic task handling.
2. Centralized signal propagation – All step signals are consistently routed through the plugin via killpg(), ensuring complete delivery to the process group.
3. Predictable behavior on FreeBSD – By encapsulating the PGID logic in a dedicated task plugin, the launch, join, and signal paths are unified and easier to debug or extend later (for example, to add procctl(PROC_REAP_*) support).
4. Future extensibility – The plugin provides a natural place to integrate FreeBSD-specific enhancements such as the process reaper, jail-aware cleanup, or improved accounting, without touching generic task code.
So in short:
⇒ The existing Slurm code handles basic process-group creation and signaling.
⇒ The pgid plugin makes that behavior explicit and consistent for FreeBSD, laying the groundwork for more complete containment and cleanup features in future iterations.
I hope that clarifies it. If it helps, I can also provide a short technical summary of how the PGID logic is integrated into the step lifecycle, or outline the planned next steps such as adding reaper-based cleanup.
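
For readers less familiar with process groups, the launch and signal pattern described in the comment above can be sketched in plain POSIX C. This is only an illustration, not code from the patch or from Slurm: the helper names `step_launch`, `step_signal`, and `step_wait` are hypothetical, and the idle child merely stands in for a real step leader.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not a Slurm function): fork a stand-in "step leader"
 * in its own process group. Returns the leader's PID, which is also the new
 * PGID, or -1 on error. */
static pid_t step_launch(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: become leader of a new process group, then idle. */
		if (setpgid(0, 0) < 0)
			_exit(1);
		for (;;)
			pause();
	}
	/* Parent: repeat the call so the group exists no matter which side
	 * runs first. EACCES only means the child has already called exec. */
	if (setpgid(pid, pid) < 0 && errno != EACCES) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}
	return pid;
}

/* Hypothetical helper: deliver sig to every process in the step's group. */
static int step_signal(pid_t pgid, int sig)
{
	/* 0 and 1 would address our own group or init's group. */
	assert(pgid > 1);
	return killpg(pgid, sig);
}

/* Hypothetical helper: reap the leader. Returns the terminating signal,
 * 0 if the leader exited normally, or -1 on error. */
static int step_wait(pid_t leader)
{
	int status;

	if (waitpid(leader, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A caller would keep the value returned by `step_launch()` as the step's PGID, pass it to `step_signal(pgid, SIGTERM)` when the step is cancelled, and then call `step_wait(pgid)` to reap the leader. Calling `setpgid()` on both sides of the `fork()` is the usual way to avoid a race between parent and child.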
Created attachment 43025
Adds a "pgid" task plugin for FreeBSD that manages job steps using process groups. Provides an alternative to Linux cgroups for resource tracking.

This patch introduces a new "pgid" task plugin for FreeBSD that manages process groups (PGID) as an alternative to cgroups. Since FreeBSD does not provide Linux cgroups, this plugin ensures proper tracking and management of job steps using native process group semantics. The plugin integrates with Slurm's task management interface and allows FreeBSD systems to run workloads without requiring Linux-specific features.
Original patch was developed for the FreeBSD port of Slurm to improve usability on non-Linux platforms.
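
As a rough illustration of what tracking a job step through a process group can look like, the sketch below probes a group with signal 0, which performs the existence and permission checks without delivering anything. It is an assumption about one possible approach, not the logic in the attached patch; `spawn_idle_group` and `pgid_has_members` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start an idle child in its own process group so
 * there is something to track. Returns the new PGID, or -1 on error. */
static pid_t spawn_idle_group(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		setpgid(0, 0);
		for (;;)
			pause();
	}
	/* Set the group from the parent as well to avoid a startup race. */
	if (setpgid(pid, pid) < 0 && errno != EACCES) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}
	return pid;
}

/* Hypothetical helper: returns 1 if at least one process is still in the
 * group, 0 if the group is empty, and -1 on an unexpected error. */
static int pgid_has_members(pid_t pgid)
{
	assert(pgid > 1);
	if (killpg(pgid, 0) == 0)
		return 1;
	if (errno == EPERM)
		return 1;	/* members exist, but we may not signal them */
	if (errno == ESRCH)
		return 0;	/* no process left in this group */
	return -1;
}
```

A cleanup path could send `SIGKILL` with `killpg()`, reap the leader, and then poll `pgid_has_members()` until it reports 0. One limitation of this style of tracking is that a process which calls `setsid()` or `setpgid()` leaves the group and is no longer visible to the probe, which is one reason process groups offer weaker containment than cgroups.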