Ticket 23659 - Add PGID task plugin for FreeBSD
Summary: Add PGID task plugin for FreeBSD
Status: OPEN
Alias: None
Product: Slurm
Classification: Unclassified
Component: Other
Version: 23.11.7
Hardware: Other Other
: C - Contributions
Assignee: Tim McMullan
QA Contact:
URL:
Depends on:
Blocks:
 
Reported: 2025-09-06 06:28 MDT by Rikka Göring
Modified: 2025-10-16 06:56 MDT

See Also:
Site: -Other-


Attachments
Adds a "pgid" task plugin for FreeBSD that manages job steps using process groups. Provides an alternative to Linux cgroups for resource tracking. (7.33 KB, patch)
2025-09-06 06:28 MDT, Rikka Göring
23659 25.11 v2 (50.08 KB, patch)
2025-09-29 13:48 MDT, Tim McMullan

Description Rikka Göring 2025-09-06 06:28:11 MDT
Created attachment 43025
Adds a "pgid" task plugin for FreeBSD that manages job steps using process groups. Provides an alternative to Linux cgroups for resource tracking.

This patch introduces a new "pgid" task plugin for FreeBSD that manages
process groups (PGID) as an alternative to cgroups. Since FreeBSD does
not provide Linux cgroups, this plugin ensures proper tracking and
management of job steps using native process group semantics.

The plugin integrates with Slurm's task management interface and allows
FreeBSD systems to run workloads without requiring Linux-specific
features.

Original patch was developed for the FreeBSD port of Slurm to improve
usability on non-Linux platforms.
Comment 1 Rikka Göring 2025-09-13 18:31:28 MDT
Following up on this submission. If more detail, logs, or updates against current master would be useful, I’m happy to provide them. Thanks for looking into this.
Comment 2 Tim McMullan 2025-09-29 13:48:54 MDT
Created attachment 43174
23659 25.11 v2

I spent some time looking at this and, in the process, started to clean it up a bit so I had a better shot at seeing what it is doing for FreeBSD.

I'm still looking into it a bit, but will keep you up to date!

Thanks!
--Tim
Comment 3 Rikka Göring 2025-09-30 14:40:15 MDT
Thanks, Tim - much appreciated!
I’ve pulled attachment 43174 (your v2 of the patch) and will validate it on FreeBSD 14.3 across a few environments (bare metal, jail, and poudriere jail) with Slurm 25.11-rc.
If I hit anything FreeBSD-specific (e.g., setpgid vs setpgrp, killpg, jail process visibility, kinfo_proc quirks), I’ll report with logs and a minimal reproducer. Otherwise I’ll post results and any small nits.

Thanks again for picking this up!
Comment 4 Rikka Göring 2025-10-03 08:38:39 MDT
Thanks a lot again for the updated patch!
It may take me a few days before I can give proper testing feedback. In order to reliably test this version, I first need to finish upgrading the FreeBSD port sysutils/slurm-wlm, which I’m currently in the middle of.

Sorry for the delay — I’ll get back with results as soon as I have the port upgrade in place and can test the patch under real conditions.
Comment 5 Tim McMullan 2025-10-03 09:23:21 MDT
(In reply to Rikka Göring from comment #4)

No problem, thanks for the update and for testing things!

I mentioned in the previous ticket that in 24.05+ you should be able to disable cgroups without any patches by just adding an option in cgroup.conf. Since that comment, I made what should be another improvement for 25.11+ (https://github.com/SchedMD/slurm/commit/cbcc850de7): you shouldn't have to create a cgroup.conf file at all, and Slurm will simply disable the plugin on startup.

I hope this helps with future updates to the FreeBSD port!

Thanks again!
--Tim
Comment 6 Rikka Göring 2025-10-03 11:10:32 MDT
(In reply to Tim McMullan from comment #5)

Thanks for pointing that out. That’s really helpful context!
In fact, your earlier note about being able to disable cgroups in 24.05+ was the reason I decided to prioritize upgrading the port rather than polishing the current version further. It made more sense to move forward and benefit from those upstream improvements, rather than spending extra time maintaining local workarounds.

I’ll keep the new 25.11+ improvement in mind as well — that’ll simplify things even more for FreeBSD users going forward.

Once I’ve finished updating the port to 25.05.3 I’ll circle back to testing the PGID patch and share the results here.
Comment 7 Rikka Göring 2025-10-06 21:57:19 MDT
I’ve tested the v2 patch locally — both the build and basic functional checks are green on my side. Everything integrates cleanly into the 25.05.3 port, and the plugin behaves as expected.

I can see and really appreciate the cleanup you did — the refactoring makes it much more consistent with Slurm’s current plugin style. Looks good to me!

Thanks again for polishing this up!
Comment 8 Tim McMullan 2025-10-14 14:26:56 MDT
Hi again!

I've been looking at this patch and the existing code, trying to see what the plugin does differently from just running without it, and I think I see some reasonable attempts at setting the pgid in the code already. I have to admit I'm not as familiar with this kind of process control, so I was hoping you could expand on what this plugin does and provides for the FreeBSD port.

Thanks for your help and understanding on this!
--Tim
Comment 9 Rikka Göring 2025-10-16 06:56:48 MDT
That’s a good point — there is already some PGID-related handling in Slurm’s core task/step logic, and it’s true that basic process-group creation and signal propagation happen even without a dedicated plugin.
In fact, those existing PGID mechanisms in the main code are exactly what made me choose PGID as the basis for a FreeBSD task plugin to replace cgroups.
The difference between what’s already implemented in Slurm and what this plugin provides mostly comes down to reliability, visibility, and explicit control of that containment on FreeBSD.

Here’s what the existing code already does from what I can see:
- In launch_task() and related functions, Slurm ensures each step leader becomes a process-group leader, and signals like SIGINT or SIGTERM are fanned out to that PGID with killpg().
- Since 24.05, and even more so in 25.11, Slurm can operate without cgroups entirely; the generic task code falls back to this lightweight PGID approach automatically.
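
As a rough illustration of the mechanism in the two points above, here is a minimal sketch in plain POSIX C. It is not Slurm code: the helper names spawn_step_leader() and signal_step() are made up for this example, and the child simply idles in pause() instead of running a real task. It only shows the general pattern of making a child the leader of its own process group and then signalling the whole group with killpg().

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a Slurm function): fork a child and make it
 * the leader of a new process group whose PGID equals the child's PID.
 * Returns the PGID in the parent, or -1 on failure.
 */
static pid_t spawn_step_leader(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: become group leader, then idle like a sleeping task. */
		(void) setpgid(0, 0);
		for (;;)
			pause();
	}

	/*
	 * Parent: set the group as well, so the group exists before any
	 * signal is sent, whichever side happens to run first.
	 */
	if (setpgid(pid, pid) < 0) {
		kill(pid, SIGKILL);
		waitpid(pid, NULL, 0);
		return -1;
	}
	return pid;
}

/*
 * Hypothetical helper: deliver sig to every member of the group, then
 * reap the group leader. Returns the signal that terminated the leader,
 * or -1 on error or if the leader was not killed by a signal.
 */
static int signal_step(pid_t pgid, int sig)
{
	int status;

	/* POSIX leaves killpg() undefined for a process group ID <= 1. */
	assert(pgid > 1);

	if (killpg(pgid, sig) < 0)
		return -1;
	if (waitpid(pgid, &status, 0) != pgid)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

A caller would do something like pid_t pgid = spawn_step_leader(); followed later by signal_step(pgid, SIGTERM);. Any processes the leader forked in the meantime inherit its process group and receive the same signal, unless they have moved themselves to another group or session.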

What the pgid task plugin adds on top of that:
1. Explicit containment per step – The plugin explicitly creates or joins a per-step PGID during launch and records it in the step state, rather than relying on the implicit behavior in generic task handling.
2. Centralized signal propagation – All step signals are consistently routed through the plugin via killpg(), ensuring complete delivery to the process group.
3. Predictable behavior on FreeBSD – By encapsulating the PGID logic in a dedicated task plugin, the launch, join, and signal paths are unified and easier to debug or extend later (for example, to add procctl(PROC_REAP_*) support).
4. Future extensibility – The plugin provides a natural place to integrate FreeBSD-specific enhancements such as the process reaper, jail-aware cleanup, or improved accounting, without touching generic task code.
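
To make the procctl(PROC_REAP_*) idea from points 3 and 4 more concrete, here is a minimal sketch of what reaper-based cleanup could look like. It is not part of the attached patch: the helper names step_reaper_acquire() and step_reaper_kill() are hypothetical, and the FreeBSD branch assumes the procctl(2) interface from <sys/procctl.h> (PROC_REAP_ACQUIRE, PROC_REAP_KILL, struct procctl_reaper_kill). On other platforms the helpers just fail with ENOSYS.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef __FreeBSD__
#include <sys/procctl.h>
#include <sys/wait.h>	/* idtype_t, P_PID */
#endif

/*
 * Hypothetical helper: make the calling process the reaper for all of
 * its future descendants. Returns 0 on success, -1 with errno set.
 */
static int step_reaper_acquire(void)
{
#ifdef __FreeBSD__
	return procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL);
#else
	/* Assumption: no equivalent is attempted on other platforms. */
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Hypothetical helper: send sig to every descendant of the calling
 * reaper. Returns the number of processes signalled (0 if there were
 * none), or -1 with errno set.
 */
static int step_reaper_kill(int sig)
{
#ifdef __FreeBSD__
	/* rk_flags == 0: target all descendants, not only direct children. */
	struct procctl_reaper_kill rk = { .rk_sig = sig, .rk_flags = 0 };

	if (procctl(P_PID, getpid(), PROC_REAP_KILL, &rk) < 0)
		return (errno == ESRCH) ? 0 : -1;
	return (int) rk.rk_killed;
#else
	(void) sig;
	errno = ENOSYS;
	return -1;
#endif
}
```

The point of such a design would be that reaper membership follows the process tree rather than the process group, so a task that leaves the step's PGID with setsid() or setpgid() would still be reachable by step_reaper_kill(), whereas killpg() alone would miss it.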

So in short:
⇒ The existing Slurm code handles basic process-group creation and signaling.
⇒ The pgid plugin makes that behavior explicit and consistent for FreeBSD, laying the groundwork for more complete containment and cleanup features in future iterations.

I hope that clarifies it.
If it helps, I can also provide a short technical summary of how the PGID logic is integrated into the step lifecycle, or outline the planned next steps such as adding reaper-based cleanup.