Companion to the syscall-slot audit (spike) and the CI testing-strategy plan

mach.ko syscall slots — implementation plan

Widen FreeBSD’s dynamic syscall-slot band in the NextBSD custom kernel so mach.ko can give each Mach trap its own slot, then drop the single-slot multiplexor. The 2026-05-13 audit disqualified patching the kernel (its option 4.4) because mach.ko had to load on stock FreeBSD. That constraint no longer holds: NextBSD now ships its own NEXTBSD kernel built from nextbsd-kernel/patches/ against releng/15.0, so a syscalls.master patch is a first-class, boot-tested change. This plan supersedes spike §4.4.

2026-06-03. Synthesized from a 4-agent review of nextbsd-redux/freebsd-src@releng/15.0 (syscall mechanism), nextbsd-redux/nextbsd (mach.ko demand), nextbsd-redux/nextbsd-kernel (patch mechanics), and ravynOS/XNU/Darling prior art. Source citations are file:line on releng/15.0 unless noted.

Hard rule: We never modify or push to nextbsd-redux/freebsd-src. The fork stays a clean upstream mirror (synced only by sync-fork.yml on releng/15.0). Every kernel source change, including this one, lives exclusively as a git format-patch diff in nextbsd-kernel/patches/, applied with git apply at build time on the builder. Patches are authored against a throwaway local releng/15.0 checkout used only to produce the diff; that checkout is never the fork and is never pushed.

TL;DR: Add ~32–48 extra lkmnosys slots to the NextBSD kernel via a one-file sys/kern/syscalls.master patch (plus its regenerated companions), landed as a PR in nextbsd-kernel and boot-validated by the PR smoke test. mach.ko already auto-allocates slots and resolves them by sysctl mach.syscall.<name>, so once the band is wider it simply claims one slot per trap and the multiplexor (in nextbsd) can be removed in a follow-up PR. Measured need is ~32 slots; the daemon set is overwhelmingly MIG (rides mach_msg_trap, needs no slots).

Contents

  1. Why kernel-patching is viable now
  2. The FreeBSD slot mechanism (what we patch)
  3. How many slots mach really needs
  4. Prior art: Apple, NextBSD, ravynOS
  5. Decision: widen the band now, separate table later
  6. Effort scoping: the separate table, measured
  7. The kernel patch (concrete)
  8. Two-PR rollout + ticket
  9. Risks & open questions

1. Why kernel-patching is viable now

The spike disqualified option 4.4 on a single hard constraint: “mach.ko must build/load on stock FreeBSD,” so the kernel could not be forked. NextBSD’s architecture has since changed: it now ships its own NEXTBSD kernel, built from nextbsd-kernel/patches/ against releng/15.0.

So a syscalls.master change is now exactly as legitimate as any other NextBSD kernel patch — and it is boot-validated on the PR. The spike’s remaining options (RESERVED-slot claiming, multiplexor) were workarounds for the no-patch constraint; with the constraint gone, the direct fix — more slots — is cleanest.

2. The FreeBSD slot mechanism (what we patch)

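The dynamically-claimable band is exactly ten lkmnosys slots, syscalls 210–219 (sys/kern/syscalls.master:1272–1283; generated into sys/kern/init_sysent.c:279–288). Registration goes through kern_syscall_register() (sys/kern/kern_syscalls.c:124–156): when a module passes NO_SYSCALL, the function scans the table for a free lkmnosys entry and claims it, so the number of lkmnosys rows caps how many dynamic syscalls can be registered at once.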
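The size of the band can be confirmed from the source rather than taken on trust. The following is a minimal sketch, assuming the row layout quoted later in this plan (number, audit event, type flags, then the function name); count_lkmnosys_slots is a hypothetical helper name, not something that exists in the tree.

```shell
# Count the dynamically claimable rows in a syscalls.master file.
# Assumption: rows look like
#   210  AUE_NULL  NODEF|NOTSTATIC  lkmnosys lkmnosys nosys_args int
# so the fourth whitespace-separated field is the function name.
count_lkmnosys_slots() {
    master=$1
    awk '$1 ~ /^[0-9]+$/ && $4 == "lkmnosys" { n++ } END { print n + 0 }' "$master"
}
```

Run as count_lkmnosys_slots sys/kern/syscalls.master from the throwaway checkout; on an unpatched tree the plan expects it to report the ten rows at 210–219.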

The table size is derived, not capped: SYS_MAXSYSCALL = highest syscall + 1 (currently 599; highest real syscall is 598) (generated by sys/tools/syscalls/scripts/syscall_h.lua:75–76), the sysentvec uses .sv_size = SYS_MAXSYSCALL (sys/amd64/amd64/elf_machdep.c:57), and dispatch bounds-checks code >= sv_size on both arches (sys/amd64/amd64/trap.c:1049–1052, sys/arm64/arm64/trap.c:156–159). Growing the table via syscalls.master automatically grows the bound — there is no separate hard cap to bump.
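Because the bound is derived, the effect of a patch can be checked by reading SYS_MAXSYSCALL back out of the regenerated header. This is a sketch under the assumption that the header carries a single "#define SYS_MAXSYSCALL <n>" line; both function names are hypothetical.

```shell
# Print the SYS_MAXSYSCALL value defined in a generated syscall.h.
sys_maxsyscall() {
    awk '$1 == "#define" && $2 == "SYS_MAXSYSCALL" { print $3 }' "$1"
}

# Expected bound after appending <count> rows directly after the last
# existing syscall: <current bound> + <count>.
expected_maxsyscall() {
    echo $(( $1 + $2 ))
}
```

With the figures in this plan, expected_maxsyscall 599 48 gives 647, which is what sys_maxsyscall sys/sys/syscall.h should report once 48 rows are appended at 599–646 and the files are regenerated.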

Critical build detail: In releng/15.0 the generated files (init_sysent.c, sys/sys/syscall.h, sysproto.h, syscalls.c, systrace_args.c, syscall.mk) are checked into the tree. buildkernel compiles the committed init_sysent.c; it does not auto-run the generator. Regeneration is the explicit make sysent target (sys/conf/sysent.mk:24–41 → sys/tools/syscalls/main.lua; note that 15.0 replaced the old makesyscalls.lua). Therefore the patch must edit syscalls.master AND include the regenerated companions (produced by running make sysent locally), exactly as FreeBSD itself commits a syscall change. Editing syscalls.master alone would leave buildkernel compiling the old 10-slot table. This corrects an early assumption that buildkernel regenerates on its own.
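A patch that edits syscalls.master but omits a regenerated companion is easy to produce by accident, so a pre-PR check is cheap insurance. The sketch below looks only at the "diff --git" headers of a patch file; the path list is the one used in the authoring steps of this plan, and missing_companions is a hypothetical helper name.

```shell
# The seven paths this plan says must change together.
REQUIRED_FILES="sys/kern/syscalls.master sys/kern/init_sysent.c \
sys/kern/syscalls.c sys/kern/systrace_args.c \
sys/sys/syscall.h sys/sys/syscall.mk sys/sys/sysproto.h"

# Print every required path that has no "diff --git" header in the patch.
# No output means the patch touches all seven files.
missing_companions() {
    patch=$1
    for f in $REQUIRED_FILES; do
        grep -qxF "diff --git a/$f b/$f" "$patch" || echo "$f"
    done
}
```

An empty result from missing_companions patches/0001-*.patch says only that every file is present in the patch; it does not prove the companions were regenerated from the edited syscalls.master.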

3. How many slots mach really needs

Measured from nextbsd-redux/nextbsd (src/mach_kmod/src/mach_syscall_wire.c:531–568, src/mach_kmod/src/mach_traps.c, src/libmach/mach_traps.c):

Measured target ≈ 32 dedicated slots (refines the spike’s 30–40 estimate downward with code evidence): ~10 current + ~6–8 wide traps that cannot fold + headroom. The full XNU/ravynOS trap table is 59 entries, but ~40 of those are MIG-displaceable here, so the band need not approach 59. Recommendation: provision a 48-slot band (comfortably above the 32 need, leaves ABI headroom so we never hit the wall again).

Crucially, mach.ko registers with NO_SYSCALL (auto-allocate) and publishes each trap’s assigned number at sysctl mach.syscall.<name>; libmach resolves by name, never by hardcoded number. So widening the band is sufficient — no fixed slot numbers are required, and no userland renumbering is needed.
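Since every trap's number is published under mach.syscall, the "one slot per trap" property can be inspected from userland. The following is a sketch that assumes sysctl prints one "mach.syscall.<name>: <number>" line per trap; shared_slots is a hypothetical helper, and whether multiplexed ops appear under this node today is not something this plan states.

```shell
# Read "mach.syscall.<name>: <number>" lines on stdin and print any slot
# number that more than one trap name maps to.
# No output means every published trap has a slot of its own.
shared_slots() {
    awk -F': ' '{ count[$2]++ }
        END { for (n in count) if (count[n] > 1) print n }' | sort -n
}
```

On a running system this would be used as sysctl mach.syscall | shared_slots.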

4. Prior art: Apple, NextBSD, ravynOS

| System | How Mach traps are exposed |
|---|---|
| Apple XNU | A separate mach_trap_table[] (128 entries) indexed by negative trap numbers; x86_64 tags the syscall-number register with a class bit (SYSCALL_CLASS_MACH, 0x01000000) so the trap dispatcher routes to the Mach table instead of the BSD sysent. (osfmk/kern/syscall_sw.c, osfmk/mach/i386/syscall_sw.h) |
| NextBSD (Macy, 2014–15) | Patched syscalls.master to add a dedicated positive Mach band and flattened XNU’s negative table into FreeBSD’s sysent at a reserved offset. The implementations live in sys/compat/mach/. This is the direct ancestor of our approach. |
| ravynOS-main | Inherited the original NextBSD code: syscalls.master reserves a ~120-slot Mach band at 600–720, mapping XNU trap index -N → FreeBSD syscall 600 + N; traps registered via a syscall_helper_data array in sys/compat/mach/mach_module.c. (closest precedent to this plan) |
| ravynOS-darwin (current) | Pivoted to building real Apple XNU (Kernel/xnu/); FreeBSD demoted to userland. They abandoned Mach-on-FreeBSD for the canonical kernel. |
| Darling (Linux) | Can’t add host syscall slots, so it intercepts in userland and bounces to its own LKM + darlingserver, preserving XNU’s negative-numbered table logically. |

Two signals matter for us: (1) the proven Mach-on-FreeBSD precedent (original NextBSD & ravynOS-main) is exactly “patch syscalls.master to add a dedicated Mach band” — validating this plan’s mechanism; and (2) the canonical-fidelity end state is XNU’s separate negative-numbered table, which is where ravynOS ultimately went.
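The two numbering schemes in the table can be made concrete with a little arithmetic. The sketch below uses the constants quoted above (band base 600, class bit 0x01000000); the function names are hypothetical, and the example index -31 is XNU's mach_msg_trap.

```shell
# ravynOS-main style: XNU trap index -N becomes FreeBSD syscall 600 + N.
flat_band_number() {
    echo $(( 600 - ($1) ))
}

# XNU x86_64 style: OR the positive trap number with the Mach class bit.
class_tagged_number() {
    printf '0x%x\n' $(( 0x01000000 | -($1) ))
}
```

For index -31, flat_band_number prints 631 and class_tagged_number prints 0x100001f. The first is an ordinary positive sysent slot; the second is a number the BSD table never sees, because the dispatcher routes on the class bit.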

5. Decision: widen the band now, separate table later

| Approach | Apple-surface fidelity | Debuggability | releng-bump churn | Complexity | Verdict |
|---|---|---|---|---|---|
| Widen the lkmnosys band (more auto-allocated slots; drop the mux) | Medium | High (each trap named in ktrace/dtrace) | Low (pinned releng/15.0; tail-append is additive) | Low | Adopt now |
| Multiplexor in one slot (status quo, scaled) | Low | Low (one opaque syscall) | Low | Medium | Reject (the thing we’re removing) |
| XNU-style separate class-tagged mach_trap_table | High, but moot (libmach resolves by name) | Medium–High | Lowest (own file; tiny dispatch hook) | ~1-day C-only core, but base-kernel patch + first-of-its-kind (§6) | Future / binary-compat only |

Primary recommendation: widen the band now. It is what “increase the limit to what Mach needs” means, it matches the proven NextBSD/ravynOS-main precedent, it keeps per-trap debuggability, and it requires zero userland changes because mach.ko auto-allocates + resolves by sysctl. It directly unblocks dropping the multiplexor.

The XNU-style separate table is the documented ideal end state (Apple-canonical surface, lowest upstream coupling, where ravynOS landed) but it is a much larger kernel change — patching the arch syscall-dispatch path, reimplementing arg mungers, and building libsystem_kernel from XNU’s libsyscall. Out of scope for “raise the limit”; tracked as a future decision. The band-widening work is not wasted if we later migrate — it keeps the stack shipping in the meantime.

Band placement — two clean options (both fine; pick at PR time): (a) tail-append ~48 lkmnosys rows at 599+ (simplest, ABI-additive, the auto-allocator scans the whole table so placement is irrelevant to mach.ko); or (b) mirror ravynOS with a labelled 600–647 Mach band. Option (a) is recommended for minimality; (b) buys nothing extra given we resolve by name, not number.
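For option (a), the rows do not need to be typed by hand. The following sketch emits them in the column layout of the existing 210–219 rows as quoted in the authoring steps; gen_lkmnosys_rows is a hypothetical helper, and the tab separators are an assumption about the file's whitespace.

```shell
# Emit <count> lkmnosys rows starting at syscall number <first>,
# copying the column layout of the existing 210-219 rows.
gen_lkmnosys_rows() {
    first=$1
    count=$2
    i=$first
    while [ "$i" -lt $(( first + count )) ]; do
        printf '%d\tAUE_NULL\tNODEF|NOTSTATIC\tlkmnosys lkmnosys nosys_args int\n' "$i"
        i=$(( i + 1 ))
    done
}
```

gen_lkmnosys_rows 599 48 produces rows 599 through 646, which can be reviewed and then pasted at the tail of sys/kern/syscalls.master in the throwaway checkout.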

6. Effort scoping: the separate table, measured

A follow-up 3-agent scope (2026-06-03) measured the separate-table option, since the “HIGH effort” cell above was an estimate. The work splits into a surprisingly cheap mechanical core and a real strategic cost.

Two designs people conflate — keep them apart:

| Design | What it is | Cost | Each trap its own identity? |
|---|---|---|---|
| (A) Module-internal table (“mux-as-table”) | Convert the existing sys_mach_trap_mux_trap switch into an indexed mach_trap_table[] inside mach.ko, still in the one existing slot, op-dispatched. | ~80–150 LOC, 1–3 days, zero kernel patch | No: still one slot/op; tidies the mux, doesn’t remove slot pressure for wide-arg traps |
| (B) True kernel-hooked table (class-tagged) | The C hook above + a separate Mach number space. | ~1-day core, but a base-kernel MD patch, first-of-its-kind | Yes: unlimited room, Apple-shaped |

Precedent gap is the real cost driver. No shipping FreeBSD-derived project has grafted a separate negative/class-tagged Mach table into the trap path: ravynOS-main flattened Mach into the ordinary positive sysent (the 600–720 band, 600 + −(xnu index)); ravynOS-darwin runs real XNU; Darling stays in userland with its own LKM. So (B) has no reference implementation and must independently clear FreeBSD-specific concerns (Capsicum syscall filtering, ptrace/audit syscall-number assumptions, KLD reload) plus a permanent per-releng rebase of MD code — turning the ~1-day core into multiple engineer-weeks of validation + ongoing maintenance.

And the headline benefit is moot for NextBSD. The separate table’s signature payoff is an Apple-canonical number space so unmodified Apple stubs / XNU libsyscall work. But NextBSD compiles Apple source against its own libmach, which resolves every trap by name via sysctl mach.syscall.<name> and never references a trap by number. So Apple-canonical numbers buy NextBSD nothing it can use — that benefit only matters for binary-compat projects running unmodified Apple dylibs (ravynOS-darwin, Darling). Under either kernel scheme the userland cost is ~one line (libmach already feeds whatever number the sysctl returns to syscall(); a class-tag scheme only needs the num < 0 guard relaxed). (src/libmach/mach_traps.c)

Q: Would skipping the limit-raise and doing only the separate table have the same effect, or be a better way? Functionally both reach the same goal: each Mach trap gets its own identity and the multiplexor goes away. The true table (B) is the more Apple-shaped architecture, but for NextBSD right now it is not a better way: its defining benefit (Apple-canonical numbers / unmodified Apple stubs) is moot because libmach resolves by name, while it costs multiple weeks, first-of-its-kind base-kernel risk, and a permanent MD rebase burden. The cheap variant (A) is only a tidier multiplexor; it does not give per-trap identity or remove slot pressure, so you’d still want a few more slots anyway. Widening the band reaches the same functional effect in days with a syscalls.master patch and full per-trap ktrace/dtrace visibility. Recommendation stands: widen the band now; revisit (B) only if NextBSD later targets running unmodified Apple binaries (binary compat), where Apple-canonical numbers stop being moot.

Q: Does IOKit need libsyscall? No. Apple’s IOKit user API (IOServiceGetMatchingServices, IOConnectCallMethod, …) is Mach IPC: io_* MIG routines carried over mach_msg to the kernel IOKit registry. It rides mach_msg_trap (already wired) plus the MIG-generated IOKit interface, and links against NextBSD’s libmach like every other daemon. It does not require XNU’s libsyscall trap stubs, and it is unaffected by the trap-table decision. (libsyscall sits below IOKit, supplying the raw mach_msg/BSD stubs that libmach already provides.) See the hardware-registry / IOKit plan.

7. The kernel patch (concrete)

nextbsd-kernel is at a clean baseline: patches/series is empty and no Mach/syscall patch exists yet (the 0001-increase-mach-syscall-limit.patch named in the pipeline-plan doc was illustrative — it is not in the repo). Patches are plain git format-patch diffs applied with git apply over patches/series in /usr/src (.github/workflows/build.yml, “Apply patches” step).
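Because the build applies whatever patches/series names, a typo or placeholder in that file only surfaces on the builder. A local sanity check can catch it earlier. This is a sketch, not the CI's actual logic: missing_series_entries is a hypothetical helper, and skipping blank lines and "#" comments is an assumption about the series format.

```shell
# Print every entry in <patches_dir>/series that has no matching file.
# Assumption: one patch filename per line; blank lines and lines that
# start with "#" are ignored.
missing_series_entries() {
    patches_dir=$1
    while IFS= read -r name; do
        case $name in ''|'#'*) continue ;; esac
        [ -f "$patches_dir/$name" ] || echo "$name"
    done < "$patches_dir/series"
}
```

Running missing_series_entries patches from the nextbsd-kernel checkout should print nothing; any filename it prints is listed in series but absent from patches/.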

Authoring steps

Never edit or push the freebsd-src fork. The clone below is a throwaway local checkout used only to generate the patch diff; nothing is ever committed or pushed to it. Use upstream FreeBSD (or a local mirror) at the same releng/15.0 commit the toolchain image bakes.

# 1. THROWAWAY local releng/15.0 checkout (never pushed; not the fork).
#    Use git.freebsd.org or any local mirror at the pinned commit.
git clone -b releng/15.0 https://git.freebsd.org/src.git /tmp/fbsd-src
cd /tmp/fbsd-src && git checkout -b nextbsd-widen-syscall-band

# 2. Edit ONLY syscalls.master: append ~48 lkmnosys rows at the tail (599+),
#    copy-pasting the 210-219 pattern:
#      599  AUE_NULL  NODEF|NOTSTATIC  lkmnosys lkmnosys nosys_args int
#      ... through 646 ...

# 3. Regenerate the committed companions (REQUIRED - buildkernel won't):
make -C sys/kern sysent
#    -> rewrites init_sysent.c, sys/sys/syscall.h (bumps SYS_MAXSYSCALL),
#       sysproto.h, syscalls.c, systrace_args.c, syscall.mk

# 4. Commit syscalls.master + ALL regenerated files together, then:
git add sys/kern/syscalls.master sys/kern/init_sysent.c sys/kern/syscalls.c \
        sys/kern/systrace_args.c sys/sys/syscall.h sys/sys/syscall.mk sys/sys/sysproto.h
git commit -m "kernel: widen lkmnosys dynamic syscall band to 48 slots for mach.ko"
git format-patch -1 -o /path/to/nextbsd-kernel/patches/

# 5. In nextbsd-kernel: add the generated filename as the (first) line of
#    patches/series (the glob picks up the name format-patch produced):
cd /path/to/nextbsd-kernel && basename patches/0001-*.patch >> patches/series

# 6. Verify it applies the way CI will (clean releng/15.0 tree, throwaway):
cd /tmp/fbsd-src && git stash && git checkout releng/15.0 \
  && git apply --check /path/to/nextbsd-kernel/patches/0001-*.patch

No config/NEXTBSD change is needed (a syscalls.master edit requires no kernel options). On the PR: both the amd64 and arm64 legs build the kernel, and the amd64 boot smoke test injects it into the latest continuous ISO and boots it — so a regression that breaks boot is caught before merge.

8. Two-PR rollout + ticket

The two halves are coupled but must land in order, because the nextbsd mux-drop depends on the wider band existing in the kernel it runs on.

  1. Ticket in nextbsd-redux/nextbsd capturing the scope (kernel band-widen + mux-drop), acceptance markers, and linking #168 (which lists “multiplexor removal” as a prerequisite) and this plan.
  2. PR #1 — nextbsd-kernel: widen the band. The syscalls.master patch + regenerated files + series entry. Boot-smoke-validated on the PR. On merge, the kernel continuous release refreshes with the wider table.
  3. PR #2 — nextbsd: drop the multiplexor. Only after #1 merges. In src/mach_kmod: delete the mach_trap_mux sysent + sys_mach_trap_mux_trap switch (mach_traps.c), wire each former mux op as its own NO_SYSCALL registration (mach_syscall_wire.c); in src/libmach/mach_traps.c, resolve each by name (resolve_syscall("task_set_special_port")) instead of mach_trap_mux+op.
  4. Acceptance (the nextbsd boot test): TASK-SPECIAL-PORT-OK + HOST-BOOTSTRAP-OK + BOOTSTRAP-REMOTE-OK; and each of mach.syscall.{task_set_special_port,host_set_special_port,mach_port_move_member} resolves to a dedicated ≥0 slot while mach.syscall.mach_trap_mux is gone.
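The sysctl half of the acceptance criteria can be expressed as a small check. The sketch below is an illustration rather than the boot test itself: check_mux_dropped is a hypothetical name, the SYSCTL override exists only so the function can be exercised off-target, and "dedicated" is interpreted here as the three traps resolving to three distinct numbers.

```shell
# SYSCTL can be overridden for testing; defaults to the system sysctl(8).
SYSCTL=${SYSCTL:-sysctl}

# Succeed only if each named trap resolves to its own slot >= 0 and the
# mach.syscall.mach_trap_mux node no longer exists.
check_mux_dropped() {
    seen=""
    for trap in task_set_special_port host_set_special_port mach_port_move_member; do
        slot=$($SYSCTL -n "mach.syscall.$trap" 2>/dev/null) || return 1
        # Reject empty or non-numeric values (this also rejects negatives).
        case $slot in ''|*[!0-9]*) return 1 ;; esac
        # Reject a slot number already claimed by an earlier trap.
        case " $seen " in *" $slot "*) return 1 ;; esac
        seen="$seen $slot"
    done
    if $SYSCTL -n mach.syscall.mach_trap_mux >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

The boot markers (TASK-SPECIAL-PORT-OK and the others) are emitted by the nextbsd boot test and are outside what this check covers.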

9. Risks & open questions

Implementation plan, 2026-06-03. Supersedes the “patch the kernel” disqualification in the syscall-slot audit spike (2026-05-13), which predated NextBSD’s custom-kernel pipeline. Citations are against nextbsd-redux/freebsd-src@releng/15.0, nextbsd-redux/nextbsd, and nextbsd-redux/nextbsd-kernel as read by the research agents on 2026-06-03.