Supersedes the deferred plan in hwregd busyState / waitQuiet — options (#67) · same rollout as the syscall-slot plan
busyState / waitQuiet — implementation plan
The Apple-shape “right way” to drive kmod autoload: a real device-quiescence signal (IOServiceWaitQuiet / IORegistryEntryGetBusyState / IOKitWaitQuiet) delivered over Mach, replacing FreeBSD’s devctl-and-a-timer heuristic. The 2026 options doc (#67) spec’d this as a four-layer design but deferred it as “requires a kernel fork”. That constraint has lifted: NextBSD ships its own NEXTBSD kernel built from nextbsd-kernel/patches/ (the same path we used to widen the syscall band), so the kernel hook is just another patch, boot-validated by the PR smoke test. This plan makes the deferred design concrete, with one material correction the scoping turned up.
Hard rules: never modify or push nextbsd-redux/freebsd-src (clean upstream mirror); the kernel hook ships as a git format-patch in nextbsd-kernel/patches/, authored against a throwaway upstream checkout. nextbsd-kernel carries patches only: no source tree, no generated files.
The options doc deferred the exact-Apple path for one reason: “custom kernel build … dropping the kernel pkgbase package … forking the one file (subr_bus.c).” Every part of that cost is already paid: NextBSD builds the NEXTBSD kernel from patches against pinned releng/15.0, FreeBSD-kernel-generic is already dropped, and the nextbsd-kernel PR boot-smoke-test (built in the syscall-band work) validates kernel patches in the live ISO. So the subr_bus.c quiescence hook is just another boot-tested nextbsd-kernel patch — not a fork.
The options doc’s sketch was “increment on a match-START hook, decrement on the existing device_attach/device_nomatch eventhandlers.” Scoping subr_bus.c shows that pair does not balance — the completion eventhandlers don’t form a clean pair with a probe-entry hook:
- Multipass: on an early pass, device_probe() fires no nomatch (the invoke is guarded by bus_current_pass == BUS_PASS_DEFAULT), and the device is re-probed on the next pass: N starts, 1 terminal event.
- device_attach()’s early ENXIO (resource_disabled) and DEVICE_ATTACH failure both return before the device_attach eventhandler fires (subr_bus.c ~2590, ~2612). The device matched, so no nomatch fires either: a leak.
- device_nomatch also fires with no preceding start (devclass_driver_deleted ~811, device_gen_nomatch ~5673): underflow.
- DF_DONENOMATCH suppresses repeat nomatch on re-probe: another unbalanced increment.
Fix: wrap device_probe_and_attach() with a balanced start/end pair rather than pairing a probe-entry hook with the completion eventhandlers. device_probe_and_attach() is the single synchronous funnel (probe → attach) for every device that begins matching; emit device_match_start at entry and device_match_end at every return. The counter then means “devices currently inside probe+attach”. It is balanced by construction (every start has exactly one end, regardless of attach-success / nomatch / multipass / attach-failure) and reaches 0 exactly when the tree quiesces. Nested/recursive probe (child buses) just nests start/end and still returns to 0. This is the right busy semantic for autoload and sidesteps every leak path above.
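The balance argument can be checked with a small userland model. This is a sketch, not the subr_bus.c patch: `model_probe_and_attach`, the `outcome` enum, and the counters are hypothetical stand-ins. Only the shape comes from the plan: one start at entry, one end on every return path, and child probes nested inside the parent's window.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcomes standing in for the return paths listed above. */
enum outcome {
    OUT_ATTACHED,      /* probe matched, attach succeeded */
    OUT_NOMATCH,       /* no driver matched */
    OUT_DEFERRED,      /* multipass: re-probed on a later pass */
    OUT_ATTACH_FAILED  /* matched, but attach returned an error */
};

struct model_dev {
    enum outcome      outcome;
    struct model_dev *children;  /* probed only if this device attaches */
    size_t            nchildren;
};

int busy;      /* devices currently inside probe+attach */
int max_busy;  /* deepest nesting observed */

static void match_start(void)
{
    if (++busy > max_busy)
        max_busy = busy;
}

static void match_end(void)
{
    assert(busy > 0);  /* an end without a start would be an underflow */
    --busy;
}

static int probe_and_attach_body(struct model_dev *dev);

/* The wrap: exactly one start and one end, however the body returns. */
int model_probe_and_attach(struct model_dev *dev)
{
    match_start();
    int error = probe_and_attach_body(dev);
    match_end();
    return error;
}

/* Body with several early returns, like the real funnel. */
static int probe_and_attach_body(struct model_dev *dev)
{
    if (dev->outcome == OUT_NOMATCH)
        return ENXIO;
    if (dev->outcome == OUT_DEFERRED)
        return 0;
    if (dev->outcome == OUT_ATTACH_FAILED)
        return EIO;
    /* Attached: child buses probe recursively, nesting start/end. */
    for (size_t i = 0; i < dev->nchildren; i++)
        (void)model_probe_and_attach(&dev->children[i]);
    return 0;
}
```

Because the increment and decrement sit in the wrapper rather than in the body, adding a new early return to the body cannot unbalance the counter. In the kernel patch the same effect needs care at each `return` (or a single-exit restructuring), which is where the plan puts the review effort.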
Kernel hook (nextbsd-kernel patch)
Two files, no regeneration (plain .c/.h, compiled directly by buildkernel, unlike syscalls.master):
- sys/sys/eventhandler.h: declare a balanced pair near the existing newbus events (device_attach/detach/nomatch are declared at eventhandler.h:311-322, not in bus.h): typedef void (*device_match_start_fn)(void *, device_t); + EVENTHANDLER_DECLARE(device_match_start, …), and the same for device_match_end.
- sys/kern/subr_bus.c: EVENTHANDLER_LIST_DEFINE for both (beside :173-176); EVENTHANDLER_DIRECT_INVOKE(device_match_start, dev) at the entry of device_probe_and_attach() (~2549) and device_match_end at every return.
An out-of-tree module can EVENTHANDLER_REGISTER(device_match_start, …) for a newly base-declared eventhandler: EVENTHANDLER_REGISTER is a string-name runtime lookup, so the module needs only the DECLARE in scope (via the patched header); the list symbol lives in the base kernel. (eventhandler.h:139-140)
mach.ko: mach_wait_quiet (nextbsd)
New src/mach_kmod/src/mach_busystate.c (the IOService::_adjustBusy analog), modeled on the existing fork-eventhandler + SYSINIT pattern (kern/task.c:1333,1344) and the sysctl patterns (mach_stats.c:31-57, mach_syscall_wire.c):
- State: static volatile int bus_busy; static int bus_quiesce_gen;
- device_match_start → atomic_add(bus_busy, +1); device_match_end → decrement, and on the 1→0 transition bump bus_quiesce_gen and wakeup(&bus_busy). Register/deregister via SYSINIT/SYSUNINIT at SI_SUB_KLD, storing eventhandler_tags.
- Sysctls: mach.bus.busy + mach.bus.quiesce_gen (under the existing _mach root, not hw.bus, which collides with stock newbus).
- mach_wait_quiet(timeout) as a dedicated blocking syscall, wired exactly like the traps we just added (arg struct in _mach_sysproto.h, guarded wrapper + sysent + wire_one("mach_wait_quiet", …) + sysctl mach.syscall.mach_wait_quiet): tsleep(&bus_busy, PCATCH, "mwquiet", timo) looped until atomic_load(&bus_busy)==0; timeout is a diagnostic backstop only.
hwregd (nextbsd)
Keep the freeze/thaw load batching verbatim (devctl_freeze() → kldload → devctl_thaw(), hwregd.c ~476-516). Replace only the flip trigger: today the backlog→live flip fires on a 250 ms select timeout (HWREGD_BACKLOG_QUIET_MS, ~105-116, flip ~1671-1687). Swap that for the mach.ko quiesced event: block on mach_wait_quiet (or read mach.bus.quiesce_gen) and flip when bus_busy==0. The flip becomes edge-triggered on real kernel quiescence instead of a wall-clock guess, and the 250 ms constant is deleted.
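The counter, the 1→0 edge, and the quiesce_gen check described above can be modeled in userland with C11 atomics. This is a minimal sketch under stated assumptions, not the mach_busystate.c source: the `model_*` names and the snapshot struct are hypothetical, and the blocking `tsleep`/`wakeup` pair is reduced to a boolean "this call was the quiescing edge" so the logic can be exercised without a kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int  bus_busy;         /* devices inside probe+attach */
static atomic_uint bus_quiesce_gen;  /* bumped once per 1 -> 0 edge */

/* Stand-in for the device_match_start handler. */
void model_match_start(void)
{
    atomic_fetch_add(&bus_busy, 1);
}

/*
 * Stand-in for the device_match_end handler. Returns true when this call
 * took the counter from 1 to 0, which is where the kernel side would
 * call wakeup(&bus_busy).
 */
bool model_match_end(void)
{
    int prev = atomic_fetch_sub(&bus_busy, 1);  /* returns the old value */
    assert(prev > 0);                           /* unbalanced end */
    if (prev != 1)
        return false;
    atomic_fetch_add(&bus_quiesce_gen, 1);
    return true;
}

/* What a consumer would read from the two sysctls (hypothetical struct). */
struct quiet_snapshot {
    bool     quiet;
    unsigned gen;
};

struct quiet_snapshot model_snapshot(void)
{
    struct quiet_snapshot s;
    s.gen   = atomic_load(&bus_quiesce_gen);
    s.quiet = atomic_load(&bus_busy) == 0;
    return s;
}

/* Consumer-side guard: has the tree been busy again since the snapshot? */
bool model_needs_resync(struct quiet_snapshot seen)
{
    return atomic_load(&bus_busy) != 0 ||
           atomic_load(&bus_quiesce_gen) != seen.gen;
}
```

A consumer such as hwregd could take a snapshot when it decides to flip and call the guard afterwards; a changed generation means at least one further busy/quiet cycle completed in between, even if the counter reads 0 at both moments.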
libIOKit (nextbsd)
Net-new in the facade (src/libIOKit/IOKit/IOKitLib.h + IOKitLib.c). Today IOServiceWaitQuiet/IORegistryEntryGetBusyState don’t exist, and IOKitWaitQuiet is only a no-op stub in the launchd shims (src/launchd/freebsd-shims/IOKit/IOKitLib.h:56-60):
- IOServiceWaitQuiet / IOKitWaitQuiet(mp, mach_timespec_t*) → call mach_wait_quiet(timeout); map to kIOReturnSuccess/timeout. Re-point the launchctl shim at the real impl.
- IORegistryEntryGetBusyState(…, *busyState) → read mach.bus.busy via sysctlbyname (or a new hwreg_get_busy MIG routine).
Faithful: Apple’s IOService::_adjustBusy increments busyState at match start and waitQuiet blocks until it reaches 0; our match-start/end wrap + mach_wait_quiet-until-0 is the direct analog, and gating hwregd on busyState→0 is more Apple-faithful than the 250 ms timer. Apple delivers device-match lifecycle as IOKit matching notifications over Mach, which the facade already mirrors (Publish/Matched/Terminate → hwregd watch events); this replaces the FreeBSD devctl tap with the IOKit-quiescence-over-Mach surface.
Divergences (documented in the header, like the facade’s other compromises): (1) global vs per-entry — one kernel-global busy counter, so IORegistryEntryGetBusyState(entry) returns the global value regardless of entry (Apple propagates per-IOService busy up to providers); waitQuiet semantics stay close to faithful, per-entry queries are an approximation. (2) io_object_t is an opaque client handle here, not a live kernel Mach port; the service/connection arg is accepted-and-ignored. (3) no kernel IORegistry — “quiescence” is newbus settling surfaced via mach.ko.
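A thin-wrapper shape for the facade might look like the sketch below. Several things here are assumptions, not part of the plan: the plan does not fix the units or return convention of mach_wait_quiet, so the sketch assumes a millisecond timeout (negative meaning no timeout) and an errno-style result with EWOULDBLOCK on timeout; the `model_` and `stub_` names are hypothetical; and the wait function is passed in as a pointer purely so the mapping can be tested without the syscall. The `mach_timespec_t` layout and the two kIOReturn values mirror Apple's public headers.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef int kern_return_t;
typedef struct { unsigned int tv_sec; int tv_nsec; } mach_timespec_t;

#define kIOReturnSuccess 0
#define kIOReturnError   ((kern_return_t)0xe00002bc)
#define kIOReturnTimeout ((kern_return_t)0xe00002d6)

/* Assumed convention: 0 = quiet, EWOULDBLOCK = timed out, else errno. */
typedef int (*wait_quiet_fn)(int timeout_ms);

/* Round a Mach timespec up to whole ms; NULL maps to -1 (no timeout). */
int timespec_to_ms(const mach_timespec_t *ts)
{
    if (ts == NULL)
        return -1;
    uint64_t nsec = ts->tv_nsec > 0 ? (uint64_t)ts->tv_nsec : 0;
    uint64_t ms = (uint64_t)ts->tv_sec * 1000u + (nsec + 999999u) / 1000000u;
    return ms > (uint64_t)INT_MAX ? INT_MAX : (int)ms;
}

/* The port argument is accepted and ignored, matching divergence (2). */
kern_return_t model_IOKitWaitQuiet(unsigned int port, wait_quiet_fn wait_fn,
                                   const mach_timespec_t *wait_time)
{
    (void)port;
    assert(wait_fn != NULL);
    int err = wait_fn(timespec_to_ms(wait_time));
    if (err == 0)
        return kIOReturnSuccess;
    if (err == EWOULDBLOCK)
        return kIOReturnTimeout;
    return kIOReturnError;
}

/* Test stand-ins for the syscall. */
int stub_quiet(int timeout_ms)   { (void)timeout_ms; return 0; }
int stub_timeout(int timeout_ms) { (void)timeout_ms; return EWOULDBLOCK; }
int stub_intr(int timeout_ms)    { (void)timeout_ms; return EINTR; }
```

Under divergence (1), a matching IORegistryEntryGetBusyState wrapper would ignore its entry argument in the same way and report the single global counter.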
| Layer | Work | Estimate |
|---|---|---|
| Kernel hook | ~6–12 LOC across eventhandler.h + subr_bus.c; the care is the balanced start/end placement across device_probe_and_attach()’s return paths + recursion | ~0.5 d hook, ~2–3 d for a provably-balanced counter + tests |
| mach.ko | new mach_busystate.c (counter, eventhandlers, sysctls, mach_wait_quiet via the established wiring) | ~1–1.5 d, ~160–200 LOC, low risk (templates exist) |
| hwregd | swap the 250 ms flip for the quiesced event; freeze/thaw unchanged | ~1–1.5 d |
| libIOKit | 3 thin wrappers + mach_timespec_t typedef; re-point launchctl stub | ~0.5–1 d |
Whole feature ≈ 1–1.5 engineer-weeks; the dominant cost is the balanced-counter design, not the mechanics.
Same shape as the syscall work — kernel capability first, then the consumer:
1. nextbsd-kernel: the quiescence hook. The eventhandler.h + subr_bus.c patch (balanced device_match_start/device_match_end wrapping device_probe_and_attach()) in patches/ + series. Boot-smoke-validated: the widened-kernel pattern proves the patched kernel still boots cleanly (the hook is inert without a consumer). On merge, kernel continuous refreshes.
2. nextbsd: consume it. After PR #1 merges + continuous refreshes: mach_busystate.c (counter + mach_wait_quiet), the hwregd flip swap, and the libIOKit APIs. Boot-test asserts the new behavior, e.g. mach.bus.busy returns to 0, a mach_wait_quiet client returns, autoload still produces HWREG-AUTOLOAD-OK, and IOKitWaitQuiet works.
- device_probe_and_attach() entry/exit is the design that balances; verify against the enumerated paths (multipass, attach-failure, driver-deletion, recursion). A leak would hang waitQuiet; the timeout backstop prevents a permanent stall, but a leak must not happen in steady state.
- Between mach_wait_quiet return and the next attach, the quiesce_gen counter is the re-sync guard.
- (EVFILT_MACHPORT: another mach.ko kernel hook + Mach delivery, also now viable on the same custom-kernel basis.)
Implementation plan, 2026-06-03. Makes concrete (and corrects the counter-balance design of) the deferred four-layer plan in hwregd busyState / waitQuiet — options (#67), now that NextBSD ships its own kernel. Citations against nextbsd-redux/freebsd-src@releng/15.0, nextbsd-redux/nextbsd, and Apple XNU IOService as read by the scoping agents.