Issue #67 · module-only, no kernel fork, no wall-clock heuristics, stock FreeBSD drivers untouched · grounded in freebsd-src
busyState/waitQuiet path requires forking the kernel and is deferred; it is fully documented in the deferred plan at the bottom for future consideration. Reviewing this doc before any implementation begins.
IOServiceWaitQuiet / IORegistryEntryGetBusyState / IOKitWaitQuiet)? No.
libIOKit change.
IOKitWaitQuiet, while IORegistryEntryGetBusyState per entry would be a global approximation (no per-IOService busy count exists without a match-start hook).
busyState + exact IOServiceWaitQuiet / IORegistryEntryGetBusyState come only with the deferred kernel-fork plan (its layer 4).
IOService busyState parity module-only = NO.
Exact Apple IOService busyState parity is NOT achievable from a loadable module without forking the kernel. busyState's defining semantic is an increment-at-match-START: the counter goes up the instant a device begins matching, before any probe completes, and that exact window is precisely what stock FreeBSD does not expose.
Two of the four research threads (both with verdict not-feasible-module-only) prove this independently:
(1) newbus has exactly three lifecycle eventhandlers, device_attach, device_detach, and device_nomatch (subr_bus.c:173-175, kern_devctl.c:169-173), and all three fire at COMPLETION only; there is no match-start eventhandler and no SDT probe at device_add_child or device_probe_and_attach entry.
(2) Polling newbus state cannot reconstruct busy==0 either, because a permanently driverless device (probed, nothing matched) sits at DS_NOTPRESENT, byte-for-byte identical to a never-probed or mid-probe device; the only disambiguating bit, dev->flags & DF_DONENOMATCH, lives in the private struct _device with no KPI accessor (device_get_flags() returns a different field).
The two routes that CAN see exact match-start are both off the table for production. kobj method "swizzling" of DEVICE_PROBE/DEVICE_ATTACH works mechanically from a module (the cache stores a pointer into cls->methods[], so a func-pointer overwrite takes effect with no invalidation) but is lock-unsafe against concurrent dispatch, mutates private internals, and must chase every future kldload'd driver: exactly the "destabilizes stock driver matching" outcome that is forbidden. DTrace FBT on device_probe_and_attach:entry is an explicitly unstable instrumentation facility, not a shippable KPI.
The honest conclusion: ship a wall-clock-free QUIESCENCE-by-COMPLETION signal (the inverse of busyState, but with no timer), not busyState itself.
The actual 60s autoload bug is independent of any busyState work and is already fixed in pure userland hwregd: src/hwregd/hwregd.c drains the queued /dev/devctl backlog, calls devctl_freeze() (the kernel's real probe-batching primitive — DEV_FREEZE/DEV_THAW ioctls in subr_bus.c, wrapped by libdevctl), kldloads each unique module under the freeze, then devctl_thaw() (hwregd.c:461-509). This replaced the old 60s wall-clock workaround. The one remaining wall-clock smell is the 250ms initial-backlog quiet window at hwregd.c:105-111 used to decide 'backlog drained.' That can and should be replaced with an event precondition — drain the non-blocking devctl socket until read() returns EWOULDBLOCK (queue empty) before the freeze/kldload/thaw flip — making the whole boot fix fully timer-free with no kernel module and no fork. This ships regardless of whether any busyState layer is ever built.
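The freeze, load, thaw ordering described above can be sketched as follows. This is a minimal illustration, not the code in hwregd.c: the `load_ops` seam, `load_batch_frozen`, and the recording fakes are hypothetical names introduced here so the ordering can be exercised without a FreeBSD kernel. In the daemon the three callbacks would presumably wrap devctl_freeze(), kldload(), and devctl_thaw().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical seam: stands in for devctl_freeze(), kldload() and
 * devctl_thaw(). Each callback returns 0 on success in this sketch. */
struct load_ops {
    int (*freeze)(void *ctx);
    int (*load)(void *ctx, const char *module);
    int (*thaw)(void *ctx);
    void *ctx;
};

/* Returns nonzero if `name` already appears in names[0..n). */
static int seen_before(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Load each unique module once while probing is frozen, then thaw.
 * Returns the number of modules loaded, or -1 if freeze or thaw failed.
 * A failed load is skipped rather than aborting the batch, so the thaw
 * always runs once the freeze has succeeded.
 */
static int load_batch_frozen(const struct load_ops *ops,
                             const char *const *modules, size_t count)
{
    int loaded = 0;

    assert(ops != NULL);
    if (ops->freeze(ops->ctx) != 0)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (seen_before(modules, i, modules[i]))
            continue;                 /* duplicate request, already handled */
        if (ops->load(ops->ctx, modules[i]) == 0)
            loaded++;
    }
    if (ops->thaw(ops->ctx) != 0)
        return -1;
    return loaded;
}

/* Recording fakes, for demonstration only. */
struct trace { char log[128]; };

static void trace_add(struct trace *t, const char *s)
{
    strncat(t->log, s, sizeof(t->log) - strlen(t->log) - 1);
}

static int fake_freeze(void *ctx) { trace_add(ctx, "freeze;"); return 0; }
static int fake_thaw(void *ctx)   { trace_add(ctx, "thaw;");   return 0; }
static int fake_load(void *ctx, const char *module)
{
    trace_add(ctx, "load ");
    trace_add(ctx, module);
    trace_add(ctx, ";");
    return 0;
}
```

Calling `load_batch_frozen` with the fakes and a list containing a repeated module name records one freeze, one load per unique name, and one thaw, in that order. Skipping a failed load instead of returning early is a design choice of this sketch, made so that a bad module never leaves the bus frozen.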
devmatch. Option 1 is essentially "be devmatch, done properly in a daemon." The match+load mechanism is identical to FreeBSD's rc.d/devmatch; it stays on the FreeBSD rails (no busyState; that concept is purely Apple's, in the deferred plan).
| | FreeBSD rc.d/devmatch | Option 1 (hwregd) |
|---|---|---|
| Match | devmatch walks the tree, matches pnpinfo against linker.hints | same, against merged /boot/kernel+/boot/modules hints |
| Load | devctl freeze → kldload each → devctl thaw | same KPI — devctl_freeze() → kldload → devctl_thaw() |
| No wall-clock | gated by rcorder position (no timer) | gated by draining the devctl backlog to empty / EWOULDBLOCK (no timer) |
| busyState / waitQuiet | none (FreeBSD has no such concept) | none |
| Hotplug | devd re-invokes devmatch per nomatch event | hwregd live-mode handles each ? (nomatch) event inline |
The one genuine difference is forced by our environment: devmatch is gated by rcorder ("after filesystems mounted, before netif") — we have no rc.d (launchd, RunAtLoad), so Option 1 substitutes an equivalent event precondition (drain the devctl backlog to empty) for a script position. Where Option 1 improves on devmatch (still faithful, just better implementation): devmatch re-execs the whole rc script and re-parses linker.hints on every hotplug event (its own conf flags this as suboptimal); hwregd is a long-lived daemon that keeps hints parsed in memory and feeds consumers (ipconfigd, future HardwareMatch) the +attach stream over Mach pub/sub.
Option 1 (pure-userland hwregd freeze/thaw)
Summary. Don't build a busyState layer in the kernel. The actual boot bug (the 60s autoload stall) is already solved in userland hwregd via DEV_FREEZE/DEV_THAW batching around the initial devctl-backlog kldload pass. Finish hardening that (remove the 250ms quiet-window, gate the flip on a real precondition) and ship it. No module, no fork, nothing touches driver matching.
How it works. All changes live in src/hwregd/hwregd.c. The daemon already drains the queued /dev/devctl backlog, collects unique module names, calls devctl_freeze(), kldloads each, then devctl_thaw() (hwregd.c:461-509) — freeze/thaw is the kernel's real probe-batching primitive (subr_bus.c DEV_FREEZE/DEV_THAW ioctls, wrapped by libdevctl). The one remaining wall-clock smell is the 250ms initial-backlog quiet window (hwregd.c:105-111) used to decide 'backlog drained, flip to live mode.' Replace that timer with an event precondition: drain until read() on the non-blocking devctl socket returns EWOULDBLOCK (queue empty) rather than waiting a fixed 250ms. No kernel component (mach.ko / libIOKit) changes.
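The drain-to-empty precondition can be sketched like this. It is a hedged illustration rather than the actual hwregd change: `drain_until_empty`, `set_nonblocking`, and the `on_data` callback are hypothetical names, and the descriptor is assumed to be whatever non-blocking devctl descriptor the daemon already holds.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/*
 * Read from a non-blocking descriptor until the queue is empty.
 * Returns the number of bytes consumed once read() reports
 * EAGAIN/EWOULDBLOCK, or -1 on end-of-file or a real error.
 * `on_data` is a hypothetical hook standing in for the daemon's
 * event parser; it may be NULL.
 */
static long drain_until_empty(int fd,
                              void (*on_data)(const char *buf, size_t len,
                                              void *ctx),
                              void *ctx)
{
    char buf[4096];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            if (on_data != NULL)
                on_data(buf, (size_t)n, ctx);
            total += n;
            continue;
        }
        if (n == 0)
            return -1;              /* peer closed: not the same as empty */
        if (errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;           /* queue empty: backlog is drained */
        return -1;                  /* genuine I/O error */
    }
}
```

The only exit that means "backlog drained" is the EAGAIN/EWOULDBLOCK branch, so the flip to live mode is gated on an observed empty queue instead of elapsed time. Treating end-of-file as an error is an assumption of this sketch; the daemon may prefer to reconnect.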
Risk. Lowest. Touches only our own daemon; freeze/thaw is the same KPI stock FreeBSD and devctl(8) use. Only residual: replacing the 250ms heuristic with an EWOULDBLOCK drain-to-empty loop is a small, well-understood change. No driver-matching exposure whatsoever.
Pick when. Pick this if the real goal is fixing the boot stall (it is). busyState parity is a means, not the end — this delivers the end with zero kernel risk. Recommended as the baseline ship; the busyState options below are additive only if a real consumer needs an explicit 'devices settled' event.
Option 2 (config_intrhook_oneshot + root_mounted() + g_waitidle())
Summary. If a consumer genuinely needs a kernel-emitted 'boot device tree has settled' event (not busyState, but the practical thing busyState is usually polled FOR), a stock .ko can latch onto the existing interrupt-config-hook drain plus the root-mount barrier: all public KPIs, all event-driven, zero timers.
How it works. New mach.ko-side code (or a small dedicated .ko) registers config_intrhook_oneshot(cb, arg) from a SI_SUB_KLD-era SYSINIT (kernel.h:484, verified public prototype). Because SI_SUB_KLD (0x2000000) precedes SI_SUB_INT_CONFIG_HOOKS (0xa800000), the oneshot is queued and drained as part of the same one-time boot drain that blocks until every driver's interrupt-config hook completes (boot_run_interrupt_driven_config_hooks msleeps until the list is EMPTY — the WARNING_INTERVAL is a 60s printf timeout, NOT a debounce). For the stronger 'GEOM has tasted disks + all root_mount_hold tokens released' edge, gate on root_mounted() (systm.h:513, public) and/or call g_waitidle() (geom.h:272; impl geom_event.c:80-90 msleeps with timeout 0 — purely event-driven, confirmed). The module publishes a single 'boot-settled' signal up to hwregd/launchd over the existing MIG channel.
Risk. Low-moderate. config_intrhook_*, root_mounted, root_mount_hold/rel, and g_waitidle are sanctioned long-stable driver-facing KPIs used pervasively in-tree (CAM: config_intrhook_oneshot(xpt_config) cam_xpt.c:1560). Caveats to respect: (1) oneshot fires DURING the drain, not at the true empty-list edge (that edge is locked inside static boot_run_interrupt_driven_config_hooks and is NOT module-readable) — so treat the callback as 'interrupt-config reached me,' confirm true settle via root_mounted()/g_waitidle(). (2) These are ONE-TIME boot signals; there is no supported module-only ongoing 're-settled' KPI. (3) g_waitidle only drains GEOM tasting, not the whole newbus tree, and must run off the g_topology lock. Do NOT mimic kern.cam.boot_delay (cam_xpt.c:106-110) — that IS the wall-clock anti-pattern.
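Caveat (1) is easier to see in a toy model. The following is a deliberately simplified userland model of a hook list, not the code in subr_autoconf.c; every type and function name in it is hypothetical. It shows a oneshot-style callback running while another hook is still outstanding, which is why the callback alone cannot be read as "the drain is finished."

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 8

/* Simplified stand-in for one interrupt-config hook. */
struct hook {
    void (*func)(void *arg);
    void  *arg;
    int    established;      /* still on the list, still holding the drain */
};

struct hook_list {
    struct hook hooks[MAX_HOOKS];
    size_t      count;
};

static struct hook *hook_establish(struct hook_list *l,
                                   void (*func)(void *), void *arg)
{
    assert(l->count < MAX_HOOKS);
    struct hook *h = &l->hooks[l->count++];
    h->func = func;
    h->arg = arg;
    h->established = 1;
    return h;
}

static void hook_disestablish(struct hook *h) { h->established = 0; }

static size_t hooks_outstanding(const struct hook_list *l)
{
    size_t n = 0;
    for (size_t i = 0; i < l->count; i++)
        n += l->hooks[i].established ? 1 : 0;
    return n;
}

/* One pass over the list: call every established hook once, in order. */
static void run_hooks_once(struct hook_list *l)
{
    for (size_t i = 0; i < l->count; i++)
        if (l->hooks[i].established)
            l->hooks[i].func(l->hooks[i].arg);
}

/* What our module would observe from inside its oneshot callback. */
struct observer {
    struct hook_list *list;
    struct hook      *self;
    size_t            others_pending;
    int               fired;
};

static void oneshot_cb(void *arg)
{
    struct observer *o = arg;
    o->fired = 1;
    o->others_pending = hooks_outstanding(o->list) - 1;  /* exclude self */
    hook_disestablish(o->self);      /* oneshot: release immediately */
}

/* A driver whose hook is called now but completes later. */
static void slow_driver_cb(void *arg) { (void)arg; }
```

With a slow driver hook established before the observer, one pass leaves `fired == 1` and `others_pending == 1`: the callback has run, yet the list is not empty until the slow hook is disestablished. In the real kernel that final edge is the one the text describes as not module-readable.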
Pick when. Pick this as the additive busyState-substitute when a consumer needs an explicit kernel boot-settled event and Option 1's pure-userland drain isn't enough. Best wall-clock-free near-equivalent available module-only. Stack it ON TOP of Option 1.
Option 3 (completion eventhandlers)
Summary. Register the three sanctioned newbus lifecycle eventhandlers from a .ko and maintain a pending-set / completion counter. This is the inverse of busyState: it counts match COMPLETIONS (attach-success and nomatch-failure) rather than incrementing at match-start. Event-driven, no timer, but cannot represent the added-but-not-yet-probed window.
How it works. mach.ko (or a small .ko) calls EVENTHANDLER_REGISTER for device_attach, device_nomatch, and device_detach — the complete, stable set (subr_bus.c:173-175; only in-tree consumer kern_devctl.c:169-173). Each fires synchronously inline in the probe path with no debounce: device_attach on DS_ATTACHED success (subr_bus.c:2154 region), device_nomatch on probe-with-no-driver (subr_bus.c:693), device_detach on teardown. The module keeps a count of terminal events and exposes 'no new terminal events pending' upward. Loaded at SI_SUB_KLD it catches every boot device's terminal event with zero added delay.
Risk. Low for the mechanism itself — these are sanctioned, header-public, in-tree-used KPIs. The risk is SEMANTIC, not stability: because the events are completion-only, deciding 'all devices done' from them requires knowing when matching STARTED, which is unobservable module-only — so a naive 'counter hit 0' determination can falsely fire before late/queued devices begin probing, tempting you to re-add a quiet-window (the forbidden heuristic). Pair it with Option 2's root_mounted()/config-hook boot barrier to bound the window event-drivenly instead.
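The semantic risk can be made concrete with a small userland simulation. This is an illustration under stated assumptions, not module code: the `settle_tracker` and `sim` types are hypothetical, and seeding the expected count from a one-time snapshot is only one guess at how a naive module might do it. The `truly_pending` field is ground truth that a stock module cannot observe.

```c
#include <assert.h>

/* What a completion-only module can know. */
struct settle_tracker {
    unsigned expected;   /* devices known when the snapshot was taken */
    unsigned terminal;   /* attach + nomatch events seen since then */
};

/* Simulation state: tracker view plus ground truth the module cannot see. */
struct sim {
    unsigned              truly_pending;
    struct settle_tracker view;
};

/* A device is added: no eventhandler fires, so the view is unchanged. */
static void sim_add_device(struct sim *s)
{
    s->truly_pending++;
}

/* The module takes its one-time snapshot of known devices. */
static void sim_snapshot(struct sim *s)
{
    s->view.expected = s->truly_pending;
    s->view.terminal = 0;
}

/* Terminal outcomes: these are the only edges the module is told about. */
static void sim_attach(struct sim *s)
{
    assert(s->truly_pending > 0);
    s->truly_pending--;
    s->view.terminal++;
}

static void sim_nomatch(struct sim *s)
{
    assert(s->truly_pending > 0);
    s->truly_pending--;
    s->view.terminal++;
}

/* The naive "counter hit zero" test. */
static int looks_settled(const struct settle_tracker *t)
{
    return t->terminal >= t->expected;
}
```

Run the sequence add, add, snapshot, attach, add (a child enumerated by the device that just attached), nomatch. The tracker then reports settled while one device has not started probing. Closing that gap with a delay would reintroduce the quiet-window, which is why the text pairs this option with the Option 2 boot barrier.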
Pick when. Pick over Option 2 only if you need per-device granularity (which device attached/nomatched), e.g. to drive hwregd registry enrichment off real attach edges rather than a single boot-settled pulse. Otherwise Option 2 is the cleaner aggregate. Both avoid the fork.
Option 4 (kobj swizzle)
Summary. The ONLY module-only route to true exact busyState parity (increment at match-START): overwrite each driver's kobj methods[] func pointer for device_probe/device_attach so a wrapper logs START, calls the saved original, logs END. Mechanically works with no fork. Lock-unsafe, mutates private internals, must chase every future driver; this violates the 'stock drivers untouched / don't destabilize matching' constraint. Listed for completeness only.
How it works. A .ko enumerates every driver_t/kobj_class, finds the methods[] slot whose desc == &device_probe_desc / &device_attach_desc, saves the original func, and overwrites it with a wrapper. Works because KOBJOPLOOKUP re-fetches _m every call and the cache stores a POINTER into cls->methods[] (subr_kobj.c:216-228, *cep=ce), so the swap is live with no invalidation. Gives exact probe START, probe END(result), attach START, attach END(result) for all stock drivers — real Apple-busyState increment-at-start semantics. Would live in a dedicated diagnostic .ko, never in production mach.ko.
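Why the overwrite is live without any cache invalidation can be shown with a toy userland model of the dispatch path. The structures below are simplified stand-ins with hypothetical names, not the definitions in sys/sys/kobj.h. The one property the model preserves is the one the text relies on: the cache holds a pointer to the methods[] entry, and dispatch re-reads `func` through that pointer on every call.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void *dev);

struct op_desc { unsigned id; };                    /* models kobjop_desc */
struct method  { struct op_desc *desc; method_fn func; };  /* kobj_method */

#define CACHE_SIZE 4   /* toy size; the text cites KOBJ_CACHE_SIZE as 256 */

struct klass {
    struct method *methods;              /* terminated by a NULL desc */
    struct method *cache[CACHE_SIZE];    /* pointers INTO methods[] */
};

static struct method *lookup_slow(struct klass *c, struct op_desc *d)
{
    for (struct method *m = c->methods; m->desc != NULL; m++)
        if (m->desc == d)
            return m;
    return NULL;
}

/* Dispatch: consult the cache, fall back to the slow path, then call
 * whatever func the cached entry holds right now. */
static int dispatch(struct klass *c, struct op_desc *d, void *dev)
{
    struct method **cep = &c->cache[d->id % CACHE_SIZE];
    struct method *ce = *cep;

    if (ce == NULL || ce->desc != d) {
        ce = lookup_slow(c, d);
        if (ce == NULL)
            return -1;
        *cep = ce;                       /* cache the entry, not the func */
    }
    return ce->func(dev);
}

static method_fn saved_probe;
static int probe_starts, probe_ends;

static int real_probe(void *dev) { (void)dev; return 0; }

static int wrapped_probe(void *dev)
{
    probe_starts++;                      /* START edge */
    int r = saved_probe(dev);
    probe_ends++;                        /* END edge */
    return r;
}

/* The swizzle itself: an unsynchronized write, which is the hazard. */
static void swizzle(struct klass *c, struct op_desc *d,
                    method_fn wrapper, method_fn *saved)
{
    struct method *m = lookup_slow(c, d);
    if (m != NULL) {
        *saved = m->func;
        m->func = wrapper;
    }
}
```

Dispatching once warms the cache; after `swizzle`, the next dispatch goes through the wrapper even though the cache was never touched. The same model also shows the danger: nothing orders the write to `func` against a dispatch running on another CPU.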
Risk. HIGH — not shippable. (1) Post-compile mutation of kobj_class.methods[].func has NO supported lock against concurrent dispatch on other CPUs — a real correctness hazard. (2) Mutates private internals with no KPI/compat contract — version-brittle. (3) Must also hook devclass_add_driver/BUS_DRIVER_ADDED to catch kldload'd drivers, or they go unmonitored. (4) Base-class method inheritance (kobj_lookup_method_mi walks baseclasses) and shared methods[] arrays cause double-wrap/miss. This is exactly the 'destabilizes stock driver matching' outcome the user forbids.
Pick when. Do NOT ship. Use only as an opt-in, never-default-loaded diagnostic .ko if you ever need to empirically observe the exact match-start window for validation. For any production parity claim, this is off the table — its existence is why 'exact module-only busyState' is technically possible yet practically a no.
Option 5 (kernel fork)
Summary. The only CLEAN path to exact, stable busyState parity: add a real match-START hook in sys/kern/subr_bus.c. Explicitly OUT of scope under the module-only / no-fork constraint; named so the tradeoff is on the record.
How it works. Add EVENTHANDLER_INVOKE(device_match_start, dev) (or an SDT_PROBE) at the entry of device_probe_and_attach (subr_bus.c:2057 region) or in the device_add_child path (make_device ~1533), then have mach.ko register for it. This is the increment-at-match-start signal that does not exist in stock newbus — proven absent (zero SDT probes in subr_bus.c/kern_devctl.c; only the three terminal eventhandlers). Requires a custom buildkernel.
Risk. Stability-wise the cleanest exact answer (a one-line sanctioned-style eventhandler add). But it VIOLATES the hard no-fork / no-custom-buildkernel constraint and forces shipping a patched kernel. Listed only so the user can consciously weigh 'exact parity' against 'must fork.'
Pick when. Pick ONLY if the user later relaxes the no-fork constraint and decides exact increment-at-match-start busyState is genuinely required by a consumer. Under the current constraints, do not pursue.
Recommend Option 1 (pure-userland hwregd freeze/thaw, harden the drain-to-empty) as the baseline ship — it fixes the real 60s boot bug with zero kernel risk and honors every constraint. If, and only if, a real consumer needs an explicit kernel 'devices settled' event, add Option 2 (config_intrhook_oneshot + root_mounted() + g_waitidle()) on top: it is module-only, uses stock drivers unchanged, and is fully wall-clock-free — the best available near-equivalent to busyState given the no-fork constraint. Reach for Option 3 (completion eventhandlers) over Option 2 only when per-device attach/nomatch granularity is needed to drive hwregd registry enrichment. Do NOT ship Option 4 (kobj swizzle) — keep it as a diagnostic-only .ko — and treat Option 5 (kernel fork) as off-limits unless the user explicitly relaxes the no-fork rule to get exact parity. Bottom line: exact Apple busyState parity is not achievable cleanly module-only, but the user's actual objective (kill the boot stall with a real signal, no timers) is fully met by Option 1, optionally sharpened by Option 2.
UPDATE 2026-06-03 — no longer deferred. The no-fork constraint has lifted: NextBSD now ships its own NEXTBSD kernel from nextbsd-kernel/patches/ with a PR boot-smoke-test, so the subr_bus.c hook is just another boot-tested patch (same path as the syscall-band widen). Concrete implementation — including a correction to the counter design below (a balanced device_match_start/device_match_end wrap around device_probe_and_attach(), since “match-start++ / attach|nomatch--” does not balance across multipass / attach-failure / driver-deletion) — is in the IOKit busyState / waitQuiet implementation plan. Tracked by issue #176. Original deferred rationale kept below for history.
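The balancing problem mentioned in the update can be illustrated with a small model of one probe-and-attach pass. This is a sketch under assumptions, not the patch tracked by issue #176: the `counters` struct and `probe_and_attach` function are hypothetical, and only the attach-failure case is modelled (multipass and driver deletion are left out).

```c
#include <assert.h>

enum outcome { OUT_ATTACHED, OUT_NOMATCH, OUT_ATTACH_FAILED };

struct counters {
    int unbalanced;   /* match-start++ / attach|nomatch-- */
    int balanced;     /* device_match_start++ / device_match_end-- */
};

/* Models one pass through device_probe_and_attach() for one device. */
static void probe_and_attach(struct counters *c, enum outcome o)
{
    c->unbalanced++;              /* match-start edge */
    c->balanced++;                /* device_match_start */

    switch (o) {
    case OUT_ATTACHED:            /* device_attach eventhandler fires */
    case OUT_NOMATCH:             /* device_nomatch eventhandler fires */
        c->unbalanced--;
        break;
    case OUT_ATTACH_FAILED:       /* neither eventhandler fires */
        break;
    }

    c->balanced--;                /* device_match_end: unconditional */
}
```

After one attached, one nomatch, and one attach-failed pass, `balanced` is back at 0 while `unbalanced` is stuck at 1, so a waiter on the unbalanced counter would never see quiescence. Pairing the decrement with the exit of the wrapped function, regardless of outcome, is what keeps the count honest.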
Status (original): deferred, not now. Kept on record so the full Apple-shape design exists if the no-fork constraint is ever relaxed for a real consumer (e.g. launchd HardwareMatch / configd) that genuinely needs exact per-device quiescence. This is the only clean path to byte-for-byte Apple IOService busyState + waitQuiet semantics, and the only one that delivers the IOKit quiescence APIs faithfully.
Four layers:
1. (subr_bus.c patch). Add the one signal stock newbus lacks, a match-START hook: EVENTHANDLER_INVOKE(device_match_start, dev) (declare a new eventhandler) or an SDT_PROBE at the entry of device_probe_and_attach() (subr_bus.c:2057 region) / in the device_add_child path. The decrement points already exist as stock eventhandlers: device_attach (success) and device_nomatch (no-driver). Built via a custom buildkernel, dropping the FreeBSD-kernel-generic pkgbase.
2. EVENTHANDLER_REGISTER(device_match_start) → atomic bus_busy++; register device_attach + device_nomatch → bus_busy--. This is the FreeBSD analog of IOService::_adjustBusy. Expose hw.bus.busy (sysctl), fire a "quiesced" event on the 0-crossing, and implement mach_wait_quiet(timeout) (block until bus_busy==0, the waitQuiet primitive; the timeout is a diagnostic backstop only, never the signal). Bridged over the Mach surface mach.ko already owns.
3. bus_busy==0 / quiesced notification as the real flip signal (in place of Option 1's userland drain). Same incremental freeze/thaw load.
4. IOServiceWaitQuiet / IORegistryEntryGetBusyState / IOKitWaitQuiet (Apple's API, declared but unimplemented today) backed by mach.ko's RPC. This layer is what makes those IOKit APIs exact rather than the coarse/global approximation Option 2 could offer.
Cost (why it is deferred). Custom kernel build (+20-40 min CI per run), dropping the kernel pkgbase package, kernel maintenance across every FreeBSD update, and forking the one file (subr_bus.c) that otherwise lets you run stock FreeBSD drivers untouched. Revisit only when a real consumer needs true match-start quiescence and that cost is judged worth it.
Q. Can a stock-GENERIC loadable kernel module observe newbus device MATCH/PROBE START (or device-ADD), or only completion (attach/detach/nomatch)? And is DTrace FBT a shippable production mechanism to hook device_probe_and_attach entry at runtime?
Sees match-start? No. There is no eventhandler, SDT probe, or any other sanctioned hook at device_add_child/make_device or at the entry of device_probe_and_attach/device_probe_child. All three device_* eventhandlers fire at terminal outcomes only: device_attach on successful attach (subr_bus.c:2154, after DS_ATTACHED), device_nomatch when probing fails to find a driver (subr_bus.c:769), device_detach at detach (subr_bus.c:2123/2128/2130). A module registered for these sees match/probe COMPLETION, never match START. The only way a module could see entry is DTrace FBT fbt::device_probe_and_attach:entry (a runtime/instrumentation hook), not a KPI.
Wall-clock free? Yes for the completion eventhandlers themselves: device_attach/device_nomatch/device_detach fire synchronously inline in the newbus probe path with no timer, debounce, or quiet-window. A module at SI_SUB_KLD (0x2000000, kernel.h:128) loads before SI_SUB_CONFIGURE (0x3800000, kernel.h:137) where SYSINIT(configure2, SI_SUB_CONFIGURE, SI_ORDER_THIRD, configure) -> root_bus_configure() drives boot probing (x86/x86/autoconf.c:73,86-99), so registering device_attach+device_nomatch catches every boot device's terminal event with zero added wall-clock delay. The caveat is semantic, not timing: completion events alone cannot reconstruct exact busyState (you would have to infer when matching STARTED, which is unobservable module-only, so any 'all devices done' determination becomes a heuristic that risks re-introducing a quiet-window).
Needs kernel fork? NO for the completion-event eventhandlers (device_attach/device_detach/device_nomatch are registerable from a stock module with no kernel-source change). YES if exact match-START parity is required: a real start signal requires adding a new EVENTHANDLER_INVOKE / SDT_PROBE at device_add_child or device_probe_and_attach entry in sys/kern/subr_bus.c, i.e. a kernel-source change. (FBT is an alternative that avoids a fork but is not a stable mechanism; see Stability risk below.)
Stability risk. The three device_* eventhandlers ARE a sanctioned, stable KPI (public in sys/sys/bus.h, used in-tree by kern_devctl.c) — low risk, but they only give completion. DTrace FBT, the only module-only path to a match-START hook, is explicitly NOT a stable interface: dtrace_fbt(4) states the fbt provider 'instruments the entry and return of almost every kernel function' but warns 'fbt probes are by definition tightly coupled to kernel code; if the code underlying a script changes, the script may fail to run or may produce incorrect results.' device_probe_and_attach is a static (non-inlined, ELF-symbol) function so FBT can attach, but relying on it as a production signaling mechanism is fragile (tied to that function's existence/name/inlining across releases), requires loading dtrace.ko+fbt.ko, and is a debugging/instrumentation facility, not a programming KPI. Not shippable as a production busyState source.
- sys/kern/subr_bus.c:106-108 (EVENTHANDLER_LIST_DEFINE device_attach/device_detach/device_nomatch)
- sys/kern/subr_bus.c:769 (EVENTHANDLER_DIRECT_INVOKE(device_nomatch, dev) in device_handle_nomatch)
- sys/kern/subr_bus.c:2057-2070 (device_probe_and_attach: no eventhandler at entry; calls device_probe then device_attach)
- sys/kern/subr_bus.c:1814 (device_probe_child: invokes DEVICE_PROBE, no eventhandler)
- sys/kern/subr_bus.c:1533 (make_device) / device_add_child_ordered: no eventhandler on the add path
- sys/kern/subr_bus.c:2123,2128,2130 (EVENTHANDLER_DIRECT_INVOKE(device_detach, ..., EVHDEV_DETACH_BEGIN/FAILED/COMPLETE))
- sys/kern/subr_bus.c:2076-2154 (device_attach body: EVENTHANDLER_DIRECT_INVOKE(device_attach, dev) at 2154 fires only on DEVICE_ATTACH==0 success, after resource-disabled and attach-failure early returns)
- sys/kern/kern_devctl.c:169-173 (EVENTHANDLER_REGISTER device_attach/device_detach/device_nomatch; only in-tree lifecycle consumer, no others exist)
- sys/sys/bus.h:218-220 (EVENTHANDLER_DECLARE(dev_lookup): name-resolution helper, not a lifecycle/match event)
- sys/sys/kernel.h:128 (SI_SUB_KLD = 0x2000000)
- sys/sys/kernel.h:135 (SI_SUB_DRIVERS = 0x3100000)
- sys/sys/kernel.h:137 (SI_SUB_CONFIGURE = 0x3800000)
Q. Can a loadable kernel module, by polling newbus state via KPI alone (no kernel fork), produce an EXACT count of in-flight probes, distinguishing devices PENDING probe/match from those permanently DRIVERLESS (probed, nothing matched), so busy==0 reliably means quiescent?
Sees match-start? Partially / racefully. A module can sample DS_ATTACHING (25), the transient state held inside device_attach() from just before DEVICE_ATTACH() until success (DS_ATTACHED) or failure (DS_NOTPRESENT). To sample consistently it must hold bus_topo_lock() (exported; = &Giant), the same lock device_probe_and_attach() asserts via bus_topo_assert()/GIANT_REQUIRED. But the START of probe/match (entry to device_probe/device_probe_child, before a driver is selected) is NOT marked by any state change — the device stays DS_NOTPRESENT throughout probing. A poller cannot observe 'match started,' only 'attach in progress' (DS_ATTACHING). The real per-device match/attach signals are the device_attach and device_nomatch EVENTHANDLER hooks — completion notifications, i.e. event-driven, not polling newbus state.
Wall-clock free? The polling read itself carries no timer/debounce — it is an instantaneous read of dev->state. BUT because the count is NOT exact (cannot exclude driverless DS_NOTPRESENT), any module inferring 'quiescent' from polling is FORCED to reintroduce a wall-clock heuristic (e.g. 'state stopped changing for N ms') — exactly the debounce this approach was meant to avoid. So a correct timer-free quiescence signal is NOT achievable via polling alone.
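The ambiguity can be shown with a toy model of the two fields involved. The state and flag values are the ones cited in this document (DS_NOTPRESENT=10, DS_ATTACHING=25, DS_ATTACHED=30, DF_DONENOMATCH 0x20); the `model_device` struct and both counting functions are hypothetical, and counting a never-probed device as pending is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

enum dev_state {
    DS_NOTPRESENT = 10,
    DS_ALIVE      = 20,
    DS_ATTACHING  = 25,
    DS_ATTACHED   = 30
};

#define DF_DONENOMATCH 0x20u

struct model_device {
    enum dev_state state;   /* what device_get_state() would report */
    unsigned       flags;   /* private field: no KPI accessor */
};

/* Everything a KPI-only poller can compute: state alone. */
static unsigned busy_from_public_state(const struct model_device *d, size_t n)
{
    unsigned busy = 0;
    for (size_t i = 0; i < n; i++)
        if (d[i].state == DS_NOTPRESENT || d[i].state == DS_ATTACHING)
            busy++;
    return busy;
}

/* Ground truth: needs the private flag to exclude driverless devices. */
static unsigned busy_exact(const struct model_device *d, size_t n)
{
    unsigned busy = 0;
    for (size_t i = 0; i < n; i++) {
        if (d[i].state == DS_ATTACHING)
            busy++;
        else if (d[i].state == DS_NOTPRESENT &&
                 (d[i].flags & DF_DONENOMATCH) == 0)
            busy++;
    }
    return busy;
}
```

A tree with one attached device and one permanently driverless device yields the same public count as a tree with one attached device and one device still waiting to be probed, even though only the second tree is actually busy. A poller restricted to public state therefore never sees zero on the settled tree.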
Needs kernel fork? For an EXACT busy count via state introspection: YES — requires either a new KPI accessor exposing DF_DONENOMATCH/DF_ATTACHED_ONCE, or a state-machine change so driverless devices land in a distinct state; both are kernel-source changes. For the polling-only, module-only question as posed: the needed data is unreachable, so NO module-only solution exists without a fork. (A module-only EVENT-driven approach via device_attach/device_nomatch eventhandlers + a pending-set is feasible without a fork, but that is not 'polling newbus state'.)
Stability risk. The walk primitives (device_get_children, device_get_state, device_is_attached, bus_topo_lock) are sanctioned, stable, header-declared KPI — low risk. The only way to close the gap from a module would be to dereference the private struct _device (reconstruct its layout, read dev->flags) — fragile, version-brittle, dependent on an undocumented private layout that changes between releases, explicitly not a KPI. High risk and not sanctioned.
- https://github.com/freebsd/freebsd-src/blob/main/sys/sys/bus.h (device_state_t enum DS_NOTPRESENT=10/DS_ALIVE=20/DS_ATTACHING=25/DS_ATTACHED=30, ~L55-59)
- https://github.com/freebsd/freebsd-src/blob/main/sys/sys/bus.h (DF_* defines: DF_ENABLED 0x01, DF_FIXEDCLASS 0x02, DF_WILDCARD 0x04, DF_DESCMALLOCED 0x08, DF_QUIET 0x10, DF_DONENOMATCH 0x20, DF_EXTERNALSOFTC 0x40, DF_SUSPENDED 0x100, DF_QUIET_CHILDREN 0x200, DF_ATTACHED_ONCE 0x400, DF_NEEDNOMATCH 0x800, ~L107-118)
- https://github.com/freebsd/freebsd-src/blob/main/sys/sys/bus.h (exported prototypes device_get_state/device_get_children/device_is_alive/device_is_attached/device_get_flags(->devflags)/device_busy/device_get_parent/bus_topo_lock/unlock/mtx/assert; NO accessor for DF_DONENOMATCH or DF_ATTACHED_ONCE)
- https://github.com/freebsd/freebsd-src/blob/main/sys/sys/types.h#L307 (typedef struct _device *device_t; opaque)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (struct _device private: device_state_t state; u_int flags; uint32_t devflags; u_int busy)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_attach: dev->state=DS_ATTACHING; success->DS_ATTACHED + DF_ATTACHED_ONCE + clear DF_DONENOMATCH; failure->DS_NOTPRESENT, ~L3630-3705)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_probe: if device_probe_child fails && BUS_PASS_DEFAULT && !DF_DONENOMATCH -> device_handle_nomatch; state stays DS_NOTPRESENT, ~L3559-3591)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_handle_nomatch: BUS_PROBE_NOMATCH + EVENTHANDLER device_nomatch + dev->flags |= DF_DONENOMATCH; no state change)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_get_state returns dev->state; device_is_alive state>=DS_ALIVE; device_is_attached state>=DS_ATTACHED; pure field reads, ~L2373)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_get_flags returns dev->devflags NOT dev->flags, ~L2361)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (device_is_enabled->DF_ENABLED, device_is_suspended->DF_SUSPENDED, device_is_quiet->DF_QUIET, device_has_quiet_children->DF_QUIET_CHILDREN; no DF_DONENOMATCH/DF_ATTACHED_ONCE reader)
- https://github.com/freebsd/freebsd-src/blob/main/sys/kern/subr_bus.c (bus_topo_lock/unlock/mtx/assert = Giant + GIANT_REQUIRED, ~L649-674; device_probe_and_attach bus_topo_assert, ~L2769)
Q. Can a loadable kernel module legitimately "swizzle"/interpose DEVICE_PROBE/DEVICE_ATTACH on stock newbus drivers to observe probe/attach START+END exactly, with no kernel-source changes, in a SAFE/sanctioned and release-stable way?
Sees match-start? Approach (A) eventhandlers: NO — device_attach/device_nomatch eventhandlers fire only at completion; there is no probe-START or attach-START eventhandler in device_probe_child/device_attach. Approach (B) swizzle: YES — replacing cls->methods[].func for &device_probe_desc means the module's wrapper runs at the exact instant DEVICE_PROBE(child) is invoked in device_probe_child (~line 2738), i.e. true match/probe START, and again at attach START via DEVICE_ATTACH(dev) (~line 3920). The wrapper sees both START (before calling the saved original) and END (its return value).
Wall-clock free? YES for both approaches. The signal is event-driven off the actual probe/attach call path (swizzle wrapper) or the actual completion eventhandler invoke — there is no timer, debounce, settle window, or quiet period anywhere. Exact, edge-accurate.
Needs kernel fork? NO for both approaches. Approach (A) uses public KPIs (EVENTHANDLER_REGISTER, devclass_add_driver/DRIVER_MODULE) callable from a normal .ko. Approach (B) only mutates an already-allocated kobj_class methods[] func pointer at runtime from the module; it links against exported symbols (devclass enumeration, &device_probe_desc/&device_attach_desc which are global kobjop_desc) and requires NO recompile of the kernel or drivers. Neither needs a kernel source change.
Stability risk. SHARPLY DIFFERENT per approach. (A) is a sanctioned, stable, documented KPI (EVENTHANDLER + devclass_add_driver/DRIVER_MODULE/BUS_NEW_PASS) — shippable, release-durable. (B) the swizzle is FRAGILE and NOT shippable for a "use stock drivers untouched" requirement, for concrete reasons: (1) it mutates kobj_class.methods[].func, an internal structure not part of any KPI contract — layout/semantics can change across releases with no compat guarantee; (2) concurrency/locking: kobj IDs and compile run under kobj_mtx and the cache/methods table is treated as effectively read-only after kobj_class_compile — there is no supported API to patch a method post-compile, so a writer racing dispatch on other CPUs is unsynchronized and unsafe; (3) you must enumerate and patch every current AND future driver (drivers registered later via DRIVER_MODULE/kldload won't be swizzled unless you also hook devclass_add_driver/driver-added, compounding fragility); (4) base classes / method inheritance (kobj_lookup_method_mi walks baseclasses) mean a driver may share a methods[] array or inherit probe from a base class, so naive per-class patching can double-wrap or miss; (5) shared static methods[] across instances mean patching one class's table can affect others. Net: the swizzle "works" on a given FreeBSD build but is version-brittle, lock-unsafe, and depends on private internals — exactly the destabilizing-driver-matching outcome the user forbids. Mark it NOT shippable; use it only as an opt-in debug/diagnostic .ko, never on a production stock-driver system.
- sys/sys/kobj.h:214-228 (KOBJOPLOOKUP: _ce=*_cep; if _ce->desc!=_desc kobj_lookup_method; _m=_ce->func)
- sys/sys/kobj.h:84 (#define KOBJ_CACHE_SIZE 256)
- sys/sys/kobj.h:47-50 (struct kobj_method { desc; func; })
- sys/sys/kobj.h:57-63 (KOBJ_CLASS_FIELDS: name, *methods, size, *baseclasses, refs, ops)
- sys/sys/kobj.h:87-90 (struct kobj_ops { cache[KOBJ_CACHE_SIZE]; cls; })
- sys/sys/kobj.h:90-93 (struct kobjop_desc { id; deflt; })
- sys/sys/kobj.h:99-101 (KOBJMETHOD macro)
- sys/kern/subr_kobj.c:93-110 (kobj_class_compile_common: assigns desc->id, fills cache with &null_method, no invalidation API)
- sys/kern/subr_kobj.c:178-191 (kobj_lookup_method_class returns ce pointing into cls->methods)
- sys/kern/subr_kobj.c:193-214 (kobj_lookup_method_mi walks baseclasses)
- sys/kern/subr_kobj.c:216-228 (kobj_lookup_method: *cep = ce at line 227; cache stores POINTER into methods[])
- sys/kern/subr_bus.c:3865-3878 (device_probe_and_attach)
Q. Is there an existing, module-readable, wall-clock-free kernel signal that a loadable .ko can latch onto to know "the device tree has settled" (at boot and/or ongoing), using only public KPI, and what is the real precedent for "wait until devices settled"?
Sees match-start? No. None of these facilities expose individual device match/probe START. config_intrhook_* only fires the module's own callback during the global boot interrupt-config drain (an aggregate, post-interrupt-enable phase), never at the moment a specific device begins matching/probing. The newbus generation counter (bus_data_generation, subr_bus.c, static int =1) bumps on device add/delete/devclass/driver changes (subr_bus.c ~1724,1841,1933,2063,2090,2195,2277,2321) but it is static (not exported) and reflects topology mutation completion, not probe start; even via the hw.bus.info sysctl a module only sees a generation number change after the fact, never a 'probe starting' edge. root_mounted()/g_waitidle() are completion/quiescence signals, not start signals.
Wall-clock free? Yes for the recommended primitives. The config-hook boot drain blocks on msleep against the list emptying (subr_autoconf.c ~147-155); its hz-based interval is only a 60s WARNING printf timeout, not a debounce — completion is driven by the last hook calling config_intrhook_disestablish/_drain, not by elapsed time. root_mounted() flips on the event of all root_mount_hold tokens being released + g_waitidle() returning (vfs_mountroot.c:756-768,893), no quiet-window. g_waitidle() has msleep timeout 0 (geom_event.c:82-83) — purely event-driven, no wall clock. CONTRAST/precedent that IS wall-clock (and to be avoided): kern.cam.boot_delay (cam_xpt.c:106-110 boot_delay/boot_callout, SYSCTL cam_xpt.c:119-121, CAM_BOOT_DELAY override cam_xpt.c:1527-1531) is an explicit tunable callout/pause for buses to settle — a hardcoded timer, exactly the kind of debounce to NOT mimic.
Needs kernel fork? NO. config_intrhook_establish/_disestablish/_oneshot/_drain (sys/sys/kernel.h:453-456), struct intr_config_hook + ICHS_* (kernel.h:443-451), cold (systm.h:56), root_mounted/root_mount_hold/root_mount_hold_token/root_mount_rel (systm.h:816-821), and g_waitidle (geom.h:347) are all public prototypes in installed kernel headers with no _KERNEL-only-private guards beyond normal kernel build, so a standard out-of-tree .ko links against them with zero kernel-source change. The ONLY things needing a fork would be reading the truly-static internals (intr_config_hook_list, root_mount_complete var directly, bus_data_generation var directly) — but the public wrappers/accessors (root_mounted(), hw.bus.info sysctl) make those unnecessary.
Stability risk. Low-to-moderate; these are sanctioned, long-stable KPIs, not private symbols. config_intrhook_* and root_mount_hold/root_mount_rel/root_mounted are the documented driver-facing APIs (used pervasively by in-tree drivers, e.g. CAM cam_xpt.c:1560, xpt_rootmount) and are very unlikely to change signature. g_waitidle is public in geom.h and used by vfs_mountroot itself. Caveats to stay skeptical about: (1) config_intrhook_oneshot semantics are 'fire during the drain', not 'fire at true end-of-drain' — relying on it as a last-event marker is fragile and unsupported. (2) The static internals (intr_config_hook_list emptiness, bus_data_generation) are NOT stable/exported and must not be poked from a module. (3) g_waitidle only covers GEOM tasting, not full newbus quiescence; treating it as 'whole device tree settled' would over-claim. (4) config hooks are one-shot at boot — there is no supported ongoing 'device tree re-settled' KPI, so any post-boot continuous quiescence detection would need a different (currently nonexistent module-only) mechanism.
- sys/sys/kernel.h:443-451 (struct intr_config_hook + ICHS_QUEUED/RUNNING/DONE)
- sys/sys/kernel.h:453-456 (config_intrhook_establish/_disestablish/_drain/_oneshot public prototypes)
- sys/sys/kernel.h:332 (SI_SUB_KLD=0x2000000)
- sys/sys/kernel.h:350 (SI_SUB_INT_CONFIG_HOOKS=0xa800000)
- sys/kern/subr_autoconf.c:57-58 (static intr_config_hook_list)
- sys/kern/subr_autoconf.c:~147-155 (boot drain while(!STAILQ_EMPTY) msleep until list empty)
- sys/kern/subr_autoconf.c:~157-159 (SYSINIT boot_run_interrupt_driven_config_hooks SI_SUB_INT_CONFIG_HOOKS/SI_ORDER_FIRST)
- sys/kern/subr_autoconf.c:~145-169 (run_interrupt_driven_config_hooks iterates queue once via next_to_notify)
- sys/kern/subr_autoconf.c:~222-228 (config_intrhook_establish: if (cold==0) run_interrupt_driven_config_hooks immediately, no settle)
- sys/sys/systm.h:56 (extern int cold)
- sys/sys/systm.h:816 (root_mount_hold)
- sys/sys/systm.h:817 (root_mount_hold_token)