libicu port plan

Concrete plan for vendoring Apple's swift-foundation-icu into freebsd-launchd-mach/src/libicu/ and shipping it as a system library at /usr/lib/system/. Replaces the guess-based "Option C is a multi-day sub-project" framing in §13 of the ICU audit with empirical findings from four parallel research agents and one major mid-plan correction (see §0).

Why this exists. The audit at freebsd-libcorefoundation-icu-audit.html recommended Option A (drop ICU). Mid-execution, CI surfaced three additional CF source files with hard ICU includes the audit missed: CFString.c (4 calls), CFTimeZone.c (25 calls), CFBundle_Locale.c (4 calls including the Apple-only ualoc_localizationsToUse). These files are cross-referenced too heavily to drop. With the "~50–80 LOC patch" budget broken and a confirmed Apple-only symbol dependency, the decision flipped to Option C. This doc captures the new plan.

0. Mid-plan vendor-source pivot — what we learned

The first vendor commit (0022d49, 2026-05-15) brought in apple-oss-distributions/ICU at tag ICU-76142.2: 137 MB, 6,853 files, license Unicode V3. While drafting the build pipeline, two findings forced a re-evaluation:

0.1 The _foundation_unicode/ namespace doesn't come from Apple's general-purpose ICU

Eleven of swift-corelibs-foundation's libCoreFoundation source files include ICU headers via Apple's private namespace prefix:

#include <_foundation_unicode/uloc.h>
#include <_foundation_unicode/ucal.h>
...

Searches turned up no occurrence of the _foundation_unicode string anywhere in the freshly vendored apple-oss-distributions/ICU tree, on the macOS SDK at /usr/include/, or in the Xcode SDK headers. The string appears only in CF's .c files. Where does it come from?

swift-corelibs-foundation's own CMakeLists.txt answers this:

FetchContent_Declare(SwiftFoundationICU
    GIT_REPOSITORY https://github.com/apple/swift-foundation-icu.git)

apple/swift-foundation-icu is a separate Apple repository. Its README:

This version of the ICU4C project contains customized extensions for use by the Foundation package. It is automatically extracted from Apple OSS Distribution's ICU to add Swift Package Manager support. … This package is intended to be a dependency for the Foundation package. It is not useful as a general-purpose ICU4C library because all files irrelevant to the SwiftPM build are removed.

The repo ships 212 headers natively under icuSources/include/_foundation_unicode/. The header layout swift-corelibs CF expects exists by design here, with no symlink alias needed.
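The search described earlier in this subsection can be repeated against any checkout with a small helper. This is a minimal sketch; the function name count_namespace_hits is hypothetical, and the directory you pass it is whatever tree you want to inspect.

```shell
# count_namespace_hits DIR
# Prints how many files under DIR contain the literal string
# "_foundation_unicode". Prints 0 when nothing matches.
count_namespace_hits() {
    # -r: recurse, -l: list matching files only, -F: fixed string (no regex).
    # grep exits non-zero on "no match"; we only count the lines it printed.
    grep -rlF '_foundation_unicode' "$1" 2>/dev/null | wc -l | tr -d ' '
}
```

Run against the apple-oss-distributions/ICU tree this should print 0 if the finding above holds; run against a swift-foundation-icu checkout it should print a non-zero count.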

0.2 macOS ground truth on Apple's ICU install layout

Apple ships /usr/lib/libicucore.dylib on macOS — a single unified library, not in /usr/local/. It's invisible to ls /usr/lib/ on Big Sur and later because modern macOS hides system libraries inside the dyld shared cache, but dlopen("/usr/lib/libicucore.dylib", RTLD_NOW) succeeds. This confirms two things: (a) ICU is a system library on macOS, not a third-party port; (b) Apple uses the unified libicucore shape, validating our choice in §3 to skip the upstream 3-lib split.

Our install layout differs from Apple's on one axis: we land it at /usr/lib/system/lib_FoundationICU.so rather than /usr/lib/lib_FoundationICU.so. This is per the install-layout spike: the system/ subdir is the project's deliberate separation between self-built Apple-source libraries (libdispatch, libxpc, libCoreFoundation, libicu, etc.) and stock FreeBSD base under /usr/lib/. It also keeps /usr/local/lib/ free for any future pkg-installed parallel (e.g. devel/icu if the gershwin overlay ever needs a separate ICU for GNUstep), with zero risk of collision.

0.3 Comparison: which upstream to vendor?

Aspect | apple-oss-distributions/ICU | apple/swift-foundation-icu
ICU version | 76.142.2 | 74.0
License | Unicode V3 | Apache 2.0
_foundation_unicode/ layout | Absent; needs symlink alias | Native
Trim level | Full ICU, all features | Slimmed for SwiftPM
Build system | Autotools (FreeBSD-supported) | CMake
Build flags for FreeBSD | Set them all explicitly | Override Darwin-specific defaults
Designed for | General use | swift-corelibs-foundation specifically
Tree size | 137 MB | 262 MB (more pre-built data)

0.4 Decision: pivot to swift-foundation-icu

Replace src/libicu/ with apple/swift-foundation-icu. Reasoning: the _foundation_unicode/ layout is native (no symlink papering-over a namespace mismatch); Apache 2.0 is cleaner than Unicode V3 for our license stack; the package is designed for the exact CF surface we're building. The CMake build needs a few Darwin-assumption overrides for FreeBSD but doesn't require any symlinks or post-install renames. ICU 74 vs 76 is a minor version delta — Apple curates this fork specifically for swift-corelibs-foundation, so the version drift is intentional.

The original 0022d49 vendor commit will be reverted by a follow-up commit in this same session; the swift-foundation-icu vendor lands as a clean replacement.

Sections 1–12 below were drafted against the original vendor-source choice; portions about the build pipeline, header symlink hack, and 3-lib-vs-unified-lib choice were reasonable for that codebase but no longer apply once we pivot. The build-defines list (§4), CF-side rewire instructions (§9), dropped-CF-files matrix (§10), and ICU-symbol inventory (§8) carry over directly.

1. What we (originally) vendored, what we didn't

Records the pre-pivot vendor shape; superseded by §0.4. Kept for context.

Vendored: 2026-05-15, commit 0022d49. Upstream: github.com/apple-oss-distributions/ICU tag ICU-76142.2. License: Unicode License V3 (permissive, BSD-like). Tree size: 137 MB at src/libicu/ (109 MB of that is the CLDR data sources at icu/icu4c/source/data/).

Contents

  0. Mid-plan vendor-source pivot
  1. What we (originally) vendored, what we didn't
  2. Apple-vs-upstream divergence (apple-oss-distributions)
  3. Build strategy: CMake (post-pivot)
  4. Required CPP defines
  5. CMake invocation
  6. Header layout (now native)
  7. Data bundle: what we ship + the repo bloat tradeoff
  8. Which ICU symbols CF actually calls
  9. Re-threading libCoreFoundation
  10. Dropped CF files: which can come back
  11. Commit-by-commit plan
  12. Open questions for review

1. What we vendored, what we didn't

The vendor commit drops 6,853 files into src/libicu/. To keep the working tree tractable, the following upstream subtrees were excluded:

Path | Size | Reason
icu/icu4c/source/test/ | ~50 MB | Conformance + unit tests; we don't run them on the ISO
icu/icu4c/source/ICU.xcodeproj/ | ~2.2 MB | Apple's Xcode project; we use autotools
icu/icu4c/source/allinone/ | ~96 KB | Xcode all-in-one workspace
icu/icu4c/source/samples/ | ~212 KB | Example programs
icu/icu4c/source/xc_* | ~12 KB | Xcode test plans / configs

What's kept:

2. Apple-vs-upstream divergence

The source tree under icu/icu4c/source/ is 99% pristine upstream IBM ICU. Apple's divergence is cleanly separated into:

3. Build strategy — CMake (post-pivot)

swift-foundation-icu ships a CMakeLists.txt at icuSources/CMakeLists.txt that declares cmake_minimum_required(VERSION 3.24). Build via:

cmake -G Ninja /tmp/libicu/icuSources \
    -DCMAKE_INSTALL_PREFIX=/usr \
    -DCMAKE_INSTALL_LIBDIR=lib/system \
    -DCMAKE_BUILD_TYPE=Release
ninja
ninja install

This matches the existing libdispatch build pattern in build.sh exactly — same cmake -G Ninja invocation shape, same CMAKE_INSTALL_LIBDIR override to land libs at /usr/lib/system/.

Lib output: unified libicucore-style. The CMakeLists sets U_COMBINED_IMPLEMENTATION, U_COMMON_IMPLEMENTATION, U_I18N_IMPLEMENTATION, U_IO_IMPLEMENTATION, and U_TOOLUTIL_IMPLEMENTATION; the produced library matches the shape of Apple's macOS libicucore.dylib. CF's link line becomes -l_FoundationICU (or whatever target name we settle on) instead of -licuuc -licui18n. As on macOS, that is one library file; ours installs under /usr/lib/system/ rather than /usr/lib/.

Toolchain (all already in buildpkgs.txt):

The gmake restore that was needed for the autotools path is no longer required — this is a pure win.

4. Required CPP defines

swift-foundation-icu's CMakeLists.txt already sets most of these. The post-pivot work is mostly overriding a few Darwin assumptions:

Define | Upstream default | Action for FreeBSD
U_DISABLE_RENAMING | 1 on Darwin only | Force to 1 on FreeBSD. CF references un-renamed symbols (ucol_open, not the version-suffixed ucol_open_74); without this CF won't link. Patch the CMake conditional or set -DU_DISABLE_RENAMING=1 in both CMAKE_C_FLAGS and CMAKE_CXX_FLAGS.
U_COMBINED_IMPLEMENTATION | Set | Keep. Produces a unified library matching Apple's libicucore shape.
U_COMMON_IMPLEMENTATION | Set | Keep.
U_SHOW_CPLUSPLUS_API=1 | Set | Keep.
U_SHOW_INTERNAL_API=1 | Set | Keep.
U_HAVE_XLOCALE_H=1 | Set | Verify. FreeBSD does ship <xlocale.h> (in libc) but the API differs slightly from macOS. May need to patch the value or add a compat shim depending on which xlocale calls CF exercises.
MAC_OS_X_VERSION_MIN_REQUIRED=101500 | Set | Drop on FreeBSD. This gates Apple-version-specific code paths in ICU sources; it is meaningless on FreeBSD and may pull in macOS-only API references.
U_HAVE_STRTOD_L=1 | Set | Keep. FreeBSD libc provides strtod_l.
U_TIMEZONE_PACKAGE="icutz44l" | Set | Verify that the timezone package ships in the vendored data; otherwise drop.
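One way to check the U_DISABLE_RENAMING row after a build is to scan the library's exported symbols for version suffixes. The sketch below assumes ICU's usual renaming scheme, in which C entry points gain a numeric suffix (ucol_open becomes ucol_open_74 in ICU 74). The function name renamed_icu_symbols is hypothetical, and the pattern is a heuristic rather than an exhaustive check.

```shell
# renamed_icu_symbols
# Reads nm-style output on stdin (symbol name in the last column) and prints
# every u*-prefixed symbol that ends in _<digits>, i.e. looks version-renamed.
# Assumption: renamed C entry points look like ucol_open_74.
renamed_icu_symbols() {
    awk '$NF ~ /^u[a-z]*_[A-Za-z0-9_]*_[0-9]+$/ { print $NF }'
}
```

A possible use is `nm -D --defined-only /usr/lib/system/lib_FoundationICU.so | renamed_icu_symbols`, where the library file name is still the placeholder from §9. Empty output would suggest the define took effect.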

One additional CF-side define is needed (in libCoreFoundation/Makefile, not ICU's): __HAS_APPLE_ICU__=1. This enables CF's Apple-ICU-only code paths (e.g. ualoc_localizationsToUse in CFBundle_Locale.c). Three currently-dropped CF files (CFLocale.c, CFCalendar.c, CFDateFormatter.c) also have __HAS_APPLE_ICU__ guards around Apple-private ICU extensions; setting it to 1 lets them rejoin SRCS.

5. CMake invocation

The exact in-chroot reproducer (lands in build.sh's libicu step):

mkdir -p /tmp/libicu-build && cd /tmp/libicu-build && \
cmake -G Ninja /tmp/libicu/icuSources \
    -DCMAKE_INSTALL_PREFIX=/usr \
    -DCMAKE_INSTALL_LIBDIR=lib/system \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_C_FLAGS="-DU_DISABLE_RENAMING=1" \
    -DCMAKE_CXX_FLAGS="-DU_DISABLE_RENAMING=1" && \
ninja && \
ninja install

Key shape:

Anticipate one or two CMakeLists patches for the Darwin-specific assumptions called out in §4 (the MAC_OS_X_VERSION_MIN_REQUIRED compile def and the xlocale handling). These patches land in the libicu vendor tree as project-side modifications, recorded in src/libicu/README.md.
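After `ninja install`, a quick layout check can confirm that the library and headers landed where this plan expects. This is a sketch under assumptions: the helper name check_icu_install is hypothetical, the library file name is passed in because the CMake target name is not yet confirmed, and uloc.h is used as a representative header only because CF includes it.

```shell
# check_icu_install ROOT LIBNAME
# ROOT is the install root ("" for the live system, or a staging directory).
# Prints one line per missing artifact; returns 0 only if both are present.
check_icu_install() {
    icu_status=0
    if [ ! -e "$1/usr/lib/system/$2" ]; then
        echo "missing: /usr/lib/system/$2"
        icu_status=1
    fi
    if [ ! -f "$1/usr/include/_foundation_unicode/uloc.h" ]; then
        echo "missing: /usr/include/_foundation_unicode/uloc.h"
        icu_status=1
    fi
    return $icu_status
}
```

Inside the chroot this might be called as `check_icu_install "" lib_FoundationICU.so`, with the file name swapped for whatever the build actually produces.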

6. Header layout (now native)

Eleven CF source files include ICU headers via Apple's private namespace prefix:

#include <_foundation_unicode/uloc.h>
#include <_foundation_unicode/ucal.h>
#include <_foundation_unicode/uchar.h>
...

swift-foundation-icu ships these headers natively at icuSources/include/_foundation_unicode/ (212 headers in that directory). The CMakeLists installs them to ${CMAKE_INSTALL_INCLUDEDIR}/_foundation_unicode/, which with our -DCMAKE_INSTALL_PREFIX=/usr lands at /usr/include/_foundation_unicode/. No symlink alias, no install-time rename, no post-install fixup: the layout CF expects is the layout the package produces.

This is the single biggest payoff from the §0 vendor-source pivot: the namespace-mismatch problem that drove a symlink workaround in the original draft simply doesn't exist with this upstream.

7. Data bundle — what we ship + the repo bloat tradeoff

swift-foundation-icu pre-builds ICU's CLDR data into 4 hex-encoded source-file chunks at icuSources/common/icu_packaged_main_data.{0,1,2,3}.inc.h (~50 MB each, 200 MB total). One icu_packaged_data.cpp #includes all four; the compiler emits the hex literals into the .o, and the linker bakes them into the data section of libicucore.so.

Layer | Size | Notes
Repo (vendored hex source) | ~224 MB raw / ~60–100 MB git-packed | Build-time only; never ships on the ISO
Build-tree intermediate (.o files) | ~40 MB raw bytes | Hex decodes ~5:1 to binary
Installed libicucore.so | ~40–50 MB total | Per-byte data + ~3–5 MB of compiled C/C++; lands at /usr/lib/system/
Headers | ~5.5 MB | Lands at /usr/include/_foundation_unicode/; build-tools only, runtime cost zero

For comparison: macOS ships /usr/lib/libicucore.dylib at roughly the same size — Apple's slimming applied to the swift-foundation-icu chunks gets us to the same place.

7.1 Decision: vendor the hex chunks

Honor the project's vendor-everything pattern. The 4 chunks are how Apple ships swift-foundation-icu — treating them as fetch-at-build-time would set a precedent that some vendored deps are fetchable, weakening the rule. Trade: ~60–100 MB permanent repo growth (after git-pack), 4 GitHub 50-MB push warnings (under the 100-MB hard limit, accepted), slower fresh clones forever. Reproducible offline builds remain possible. Bit-for-bit reproducible from git SHA alone. No GitHub LFS infrastructure or distfile cache to maintain.
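Before pushing, each chunk can be classified against the two push thresholds mentioned above. The sketch below assumes those thresholds are 50 MiB (warning) and 100 MiB (rejection); the helper name classify_push_size is hypothetical.

```shell
# classify_push_size BYTES
# Prints "ok", "warn", or "reject" for a file of the given size.
# Assumption: warning above 50 MiB, hard rejection above 100 MiB.
classify_push_size() {
    if [ "$1" -gt $((100 * 1024 * 1024)) ]; then
        echo reject
    elif [ "$1" -gt $((50 * 1024 * 1024)) ]; then
        echo warn
    else
        echo ok
    fi
}
```

Feeding it the byte count of each icu_packaged_main_data chunk (for example from `wc -c`, with padding stripped) should print "warn" four times if the sizes in the table hold, and never "reject".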

Safety net: if the push fails or we change our minds, git filter-repo can strip the 4 chunks from history in a single one-shot rewrite. Same operation could simultaneously strip the abandoned 0022d49 apple-oss-distributions vendor commit if we want a clean cut.

7.2 Locale trim — deferred

swift-foundation-icu ships its data bundle pre-built — we don't drive the data builder ourselves. Trimming locales would mean either (a) regenerating the chunks ourselves from upstream IBM ICU's data/ tree using icutools.databuilder + a locale-filter JSON, or (b) accepting whatever locale set Apple ships and letting unused locales dead-code-strip naturally (they won't — ICU loads them from the data section by name lookup). Path (a) is a real engineering project; defer until measurement says it matters.

8. Which ICU symbols CF actually calls

From the 4 in-SRCS CF files that touch ICU (per the integration-shape agent audit, 2026-05-15):

CF file | ICU symbol | Sublibrary | Apple-only?
CFString.c | u_hasBinaryProperty | libicuuc | No
CFString.c | u_getIntPropertyValue | libicuuc | No
CFTimeZone.c | u_strlen | libicuuc | No
CFTimeZone.c | ucal_close + 8 more ucal_* | libicui18n | No
CFBundle_Locale.c | ualoc_localizationsToUse | libicuuc | Yes
CFStringUtilities.c | 8 × ucol_* | libicui18n | No

The Apple-only ualoc_localizationsToUse is the symbol that disqualified FreeBSD's devel/icu port and forced the vendor-Apple-ICU decision.
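The Apple-only column can be spot-checked from compiled CF objects by filtering their undefined symbols. This sketch assumes that every symbol with a ualoc_ prefix is Apple-only, as this document treats them; the helper name apple_only_icu_refs is hypothetical.

```shell
# apple_only_icu_refs
# Reads `nm -u` output (undefined symbols, name in the last column) on stdin
# and prints the unique ualoc_* symbols it references, sorted.
apple_only_icu_refs() {
    awk '$NF ~ /^ualoc_/ { print $NF }' | sort -u
}
```

For example, `nm -u CFBundle_Locale.o | apple_only_icu_refs` would be expected to print ualoc_localizationsToUse. Any output at all means a stock ICU without Apple's extensions cannot satisfy the link.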

9. Re-threading libCoreFoundation

Once libicu is built and installed, the changes to src/libCoreFoundation/Makefile:

# Add to CFLAGS (the CF-side, after the existing CFLAGS+= lines):
CFLAGS+=    -D__HAS_APPLE_ICU__=1

# Add to LDADD (after the existing -ldispatch -lBlocksRuntime -lpthread).
# swift-foundation-icu produces a single unified library; the exact
# CMake target name (libicucore.so / lib_FoundationICU.so / etc.)
# will be confirmed at build time and pinned here:
LDADD+=     -l_FoundationICU         # placeholder; confirm post-build

The single-library link line is simpler than the split-lib -licuuc -licui18n -licudata we'd have needed under the apple-oss-distributions/ICU plan, and matches Apple's macOS shape (one libicucore.dylib).

SRCS additions land in §10.

10. Dropped CF files — which can come back

The libCoreFoundation Makefile currently drops 16 CF .c files (the Option A drop list). With ICU wired in, these split three ways:

10.1 Safe to re-add immediately

Use only standard ICU symbols (no Apple-only APIs). Add to SRCS once ICU is linked:

10.2 Safe to re-add with __HAS_APPLE_ICU__=1

Have Apple-private code paths gated by __HAS_APPLE_ICU__; with the flag set + Apple's ICU linked, the Apple paths activate:

10.3 Verify first

Need a per-file check before re-adding:

Net: the SRCS list grows from 68 back up toward 84 (full upstream). The dropped BlockRuntime/ stays dropped (libdispatch provides it).

11. Commit-by-commit plan

Tasks already created in the in-session task list. Numbering is the task ID. Post-pivot, the plan is shorter than the original autotools-based draft because CMake matches our existing libdispatch build pattern exactly.

Task | Commit | Files touched | CI marker added
27 (superseded) | libicu: vendor Apple ICU into src/libicu/ (apple-oss-distributions) | 6,853 vendored files + README | none
33 | libicu: replace src/libicu with apple/swift-foundation-icu | vendor swap + README rewrite | none
28 | build.sh: cmake/ninja libicu in chroot, install to /usr/lib/system/ | build.sh | none
28 | libicu: any FreeBSD-specific CMakeLists patches (xlocale, MAC_OS_X_VERSION_MIN_REQUIRED) discovered in CI | src/libicu/icuSources/CMakeLists.txt | none
31 | tests: test_libicu.c (u_init + u_getVersion + ucal_open) and ICU-OK marker | src/libicu-tests/, overlays/usr/tests/.../run.sh, tests/boot-test.sh | ICU-OK
29 | libCoreFoundation: -l_FoundationICU, -D__HAS_APPLE_ICU__=1, restore 7 SRCS files (the safe-now set) | src/libCoreFoundation/Makefile | none
29 | libCoreFoundation: restore 7 __HAS_APPLE_ICU__-gated SRCS files | src/libCoreFoundation/Makefile | none
20 | build.sh: rebuild libCoreFoundation against ICU + test_corefoundation | build.sh, smoke harness | COREFOUNDATION-OK
30 | docs: rewrite freebsd-libcorefoundation-icu-audit.html §14 with the Option-A miss + Option-C pivot record | pkgdemon.github.io | none

Each commit gets its own CI iteration. Expect 2–4 fixup commits in task 28 for FreeBSD-specific CMake assumption overrides; the swift-foundation-icu CMakeLists is small and well-structured, so the FreeBSD port surface is contained.
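Each CI iteration can gate on the two markers from the table above. The following is a minimal sketch of such a check; the helper name missing_markers and the idea of a single captured log file are assumptions, not a description of the existing boot-test harness.

```shell
# missing_markers LOGFILE
# Prints each expected CI marker that does NOT appear in LOGFILE.
# Empty output means both ICU-OK and COREFOUNDATION-OK were seen.
missing_markers() {
    for marker in ICU-OK COREFOUNDATION-OK; do
        # -q: quiet, -F: fixed string; "--" guards against odd patterns.
        grep -qF -- "$marker" "$1" || echo "$marker"
    done
}
```

A harness could fail the run whenever `missing_markers boot.log` prints anything, where boot.log stands in for wherever the boot output is captured.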

12. Open questions for review

None blocking. The two questions in earlier drafts have been resolved:

  1. libicuio: moot post-pivot. swift-foundation-icu's CMakeLists builds common/ + i18n/ + io/ as one unified library; no per-subdir disable knob exists like autotools' --disable-io. The 216 KB io/ subtree goes in for free as part of libicucore.
  2. devel/icu shadowing: not a real concern. pkgbase ships no ICU, nothing in buildpkgs.txt (cmake / ninja / pkgconf) pulls it in transitively, devel/icu would install to /usr/local/{lib,include}/, and our CF CFLAGS never reach into /usr/local/. Different prefixes, no collision possible.
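The claim that CF's flags never reach into /usr/local/ can be turned into a cheap guard. This is a sketch only; the helper name flags_reach_usr_local is hypothetical, and it inspects a flag string you pass in rather than parsing the Makefile itself.

```shell
# flags_reach_usr_local "FLAGS"
# Returns 0 if any whitespace-separated word in FLAGS mentions /usr/local,
# and 1 otherwise. The argument is deliberately left unquoted in the loop
# so the shell splits it into words.
flags_reach_usr_local() {
    for flag in $1; do
        case $flag in
            */usr/local|*/usr/local/*) return 0 ;;
        esac
    done
    return 1
}
```

For instance, `flags_reach_usr_local "$CFLAGS $LDFLAGS" && echo "unexpected /usr/local reference"` could run as part of the libCoreFoundation build step.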

Net effect: roughly 40–50 MB added to the ISO (one unified library under /usr/lib/system/, per the size table in §7), the dropped CF files come back on the schedule in §10, and the CF surface reaches parity with macOS for what launchctl needs. The original "no /usr/local ever" rule holds: libicu lives at /usr/lib/system/ and its headers install natively under /usr/include/_foundation_unicode/, with no symlink alias. Apple-only ualoc_* works because we're shipping Apple's ICU.

Plan derived from four parallel research agents run against the freshly-vendored src/libicu/ tree on 2026-05-15: (1) Apple-vs-upstream divergence audit, (2) FreeBSD autotools build-path audit, (3) CF↔ICU integration-shape audit, (4) ICU data-size and ship-plan audit. Synthesized into this single source of truth for execution; supersedes §13 of the ICU audit which estimated Option C as a "multi-day sub-project" without the empirical grounding this doc provides.