Gershwin native installer backend — Copier.framework scoping plan

A fully-native, cross-platform GNUstep file-level copier to eventually replace the interim bsdtar step in the livecd installer.
Scoping plan, not yet scheduled
gershwin-system / gershwin-developer

Why Context

The livecd installer (init_chroot + unionfs plan) currently copies the system with bsdtar --acls --xattrs --fflags from a read-only mount of the pristine uzip. That is a sound interim answer — base-only, file-level, good metadata fidelity — but it has two ceilings:

The ISO always ships GNUstep, so a Foundation-based copier carries no incremental dependency — the GNUstep cost is already paid. The non-Foundation surface (xattrs, ACLs, BSD flags, sparse files) is a small, stable platform shim, not a third-party library. That makes a native Copier.framework the right long-term home: one engine, controlled operation ordering, no archive round-trip, reusable across the desktop, and — with the shim — the same code on FreeBSD and Linux Gershwin.

This is a scoping doc for a future, separate PR. The interim bsdtar installer ships first and stays until this is hardened; both are file-level walkers, so the swap is a localized change.

Goal What it must get right

A "no compromise" file-level copy preserves everything a block copy would, walking a tree:

| Category | Items | Notes |
|---|---|---|
| Trivial | mode bits, setuid/setgid/sticky, uid/gid, mtime/atime | lchown + chmod + utimensat (with AT_SYMLINK_NOFOLLOW). Ordering: chown before chmod (chown clears setuid/setgid by design). |
| Hardlinks | st_nlink > 1 → link, don't copy | (st_dev, st_ino) → first-path map; second sighting calls link(). /rescue is the canonical case (~200 links to one crunched binary). |
| Extended attrs | FreeBSD extattr_* / Linux *xattr | Same concept, different APIs and namespaces; needs both code paths. |
| ACLs | POSIX.1e + NFSv4 (FreeBSD); POSIX.1e (Linux) | FreeBSD acl_*_link_np handles both ACL types; Linux acl_*_file with ACL_TYPE_ACCESS/ACL_TYPE_DEFAULT. |
| BSD file flags | schg, uchg, sappnd, … | lchflags is clean on FreeBSD; the Linux FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls map only partially ("preserve what you can, document the divergence"). |
| Sparse files | SEEK_HOLE/SEEK_DATA | Not exposed by Foundation; drop to the fd via -fileDescriptor. Both kernels support the constants identically. |
| Special files | symlinks, device nodes, FIFOs | l* variants throughout for symlinks; mknod/mkfifo; sockets are skipped by convention. |
| Unpreservable | ctime, birthtime | No syscall sets either portably; document as known-unpreservable. |
Per-file operation order is load-bearing. stat source → create dest (open/mknod/mkfifo/symlink) → write content (sparse-aware) → set xattrs → set ACLs → chown → chmod → set mtime/atime → set flags last. Directories: recurse into children before setting the directory's own flags (once schg is on, nothing else about the entry can change).

Design Framework shape

A Copier.framework (libs-base citizen) plus a thin CLI wrapper (gscopy, or Copier per GNUstep tool-naming). The framework earns its keep because GWorkspace's file ops, a future Disk Utility, and a backup tool can all consume the same engine. Class layout falls out of Cocoa idioms:

| Class | Role |
|---|---|
| GSFileCopier | Coordinator: takes source/destination/options, walks the tree single-threaded, drives the per-entry pipeline. Delegate + notifications for progress and per-file errors (a GUI consumer gets sensible callbacks for free). |
| GSCopyOptions | Flag bag: preserveXattrs, preserveACLs, preserveFlags, preserveHardlinks, oneFileSystem, sparseFiles, excludePatterns, dryRun, deleteExtra, verify. |
| GSFileMetadata | One entry's full metadata; the platform shim lives in GSFileMetadata+FreeBSD.m / GSFileMetadata+Linux.m. |
| GSInodeMap | NSMapTable keyed by (st_dev, st_ino) for hardlink reconstruction. |
| GSExcludeMatcher | fnmatch-backed rsync-style include/exclude patterns. |
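
To make the fnmatch-backed matcher concrete, here is a minimal C sketch assuming a first-match-wins rule list. struct gs_rule and gs_path_excluded are hypothetical names, and real rsync filter semantics (anchoring, trailing slashes, `**`) are deliberately not reproduced.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical rule: include != 0 keeps a match, include == 0 drops it. */
struct gs_rule {
    int include;
    const char *pattern;     /* shell glob, matched against a relative path */
};

/*
 * Returns 1 if relpath should be skipped, 0 if it should be copied.
 * The first rule whose pattern matches decides; paths that match no
 * rule are copied. FNM_PATHNAME keeps '*' from matching across '/'.
 */
static int gs_path_excluded(const struct gs_rule *rules, size_t n, const char *relpath)
{
    for (size_t i = 0; i < n; i++)
        if (fnmatch(rules[i].pattern, relpath, FNM_PATHNAME) == 0)
            return !rules[i].include;
    return 0;
}
```

With the rules { include "var/db/pkg", exclude "var/db/*", exclude "tmp/*" }, the path var/db/pkg is copied, other direct children of var/db and tmp are skipped, and everything else is copied.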

The platform shim — the entire non-Foundation surface

~600–800 lines of C wrapped in Objective-C, behind #ifdef __FreeBSD__ / #ifdef __linux__ in the right files; the rest of the framework never knows which platform it's on. This is the same shape as libs-base's existing GS* platform splits (e.g. GSFileHandle.m), so it's idiomatic, not novel.

GSFileMetadata+FreeBSD.m   extattr_get_link / extattr_set_link / extattr_list_link
                           acl_get_link_np / acl_set_link_np   (POSIX.1e + NFSv4)
                           lchflags

GSFileMetadata+Linux.m     lgetxattr / lsetxattr / llistxattr
                           acl_get_file / acl_set_file  (ACL_TYPE_ACCESS / _DEFAULT)
                           FS_IOC_GETFLAGS / FS_IOC_SETFLAGS  (partial parity)

(shared)                   lseek SEEK_HOLE / SEEK_DATA  — identical on both kernels
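
The shared SEEK_HOLE / SEEK_DATA path could be sketched as below. copy_sparse is a hypothetical name, the 64 KiB buffer is an arbitrary choice, and the sketch assumes out_fd refers to a freshly created, empty regular file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes SEEK_DATA/SEEK_HOLE under this */
#endif
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA            /* fallback if feature macros were fixed earlier */
#define SEEK_DATA 3          /* same values on Linux and FreeBSD */
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical sketch: copy only the data extents of in_fd into out_fd,
 * leaving holes unwritten. Returns 0, or -1 with errno set.
 */
static int copy_sparse(int in_fd, int out_fd)
{
    struct stat st;
    char buf[65536];

    if (fstat(in_fd, &st) != 0)
        return -1;

    off_t pos = 0;
    while (pos < st.st_size) {
        off_t data = lseek(in_fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;                   /* only a trailing hole remains */
            return -1;
        }
        off_t hole = lseek(in_fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;

        /* Copy the extent [data, hole) at matching offsets. */
        for (off_t off = data; off < hole; ) {
            off_t left = hole - off;
            size_t want = left < (off_t)sizeof buf ? (size_t)left : sizeof buf;
            ssize_t n = pread(in_fd, buf, want, off);
            if (n < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            if (n == 0) {                /* source shrank underneath us */
                errno = EIO;
                return -1;
            }
            for (ssize_t done = 0; done < n; ) {
                ssize_t w = pwrite(out_fd, buf + done, (size_t)(n - done), off + done);
                if (w < 0) {
                    if (errno == EINTR)
                        continue;
                    return -1;
                }
                done += w;
            }
            off += n;
        }
        pos = hole;
    }
    /* Set the final length so a trailing hole is preserved. */
    return ftruncate(out_fd, st.st_size);
}
```

On a filesystem that does not track holes, SEEK_DATA reports the whole file as one data extent, so the same loop degrades to a plain copy.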

What Foundation already gives us

NSFileManager (recursive create, symlink create/resolve, attribute queries — NSFileSystemFileNumber for inodes, NSFileSystemNumber for the one-file-system st_dev check), NSDirectoryEnumerator (the walk), NSFileHandle (I/O, and -fileDescriptor for the lseek escape hatch), NSData (buffers), NSOperationQueue (parallel byte-copy / checksum, cap at CPU count — the directory walk itself stays single-threaded).

Scope v1 boundaries

Integration How it lands in the installer

The livecd installer's normal-install path is already a file-level walker (bsdtar from a read-only mount of /dev/md0.uzip). Swapping in Copier.framework is a localized change to gershwin-system/Library/Scripts/installer.sh — or, better, the installer logic moves into a small Foundation tool that calls the framework directly. The source stays the same (the read-only uzip mount); only the copy engine changes. Until then, the bsdtar step is the supported path, and this framework is built and hardened independently.

Bonus: the same framework, pointed at the live union mount with the right excludes, also powers the other live-ISO use case — snapshotting a running session (including tmpfs-resident pkg installs) to a backing store. The installer keeps the "deploy pristine base" path; the same engine powers "capture current session" if that ever becomes a feature. Two operations, one toolchain.

Phasing

  1. P1 — engine. Copier.framework + gscopy CLI: local-to-local copy, the platform shim, the per-file operation ordering, hardlink map, sparse support, exclude matcher, dry-run. Test against a known tree with the full metadata matrix.
  2. P2 — installer swap. Replace the installer's bsdtar step with the framework (or a Foundation installer tool). Gate behind the same boot-test + manual install verification the bsdtar path uses.
  3. P3 — later. Verify mode hardening; the session-snapshot use case; only then, network transport.