Chapter 17 / 40

The harness

The harness is the Fedora bootc box that runs in the Gennevilliers server room. Everything on it is described by plain text in two repos: sodimo/harness carries the OS, sodimo/dotfiles carries the Podman Quadlets. Rebuild the machine tomorrow, point it at both, press the button — same system, same ports, same data.

Status: In Progress. Proven on Tom’s personal machines; activates on the Sodimo box the day the Framework Desktop is racked.

For per-quadlet detail — image tag, ports, env, secrets, upgrade path — see Quadlet reference. This chapter is the architecture.


Hardware

Component     Spec
Chassis       Framework Desktop (AMD Strix Halo)
CPU / iGPU    AMD Ryzen AI Max+ 395 / Radeon 8060S (gfx1151, Vulkan path)
RAM           128 GB unified
Storage       2 TB NVMe (LUKS + LVM for data volumes)
NAS           Synology on 192.168.0.x, dump drop + nightly backup target
UPS           Same unit as the existing Sodimo server; shared rack

Kernel cmdline

The Strix Halo iGPU (gfx1151) only reaches nearly all of the 128 GB unified pool (about 124 GiB) when the kernel is booted with the flags kyuz0 validated. The dev-box Framework Desktop currently boots with amd_iommu=off ttm.pages_limit=33554432 ttm.page_pool_size=33554432, which is the pre-kyuz0 baseline. The target is:

iommu=pt amdgpu.gttsize=126976 ttm.pages_limit=32505856

Rationale: amdgpu.gttsize=126976 (a value in MiB) is the flag that hands ~124 GiB of GTT to the iGPU; without it the local-heavy slot (gpt-oss-120b UD-Q8_K_XL at 65 k ctx) OOMs on load. iommu=pt replaces amd_iommu=off: the IOMMU runs in pass-through mode instead of being disabled. ttm.pages_limit=32505856 caps pinned memory at 124 GiB (32505856 pages of 4 KiB each).
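The two size flags use different units, so it is worth checking that they describe the same ceiling. A minimal sketch, assuming amdgpu.gttsize is expressed in MiB and TTM pages are 4 KiB (the x86_64 default); the function names are illustrative only:

```shell
# Convert an amdgpu.gttsize value (MiB) to whole GiB.
gtt_mib_to_gib() {
  echo $(( $1 / 1024 ))
}

# Convert a ttm.pages_limit value (4 KiB pages, assumed) to whole GiB.
ttm_pages_to_gib() {
  echo $(( $1 * 4096 / 1073741824 ))
}

gtt_mib_to_gib 126976      # target GTT size
ttm_pages_to_gib 32505856  # target pinned-memory cap
ttm_pages_to_gib 33554432  # pre-kyuz0 baseline limit
```

Both target values come out at 124 GiB, while the baseline limit corresponds to 128 GiB.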

Persist via the bootc image (kargs baked in post-install, not per-host tuning). Tracked as sodimo/harness#11.
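One way to carry kernel arguments inside a bootc image is a TOML drop-in under /usr/lib/bootc/kargs.d/. Whether sodimo/harness#11 takes this route is not stated here; the sketch below is an assumption, and both the file name 10-strix-halo.toml and the idea of staging it through mkosi.extra are hypothetical.

```shell
# write_kargs_dropin ROOT: write a bootc kargs.d drop-in under ROOT.
# The file name is hypothetical; the three kargs are the target set above.
write_kargs_dropin() {
  dir="$1/usr/lib/bootc/kargs.d"
  mkdir -p "$dir"
  cat > "$dir/10-strix-halo.toml" <<'EOF'
kargs = ["iommu=pt", "amdgpu.gttsize=126976", "ttm.pages_limit=32505856"]
match-architectures = ["x86_64"]
EOF
}

# Example (assumed layout): stage the file into the image's extra tree.
# write_kargs_dropin mkosi.extra
```

Keeping the flags in the image means every host booted from it gets the same command line, which matches the "not per-host tuning" intent.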

Verify on the live box:

cat /proc/cmdline
cat /sys/class/drm/card*/device/mem_info_gtt_total   # reported in bytes; expect ~124 GiB
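The same check can be scripted so that a missing flag is named explicitly. This is a sketch only: the helper names are made up for illustration, and it assumes mem_info_gtt_total holds a plain byte count.

```shell
# has_karg CMDLINE FLAG: succeed if FLAG appears as a whole word in CMDLINE.
# Whole-word matching keeps amd_iommu=off from passing for iommu=off.
has_karg() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# check_kargs CMDLINE: print each missing target flag; fail if any is missing.
check_kargs() {
  cmdline=$1
  missing=0
  for flag in iommu=pt amdgpu.gttsize=126976 ttm.pages_limit=32505856; do
    if ! has_karg "$cmdline" "$flag"; then
      echo "missing: $flag"
      missing=1
    fi
  done
  return "$missing"
}

# gtt_gib FILE: read a byte count from FILE and print it in whole GiB.
gtt_gib() {
  echo $(( $(cat "$1") / 1073741824 ))
}

# On the live box:
# check_kargs "$(cat /proc/cmdline)"
# for f in /sys/class/drm/card*/device/mem_info_gtt_total; do gtt_gib "$f"; done
```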

Fedora delta to watch: kyuz0’s host-config band is Fedora 42/43, kernel ≥ 6.18.6-200, firmware ≥ 20260110. The harness is on Fedora 44 (kernel 6.19.13-300.fc44), one release ahead of that band. A smoke test on real hardware is mandatory before declaring the box production; pin back to F43 if gfx1151 regresses.


Local-AI reference dependency: kyuz0

Sodimo’s local-AI runtime follows github.com/kyuz0/amd-strix-halo-toolboxes as the authoritative reference for this silicon — llama-server invocation flags, kernel kargs, firmware pins, backend-selection heuristics. Treated as reference, not runtime: the gateway image stays on ghcr.io/mostlygeek/llama-swap:vulkan, and kyuz0’s inputs are copied (not pulled) into the quadlet configs.

Current pin: 1421e8706020e8d7e797f71b9f28cd3072e7f868 (captured 2026-04-22).

The mandatory Strix-Halo flag set is applied to every model entry in llama-swap.yaml — see Quadlet reference → llama-swap for the concrete cmd: blocks. Supporting docs live in the dotfiles repo:

  • sodimo/dotfiles/docs/kyuz0-toolbox.md — interpolation map (what was copied from where), RADV vs AMDVLK vs ROCm trade-offs, open questions.
  • sodimo/dotfiles/docs/resync-runbook.md — procedure for refreshing against a newer kyuz0 commit.
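Before starting a resync it helps to know whether the pin has drifted from upstream. The sketch below is not taken from the runbook; the helper names are hypothetical, and it assumes the upstream default branch is what the pin should be compared against.

```shell
PIN=1421e8706020e8d7e797f71b9f28cd3072e7f868
UPSTREAM=https://github.com/kyuz0/amd-strix-halo-toolboxes

# is_full_sha VALUE: succeed if VALUE is a 40-character lowercase hex SHA-1.
is_full_sha() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# pin_status PIN HEAD: print "current" when they match, otherwise "stale".
pin_status() {
  if [ "$1" = "$2" ]; then echo current; else echo stale; fi
}

# upstream_head: ask the remote for its HEAD commit (needs network access).
upstream_head() {
  git ls-remote "$UPSTREAM" HEAD | cut -f1
}

# Example: pin_status "$PIN" "$(upstream_head)"
```

A "stale" result only says upstream has moved; deciding what to copy across remains a manual step.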

Two-layer architecture

The box is built from two repos with one split rule: system-scope into the bootc image; user-scope into dotfiles.

  • sodimo/harness — bootc OS. mkosi.* build config, base-Fedora package selection, cosign.pub, iso.toml, the registry-auth policy, and the handful of system-scope units: tailscaled, cloudflared, cockpit. The mkosi.extra/ tree is deliberately tiny — the image is a dumb substrate.
  • sodimo/dotfiles — chezmoi-managed app layer. Every app-layer quadlet (caddy, postfix, dovecot, rspamd, piler, llama-swap, llama.cpp, litellm, openwebui, openwebui-db, twenty-*, vaultwarden, ETL timer, email drain) plus .volume, .network, and override files under home/dot_config/containers/systemd/. chezmoi renders any .tmpl files; the .container files themselves are plain to stay podman generate-compatible. Runs user-scope, rootless, as the sodimo user.

Why split: app versions churn weekly, OS versions churn monthly; a broken quadlet fails one user service, a broken system unit can brick the boot; rootless-by-default is the safer posture for the app surface. The leger-labs upstream already runs this split at 12 quadlets.
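The split rule is small enough to write down as a lookup. A sketch, assuming the system-scope set is exactly the three units named above and everything else belongs to the app layer; repo_for is an illustrative name:

```shell
# repo_for UNIT: print the repo that owns UNIT under the split rule.
repo_for() {
  case "$1" in
    tailscaled|cloudflared|cockpit) echo sodimo/harness ;;   # system-scope
    *)                              echo sodimo/dotfiles ;;  # user-scope, rootless
  esac
}

repo_for cockpit
repo_for vaultwarden
```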


Quadlet inventory

One-line summaries only. See Quadlet reference for image tags, ports, env, and secrets.

Quadlet                            Role
caddy                              Reverse proxy, fans out 127.0.0.1:80 from cloudflared into container network
cockpit                            Web admin UI (cockpit.sodimo.eu)
openwebui + -db + -redis           Team chat UI at chat.sodimo.eu, dead-simple no-auth behind CF Access
litellm + -db + -redis             Single gateway (litellm:4000) for every AI caller, local and cloud
llama-swap                         On-demand model router in front of llama.cpp, Vulkan backend
twenty + -worker + -db + -redis    CRM at crm.sodimo.eu; MCP wrapper lives in sodimo/mcp
vaultwarden                        Password vault at vault.sodimo.eu (chapter The vault)
postfix                            SMTP submission + MX inbound
dovecot                            IMAP + LMTP mailbox store
rspamd                             Spam filter + DKIM signer
piler                              Indefinite full-text email archive at archive.sodimo.eu
sodiwin-etl (timer + service)      Nightly CSV → D1 at 03:00 Europe/Paris
sodimo-email-drain (service)       Pull consumer for CF Queue email_outbox; no inbound port

Tailscale runs as a system service (break-glass only, not an employee tool). No Kubernetes, no Helm, no vendor dashboard.


Topology

[Diagram: Fedora bootc harness, Gennevilliers]

Inside the box:
  • Mail: postfix, rspamd, dovecot, piler
  • Local AI + chat: openwebui, litellm, llama-swap, llama.cpp
  • twenty (CRM), vaultwarden, cockpit
  • caddy (TLS + routing)
  • sodimo-email-drain, sodiwin-etl timer

Outside the box: ISP, CF Queue email_outbox, NAS, D1

Edge labels: 25/587/993, outbound, HTTP pull

The drain service is the only bridge from Cloudflare back into the box, and it is a pull — no inbound port (Principle 3).
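A pull consumer makes the direction of the connection explicit: the box dials out, and nothing listens for Cloudflare. The sketch below is illustrative, not the drain service's actual code. The endpoint path and JSON field names reflect Cloudflare's HTTP pull API for Queues as best understood here and should be checked against the current API reference; the batch size and timeout are arbitrary.

```shell
API=https://api.cloudflare.com/client/v4

# pull_url ACCOUNT_ID QUEUE_ID: endpoint a pull consumer POSTs to (assumed path).
pull_url() {
  echo "$API/accounts/$1/queues/$2/messages/pull"
}

# pull_once ACCOUNT_ID QUEUE_ID TOKEN: fetch one batch of messages.
# The request is outbound HTTPS, so no inbound port is opened on the box.
pull_once() {
  curl -fsS -X POST "$(pull_url "$1" "$2")" \
    -H "Authorization: Bearer $3" \
    -H "Content-Type: application/json" \
    -d '{"batch_size": 10, "visibility_timeout_ms": 30000}'
}

# Example (placeholders, not real identifiers):
# pull_once "$CF_ACCOUNT_ID" "$CF_QUEUE_ID" "$CF_API_TOKEN"
```

Messages fetched this way still have to be acknowledged after processing; that step is left out of the sketch.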


Provisioning

Pre-req: Framework Desktop racked, NAS and UPS wired, ISP ticket HB2 open for port 25.

  1. Flash a Fedora bootc installer USB with the latest sodimo/harness image.
  2. Boot, partition the data drive with LUKS + LVM, install.
  3. Paste the Tailscale auth key (from paper) and join the tailnet.
  4. Paste secrets env files (from paper + backup) into /etc/sodimo/secrets/.
  5. As the sodimo user: chezmoi init --apply sodimo/dotfiles — clones the repo, materializes ~/.config/containers/systemd/.
  6. systemctl --user daemon-reload && systemctl --user enable --now sodimo.target.
  7. Restore Dovecot mailstore, Paperclip Postgres, OpenWebUI Postgres from NAS snapshot.
  8. Verify: mail flows, chat.sodimo.eu answers, drain consumes queue, Cockpit green.
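Between steps 5 and 6 it is cheap to confirm that chezmoi actually materialized quadlet files before enabling the target. A sketch with hypothetical helper names, assuming the quadlets land directly in the directory named in step 5:

```shell
# count_quadlets DIR: number of *.container files directly under DIR.
count_quadlets() {
  find "$1" -maxdepth 1 -name '*.container' | wc -l | tr -d ' '
}

# ready_to_enable DIR: succeed only if at least one quadlet was rendered.
ready_to_enable() {
  [ "$(count_quadlets "$1")" -gt 0 ]
}

# Example, as the sodimo user:
# ready_to_enable "$HOME/.config/containers/systemd" && systemctl --user daemon-reload
```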

End-to-end target: under two hours.


Operations

Reboots. systemd starts the units generated from the quadlet directory in dependency order; after a 3 a.m. power cut, mail is back up before anyone notices.

Upgrades. App-layer change (model swap, Twenty bump, Vaultwarden bump) = commit on sodimo/dotfiles, deploy with chezmoi apply && systemctl --user daemon-reload && systemctl --user restart <unit>. No reboot, no image rebuild. OS-layer change (kernel, system service, COPR pin) = commit on sodimo/harness, CI builds a new bootc image, deploy with bootc switch ghcr.io/sodimo/harness:<tag> && systemctl reboot. bootc rollback is the safety net — one command, one reboot, back to the previous known-good root. Upgrades are deliberate; no auto-updates.
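The two upgrade paths can be wrapped as small functions with a dry-run switch, so the exact commands can be reviewed before anything restarts. This is a sketch: the function names and the DRY_RUN variable are illustrative, and the image tag is whatever CI produced.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# deploy_app UNIT: app-layer change, no reboot and no image rebuild.
deploy_app() {
  run chezmoi apply &&
  run systemctl --user daemon-reload &&
  run systemctl --user restart "$1"
}

# deploy_os TAG: OS-layer change, switch to the new image and reboot.
deploy_os() {
  run bootc switch "ghcr.io/sodimo/harness:$1" &&
  run systemctl reboot
}

# rollback_os: go back to the previous known-good root.
rollback_os() {
  run bootc rollback &&
  run systemctl reboot
}

# Example: DRY_RUN=1 deploy_app caddy.service
```

Chaining with && stops at the first failing step, so a failed chezmoi apply never reaches the restart.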

Observability. Cockpit is Paul’s day-one surface for starting/stopping services, tailing journals, watching storage. Caddy routes log per-hostname; Postfix + rspamd + Dovecot log to the system journal with per-unit filters.

Backups. Nightly to Synology NAS (Dovecot mailstore, Postgres dumps, Piler archive, Vaultwarden SQLite). Weekly encrypted mirror to Cloudflare R2. A full bare-metal restore is rehearsed to complete in under two hours.

Handoff. The two repos are the artefact. A future engineer reads sodimo/harness for the OS and sodimo/dotfiles for the app surface and understands the box in an afternoon — no proprietary layer, no dashboard to learn, no consultant to call.


In flight: chezmoi source rewire

The harness bootc image currently bakes mecattaf/dotfiles as its chezmoi source (submodule at subprojects/dotfiles, referenced by mkosi.conf.d/subprojects.conf and chezmoi-update.service). Migrating to sodimo/dotfiles is tracked as sodimo/harness#10. Recommended path under discussion: drop chezmoi-update.service entirely and rely on the bootc-baked /usr/share/harness/dotfiles snapshot — avoids needing a deploy token for private-repo pulls, and matches the static-at-handoff posture. Updates then arrive via a new bootc image, not a daily chezmoi update.