Chapter 27 / 40

Computer-use agent — out of scope

Status: Out of scope / deferred. This chapter documents a capability that is explicitly not part of the handoff. It exists so that a future reader of the manual understands what was considered, what was decided, and what it would take to revisit.


What the black-box principle decides.

Chapter 14 states the rule: Sodiwin is read-only to everything Sodimo runs outside it. The only channel between Sodiwin and the rest of the stack is the nightly CSV dump (Chapter 24). There is no write-path. Orders entered today are still entered by a human, in Sodiwin’s Windows UI, on the Windows workstation in Gennevilliers. Nothing about this changes during the engagement.


What a computer-use agent would have been.

A computer-use agent is a program that drives a graphical application the way a person would — moving a cursor, clicking fields, typing into inputs, reading the screen back. Because Sodiwin offers no API, no scripting surface, and no batch import that Sodimo can safely use, a write-path to Sodiwin necessarily goes through its UI.

The concrete shape, if it existed, would be:

  • A Windows VM — separate from the existing Sodiwin workstation — running the Sodiwin client, reachable from the Fedora harness over Tailscale. The VM is Windows because Sodiwin is Windows; Fedora cannot run the client natively.
  • A Paperclip-scheduled adapter that reads an order from a queue and replays it as GUI actions against the VM.
  • A nightly smoke-test job running at 03:00 UTC — a canary path against an empty job queue — that screenshots a known-good flow and alerts on pixel-level drift. Without this, a Sodiwin UI update silently breaks the adapter.
  • An identity model: the agent runs as a single Sodiwin service account; run attribution for the unified ledger (Chapter 15, Principle 2) comes from the adapter layer, not from Sodiwin itself.
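The adapter and identity bullets above can be sketched roughly as follows. This is a minimal illustration, not a design the engagement produced: `Order`, `LedgerEntry`, `GuiDriver`, `StubDriver`, `run_once`, and every field name are hypothetical, and the real GUI replay against Sodiwin is replaced by a stub. What it tries to show is that attribution is recorded by the adapter, since Sodiwin only ever sees the single service account.

```python
from collections import deque
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    order_id: str        # hypothetical queue-side identifier
    requested_by: str    # who or what asked for this order upstream
    lines: tuple         # (sku, quantity) pairs


@dataclass(frozen=True)
class LedgerEntry:
    run_id: str
    order_id: str
    requested_by: str    # attribution lives here, not in Sodiwin
    status: str          # "ok" or "failed"
    detail: str = ""


class GuiDriver(Protocol):
    def enter_order(self, order: Order) -> None: ...


class StubDriver:
    """Stand-in for the GUI replay; records calls instead of clicking."""

    def __init__(self, fail_on=()):
        self.entered = []
        self.fail_on = set(fail_on)

    def enter_order(self, order):
        if order.order_id in self.fail_on:
            raise RuntimeError("field not found on screen")
        self.entered.append(order.order_id)


def run_once(run_id, queue, driver, ledger):
    """Drain the queue: one GUI replay and one ledger row per order."""
    while queue:
        order = queue.popleft()
        try:
            driver.enter_order(order)
            entry = LedgerEntry(run_id, order.order_id, order.requested_by, "ok")
        except Exception as exc:
            entry = LedgerEntry(run_id, order.order_id, order.requested_by,
                                "failed", str(exc))
        ledger.append(entry)
    return ledger
```

Note that a ledger row of "ok" only means the driver raised no error. A mis-keyed order that completes normally would still be logged as "ok", which is the failure mode discussed in the next section.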

Why it is not in scope.

Three reasons, in order of weight.

The value is speculative. Sodimo enters roughly 20 orders per day. A write-path agent that succeeds 90% of the time would get about two of those orders wrong every day, which is worse than the current process: each failure is an invisible mis-keyed order that surfaces days later in a customer complaint. Nightly smoke-tests catch adapter drift; they do not catch order-content errors. The confidence threshold for automation to be net-positive is high, and reaching it costs more engagement time than it saves.
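To put numbers on that, here is a small calculation using the figures from the paragraph above (about 20 orders per day, a 90% success rate). The 22 working days per month and the "one failure a month" target are assumptions made here for illustration, not figures from the engagement.

```python
def expected_failures(orders_per_day: float, success_rate: float, days: int = 1) -> float:
    """Expected number of failed entries over `days` days."""
    return orders_per_day * (1.0 - success_rate) * days


def required_success_rate(orders_per_day: float, days: int, max_failures: float) -> float:
    """Success rate needed to keep expected failures at or below `max_failures`."""
    return 1.0 - max_failures / (orders_per_day * days)


per_day = expected_failures(20, 0.90)              # about 2 bad orders per day
per_month = expected_failures(20, 0.90, days=22)   # about 44, assuming 22 working days
needed = required_success_rate(20, 22, 1)          # about 0.9977 for one failure a month
```

Under these assumptions the agent would need to be right on better than 99.7% of orders just to hold failures to one a month, which gives a sense of how high the bar is.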

The maintenance burden lands on nobody. Sodiwin’s UI changes without warning when its vendor pushes an update. Each change risks a click-path break. Post-handoff, Sodimo has no one to own the adapter. The engagement ends with a working system or it does not end. A write-path agent is a system that may or may not be working on any given morning, and nobody at Sodimo is positioned to check.

The static-at-handoff test fails. Chapter 15 Principle 1 asks whether a component still works if the engineer disappears tomorrow and upstream ships a breaking change. A computer-use agent fails this test by construction — upstream (Sodiwin) ships UI changes, and the only fix is hand-maintenance the team cannot do.


What a future engineer would need to revisit this.

If a future engineer decides the write-path is worth building — perhaps because volume has grown and manual entry is the bottleneck — here is the starting surface.

  • A Paperclip adapter. Paperclip (Chapter 44) already owns scheduled agent runs and the run ledger. A new adapter type sodiwin_write fits the existing pattern: queue read, action execute, ledger write.
  • A Windows runner. A dedicated Windows VM separate from Florian’s workstation. Isolating the agent from the human workflow is mandatory — a shared host means a human action and an agent action can collide mid-flow, with undefined results.
  • A nightly smoke-test at 03:00 UTC. The smoke-test runs the canary path end-to-end against an empty queue, captures screenshots, diffs them against a versioned baseline, and pages on mismatch. Without this, drift is invisible until a real order goes wrong.
  • A rollback surface in Sodiwin itself. Agent-entered orders need to be identifiable — a tag in the notes field or a dedicated sales rep code — so that an operator can query and void agent-originated rows if the adapter misbehaves.
  • Two weeks of shadow-mode. Before the adapter is authoritative, it runs in parallel with a human entering the same orders, and the two are diffed daily. Parity over two weeks is the entry bar.
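The daily shadow-mode diff could look something like this. It is a sketch under assumptions: orders are represented as plain dicts keyed by a hypothetical `order_ref` field, and how the human-entered and agent-entered records are actually extracted is left out.

```python
def diff_shadow_day(human_orders, agent_orders, key="order_ref"):
    """Compare one day's human-entered and agent-entered orders.

    Returns orders missing on either side and field-level mismatches.
    """
    human = {o[key]: o for o in human_orders}
    agent = {o[key]: o for o in agent_orders}

    report = {
        "missing_from_agent": sorted(human.keys() - agent.keys()),
        "extra_in_agent": sorted(agent.keys() - human.keys()),
        "mismatched": {},
    }
    for ref in sorted(human.keys() & agent.keys()):
        h, a = human[ref], agent[ref]
        # Collect every field (other than the key) whose values differ.
        fields = sorted(
            f for f in h.keys() | a.keys()
            if f != key and h.get(f) != a.get(f)
        )
        if fields:
            report["mismatched"][ref] = {f: (h.get(f), a.get(f)) for f in fields}
    return report


def at_parity(report):
    """True when the day shows no differences at all."""
    return not (
        report["missing_from_agent"]
        or report["extra_in_agent"]
        or report["mismatched"]
    )
```

Each mismatch is reported as a `(human_value, agent_value)` pair, so the person reviewing the day can see which side was wrong without going back to Sodiwin. The two-week entry bar would then amount to `at_parity` holding for every day in the window.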

Rough effort: two to three weeks of engineering for a solo developer who already knows the stack, plus a month of stabilization under real traffic.


Decision traceability. D-091 locks the scope: nice-to-have, Windows VM (not Fedora), not shipping this engagement. D-141 defers the cloud-vs-local model choice to a post-handoff re-evaluation. D-092, D-142, D-143 sketch the pilot ramp, the post-engagement maintainer question, and the nightly smoke-test — all deferred. Chapter 55 (decisions annex) has the full annotations.