Skip to content
All white papers
White paper The Agentic CLI Architecture — Part 2 of 2

The Agentic CLI, Tomorrow: An Ideal-Design Synthesis

A synthesis of what the next generation of agentic CLIs should look like — grounded in current computer science and constrained to what a competent team could ship in six to twelve months.

17 May 2026 Toronto v2 10 min read
Agentic AI System design Computer science
On this page

This is Part 2 of the Agentic CLI Architecture series — the synthesis. Part 1 catalogued how Claude CLI, OpenAI Codex, and Gemini CLI are built today; this brief distils what an ideal design would look like if you began the work fresh tomorrow. The synthesis is opinionated by design, grounded in current computer science, and constrained to what a competent team could ship in six to twelve months — not a research agenda.

The pattern across the eighteen dimensions of Part 1 is consistent. Each tool has one or two genuinely forward-leaning ideas. None of them has all the right ideas. The work of this brief is to draw a coherent design from the union.

Two principles set the frame.

11
Design recommendations
Drawn from the 18-dimension comparison
4
Categories
Model interaction · Execution · State · Governance
~25
CS underpinnings cited
Merkle, capabilities, microVMs, CQRS, ADTs…
6-12mo
Implementation horizon
A competent team — not a research agenda
The synthesis at a glance — eleven recommendations across the four architectural categories.

A · Model interaction

Caching

Context-window management

Model routing

B · Execution & safety

Tool dispatch & concurrency

Trust & permission

Sandboxing

Sub-agent verification

Error handling

C · State & extensibility

State persistence

Plugins & extensions

IDE integration

Terminal UI

D · Ops & governance

Usage & cost tracking

Telemetry

The concept cloud

The recurring computer-science underpinnings — what the synthesis is built on. None of these are exotic; all have peer-reviewed literature or production track records.

DomainRecurring primitives
Memory and cachingMerkle DAGs · content-addressed storage · persistent data structures (Clojure, Git tree objects) · W-TinyLFU / 2Q eviction · Bloom-filter cache state
Concurrency and safetyStructured concurrency (Trio nurseries, async-scope) · effect systems (Koka, Eff) · lattice-based parallelism · provable data-race freedom · ownership types
Capabilities and trustObject capabilities (Mark Miller) · POLA · KeyKOS / EROS / Genode · capability attenuation · no ambient authority
IsolationFirecracker microVMs · Cloud Hypervisor · gVisor user-space kernel · seccomp-bpf + Landlock · unikernels (MirageOS) · immutable-FS layers
State and replayEvent sourcing · CQRS (Greg Young) · append-only logs (Kafka) · content-addressed storage · Merkle history · deterministic replay
VerificationProperty-based testing (QuickCheck, Hypothesis) · proof-carrying code (Necula) · contract programming (Eiffel) · differential testing
Error handlingAlgebraic data types · total error handling · RFC 9457 problem-details · let-it-crash (Erlang/OTP) · supervisor trees
UIElm Architecture · immediate-mode UI (Muratori) · pure-function rendering · property-based UI testing · accessibility as separate output
Decision policyContextual bandits (LinUCB) · Thompson sampling · cost-aware regret bounds (Cesa-Bianchi) · online learning
DistributionSigstore / cosign · WIT (Interface Types) · WASM Component Model · reproducible builds (Bazel, Nix) · content-addressed registries

What this synthesis is, and isn’t

It is a coherent design for the next generation of agentic CLIs — built on the working ideas already shipping somewhere across Claude, Codex, and Gemini, with the structural primitives that each is missing filled in from established CS.

It is not a research agenda. Each recommendation is implementable by a competent team in six to twelve months, with the closest-in-practice precedent listed so an engineer can start from running code rather than from a paper.

It is also not a critique. The three tools surveyed are each defensible bets given their parent organisation’s constraints. The synthesis is what falls out if you treat all three as a single distributed system and let the bets argue.

The point of writing it down is that there is a real opportunity here — not for a marginally better CLI but for a clean foundation an industry can converge on. The work is more boring than it sounds: it is mostly content-addressing things that aren’t, typing things that aren’t, and moving trust boundaries down one level. None of those is a research problem. All of them are engineering problems.

That foundation is what Pintle is building toward. If the picture in this brief maps to a problem on your desk, reach out.

Bring this rigor to your own AI controls.

If this series maps to a problem on your desk, a short call is the fastest way to compare notes.