The architect loop — and why foundry-ai never pays for an LLM call

M2 shipped the MCP server, the sentinel, and spec audits. The trick, and the design tenet that keeps foundry-ai self-hosted and free at runtime, is that the “architect” lives inside your existing agent, not inside foundry-ai.

Every other governance layer I’ve looked at burns tokens. A request comes in, the layer fires its own LLM call to “decide”, costs you per audit, and locks you into whichever model the layer chose.

foundry-ai doesn’t do that. There is no paid LLM call in foundry-ai. Ever.

Where the architect actually lives

The architect is a prompt template + an MCP tool surface + a contract. The reasoning happens inside the agent you already pay for — Claude Code, Cursor, anything MCP-aware.

When you say:

“Act as the foundry-ai architect and audit spec #3.”

…your agent loads the audit_spec prompt template via MCP, calls spec.get, master_context.get, memory.search, and sentinel.sanitize to assemble the context, reasons over it on its own subscription, and writes the verdict back via spec.set_audit.

foundry-ai persists the verdict. It enforces the chain. It surfaces the result in fnd score. It never pays a cent.
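
The flow above can be sketched in a few lines of Python. This is an illustration, not foundry-ai's code: the tool names come from the post, but the call_tool signature, the argument names, the result shapes, and the in-memory fake server are all assumptions made up for the example. The reason callback stands in for the agent's own model, and loading the audit_spec prompt template is left out.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical shape of an MCP tool call: (tool name, arguments) -> result.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_audit(call_tool: ToolCall, reason: Callable[[str], str], spec_id: int) -> str:
    """Assemble context through the tool surface, reason, write the verdict back."""
    spec = call_tool("spec.get", {"id": spec_id})
    intent = call_tool("master_context.get", {})
    related = call_tool("memory.search", {"query": spec["title"]})

    # Everything that will enter the prompt passes through the sentinel first.
    raw = "\n\n".join(
        [intent["text"], spec["body"], *[hit["text"] for hit in related["hits"]]]
    )
    clean = call_tool("sentinel.sanitize", {"text": raw})["text"]

    # The only model call happens here, inside the agent, on its own subscription.
    verdict = reason(clean)

    call_tool("spec.set_audit", {"id": spec_id, "verdict": verdict})
    return verdict


def make_fake_server() -> Tuple[ToolCall, Dict[str, Any]]:
    """In-memory stand-in for the MCP server (assumed result shapes)."""
    store: Dict[str, Any] = {"audits": {}, "calls": []}

    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        store["calls"].append(name)
        if name == "spec.get":
            return {"title": "login flow", "body": "Users sign in with a token."}
        if name == "master_context.get":
            return {"text": "Ship a secure login."}
        if name == "memory.search":
            return {"hits": [{"text": "Earlier note about tokens."}]}
        if name == "sentinel.sanitize":
            return {"text": arguments["text"]}
        if name == "spec.set_audit":
            store["audits"][arguments["id"]] = arguments["verdict"]
            return {"ok": True}
        raise KeyError(name)

    return call_tool, store
```

The point of the shape: run_audit never imports a model client. The server side only answers tool calls and stores what it is handed.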

What’s in the MCP surface

Locked at M2 and stable since:

memory.search, memory.recent — FTS5 over captured events.
master_context.get — the project’s stated intent.
spec.list, spec.get, spec.create, spec.set_status, spec.set_audit — the spec lifecycle, callable from the agent.
sentinel.sanitize — local secret detection, redacts AWS / GitHub / OpenAI keys before they enter a prompt.
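
To make the sentinel idea concrete, here is a minimal sketch of local, regex-based redaction. The patterns are assumptions based on commonly seen key prefixes (AKIA for AWS access key IDs, ghp_ and its siblings for GitHub tokens, sk- for OpenAI keys); they are not foundry-ai's actual rules, and the sanitize function and placeholder format are hypothetical.

```python
import re

# Assumed patterns, keyed by a label used in the redaction placeholder.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}\b"),
}


def sanitize(text: str) -> str:
    """Replace anything that looks like a known key with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Because this runs locally and before prompt assembly, a matched key never reaches the agent's context. Prefix patterns like these will miss secrets without a recognizable shape, so treat the sketch as a floor, not a guarantee.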

Why this matters for you

No added cost. foundry-ai never adds an LLM bill of its own, so as you grow, your token spend stays with the agent subscription you already pay for.
No model lock-in. Switch from Claude 4.6 to 4.7 to whatever comes next — foundry-ai doesn’t care.
No vendor in the audit loop. The verdict is your agent’s, persisted in your store. Nobody else sees it.

The next post will cover M3, the daemon, and how the fast-path hook queue takes a fnd hook invocation from ~50 ms down to a few milliseconds.