The audit ledger: turning every AI decision into a reviewable artifact

2026-05-25 · Avery NXR

There is a quiet pattern in modern AI dev tools.

The model writes code. You read the code. You decide whether to commit it.

That sounds reasonable until you notice what's missing. You can see what the model produced, but you can't see why. You can't see which alternatives it considered. You can't see what it had to assume about your stack. You can't see whether a decision was confident or a coin flip. And six months from now, when something breaks in the auth flow, you can't reconstruct the conversation that produced it.

The model writes code, you read the code, you decide whether to commit it — but the unit of review is the diff, not the decision. That's where Avery NXR is different.

What the ledger records

Every time a generator runs in Avery NXR, a structured record is written to an audit ledger inside the project.

For every decision the model makes, the ledger captures the prompt, the generator that ran, every file touched (with line ranges), the alternatives the model weighed, the choice it landed on, and a confidence band. For a typical scaffolding session — auth, billing, dashboard, three CRUD models, a job queue — the ledger contains a few hundred entries.
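To make that list concrete, here is a minimal sketch of what one ledger entry could look like. The field names, the band labels, the file path, and the one-record-per-line layout are all assumptions made for illustration; this post does not specify the actual schema. The `rationale` field is also an assumption, standing in for the model's stated reasoning.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FileTouch:
    path: str
    start_line: int
    end_line: int


@dataclass
class LedgerEntry:
    # All field names here are hypothetical, not the shipped format.
    entry_id: int
    prompt: str
    generator: str
    files: list[FileTouch]
    alternatives: list[str]
    choice: str
    rationale: str
    confidence_band: str  # assumed labels: "confident", "marginal", "guess"


def to_json_line(entry: LedgerEntry) -> str:
    """Serialize one entry as a single line of JSON so plain grep can find it."""
    return json.dumps(asdict(entry), sort_keys=True)


entry = LedgerEntry(
    entry_id=47,
    prompt="Add email and Google sign-in",
    generator="auth",
    files=[FileTouch("app/api/auth/route.ts", 1, 42)],  # illustrative path
    alternatives=["NextAuth", "Lucia"],
    choice="NextAuth",
    rationale="Project flags Vercel as the deployment target",
    confidence_band="confident",
)

line = to_json_line(entry)
restored = json.loads(line)  # round-trips to plain dicts and lists
```

Keeping each record on one line is a design choice in this sketch, not a documented property of the ledger: it lets ordinary text tools work on the file without a JSON parser.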

You can open it in the desktop app. You can grep it from the command line. You can render it as a static HTML page and check it into the repo. It is not a black box. It is not a log file you'll never read. It is the same shape as a pull request review, except every range of lines the model touched is annotated with the model's reasoning.
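Rendering a static page from structured entries needs very little machinery. The sketch below is not the HTML renderer that ships with Avery NXR; it assumes entries are plain dictionaries with hypothetical keys (`entry_id`, `choice`, `confidence_band`, `files`, `rationale`) and turns them into a single table, escaping every value so model-written text cannot inject markup.

```python
import html


def render_ledger_html(entries):
    """Render ledger entries as one static HTML table, escaping all text."""
    header = (
        "<tr><th>Entry</th><th>Choice</th><th>Confidence</th>"
        "<th>Files</th><th>Why</th></tr>"
    )
    rows = []
    for e in entries:
        # Show each touched file as path:start-end.
        files = ", ".join(
            f"{t['path']}:{t['start_line']}-{t['end_line']}" for t in e["files"]
        )
        cells = [str(e["entry_id"]), e["choice"], e["confidence_band"], files, e["rationale"]]
        rows.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in cells) + "</tr>")
    return "<!doctype html>\n<table>\n" + "\n".join([header] + rows) + "\n</table>\n"


# Illustrative sample data; paths and rationales are made up.
entries = [
    {
        "entry_id": 47,
        "choice": "NextAuth",
        "confidence_band": "confident",
        "files": [{"path": "app/api/auth/route.ts", "start_line": 1, "end_line": 42}],
        "rationale": "Vercel is the deployment target",
    },
    {
        "entry_id": 48,
        "choice": "cursor pagination",
        "confidence_band": "guess",
        "files": [{"path": "app/models/post.ts", "start_line": 10, "end_line": 25}],
        "rationale": "Assumed tables stay < 1M rows",
    },
]

page = render_ledger_html(entries)
```

Writing `page` to a file in the repo (the file name is up to you) gives a reviewable artifact that diffs like any other checked-in text.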

Why a ledger and not a chat log

Chat logs are the obvious solution. They are also the wrong solution.

A chat log captures the conversation. The ledger captures the commitments. The difference matters when you come back six months later. A chat log says "the developer asked for auth, the model offered three approaches, here's the transcript." The ledger says "auth uses NextAuth with email + Google providers, the model picked NextAuth over Lucia because the project flagged Vercel as the deployment target, see audit entry #47 for the full comparison."

When the auth flow breaks, you don't want to re-read a conversation. You want to know what decision was made, why, and what the alternatives were — in a structured form a teammate can grep.

What this changes about reviewing AI-generated code

Most teams that adopt AI dev tools end up in one of two regimes.

In the first regime, a senior engineer reviews every diff the model produces, line by line, the way they would review a junior engineer's PR. This is safe but expensive. It eats the speed advantage of the tool.

In the second regime, the diffs are too large and too frequent to review carefully, so they get a cursory look and are merged. This is fast but unsafe. The model's mistakes ship.

The ledger introduces a third regime. The reviewer reads the ledger first. The ledger tells them which decisions were confident, which were marginal, and which were guesses. The reviewer focuses their attention on the entries the model itself flagged as uncertain. The diff is still there, but the ledger tells them where to look.
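One way to picture that third regime in code: sort the entries so the least certain decisions come first, then collect the line ranges those decisions touched. This is a sketch under assumptions, not Avery NXR's review tooling. The band labels (`guess`, `marginal`, `confident`), the dictionary keys, and the file paths are all hypothetical.

```python
from collections import defaultdict

# Assumed band labels, ordered from most to least in need of review.
REVIEW_ORDER = ["guess", "marginal", "confident"]


def review_queue(entries):
    """Order entries so the least certain decisions are read first."""
    rank = {band: i for i, band in enumerate(REVIEW_ORDER)}
    return sorted(entries, key=lambda e: (rank[e["confidence_band"]], e["entry_id"]))


def files_to_inspect(entries, bands=("guess", "marginal")):
    """Map each file path to the line ranges touched by uncertain decisions."""
    hotspots = defaultdict(list)
    for e in entries:
        if e["confidence_band"] in bands:
            for t in e["files"]:
                hotspots[t["path"]].append((t["start_line"], t["end_line"]))
    return dict(hotspots)


# Illustrative sample data.
entries = [
    {"entry_id": 12, "confidence_band": "confident",
     "files": [{"path": "db/schema.ts", "start_line": 1, "end_line": 30}]},
    {"entry_id": 47, "confidence_band": "marginal",
     "files": [{"path": "app/api/auth/route.ts", "start_line": 1, "end_line": 42}]},
    {"entry_id": 63, "confidence_band": "guess",
     "files": [{"path": "jobs/queue.ts", "start_line": 10, "end_line": 25},
               {"path": "app/api/auth/route.ts", "start_line": 50, "end_line": 60}]},
]

queue = review_queue(entries)
hotspots = files_to_inspect(entries)
```

Here the reviewer would start with entry 63, then 47, and could skim 12. The `hotspots` map tells them which parts of the diff deserve a careful read.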

It is the same move that PR templates made for human code review a decade ago — surface the decisions, not just the changes.

What this changes about trust

We have a working theory about AI tooling that goes something like this. Users trust AI tools not because the tools are correct, but because the tools are legible. A tool that is right ninety percent of the time and tells you which ten percent it is unsure about is more trustworthy than a tool that is right ninety-five percent of the time and gives you no way to tell which five percent is wrong.

The audit ledger is the legibility layer. It is not a claim that Avery NXR is correct more often than other tools. It is a commitment that whatever Avery NXR does, it will tell you what it did and why.

For solo developers, that means you can review your own AI sessions the same way you'd review a teammate's branch. For teams, it means the model's work product is auditable in the same shape as everyone else's work product. For regulated environments, it means there is a paper trail.

What's next

The first version of the ledger ships with launch. It is structured JSON plus an HTML renderer. The second version, on the roadmap for late 2026, adds the ability to ask questions of the ledger directly — "show me every decision that touched the billing flow," "show me the entries where confidence was below 70 percent," "show me what changed about our auth stack between June and August."
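Because the first version is already structured JSON, questions like these reduce to simple filters. The sketch below is not the planned query feature. It assumes each entry carries an ISO date, a numeric confidence between 0 and 1, and a list of touched files; the key names, the sample data, and the idea of matching a flow by path fragment are all hypothetical.

```python
from datetime import date


def touching(entries, fragment):
    """Entries that touched any file whose path contains the fragment."""
    return [e for e in entries if any(fragment in t["path"] for t in e["files"])]


def below_confidence(entries, threshold=0.70):
    """Entries whose numeric confidence is under the threshold."""
    return [e for e in entries if e["confidence"] < threshold]


def between(entries, start, end):
    """Entries dated within [start, end], inclusive."""
    return [e for e in entries if start <= date.fromisoformat(e["date"]) <= end]


# Illustrative sample data.
entries = [
    {"entry_id": 47, "date": "2026-06-03", "choice": "NextAuth",
     "confidence": 0.91, "files": [{"path": "app/auth/route.ts"}]},
    {"entry_id": 102, "date": "2026-07-14", "choice": "webhook-driven invoices",
     "confidence": 0.64, "files": [{"path": "app/billing/invoices.ts"}]},
    {"entry_id": 131, "date": "2026-07-29", "choice": "add magic-link provider",
     "confidence": 0.58, "files": [{"path": "app/auth/providers.ts"}]},
    {"entry_id": 180, "date": "2026-09-02", "choice": "retry failed charges",
     "confidence": 0.83, "files": [{"path": "app/billing/retry.ts"}]},
]

billing = touching(entries, "billing")
uncertain = below_confidence(entries)
auth_summer = between(touching(entries, "auth"), date(2026, 6, 1), date(2026, 8, 31))
```

The three results correspond to the three example questions: everything that touched billing, everything under 70 percent confidence, and the auth decisions made between June and August.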

The bet is that the artifact will outlive the tool. Even if a team migrates off Avery NXR years from now, the ledger remains a useful record of how the application came to look the way it does. The model wrote the code, but the ledger is what the team can hand to the next developer.