Avery.Software — Native Execution Runtime

Medical imaging and radiology: AI on the most sensitive pixels in healthcare

2026-05-29 · Avery NXR

Radiology has been transformed by AI more thoroughly than almost any other medical specialty. The reasons are clear in hindsight. Medical images are large, structured, and amenable to machine analysis. Radiology workflows are repeatable. The data is abundant. And the impact of better-or-faster reads is measured directly in patient outcomes.

The current state of practice in most modern radiology departments: every image goes through AI before a radiologist sees it. Classification of normal vs. abnormal. Anomaly detection with bounding boxes. Comparison against prior studies. Report drafting based on the image and the patient's chart. Workflow prioritization by suspected severity.

The bill for the language-model layer on top of this (the report drafting, the chart summarization, the patient communication) is real and growing, and that layer operates on data subject to one of the most stringent privacy frameworks in any operational domain.

The math

A representative mid-sized hospital system performs hundreds of thousands of imaging studies per year. A large multi-hospital system performs several million. Each study produces an image (or set of images), a report, a chart summary, and often patient communication that needs to be drafted.

The language-model workload on top of the imaging pipeline includes: drafting the radiology report from structured findings, summarizing the patient's relevant history from the chart, comparing the current study to priors, generating patient-facing communications, and producing the structured data that flows into the EHR and the billing systems.

A reasonable aggregate token budget per study is fifteen thousand input tokens and twelve hundred output tokens. At frontier pricing, that comes to about $0.063 per study.

A mid-sized hospital system performing five hundred thousand studies per year at $0.063 each pays about $31,500 per year, which is modest. A large health system performing five million studies per year pays around $315,000 per year. For very large hospital networks and national radiology services, the bill climbs into seven figures per year.
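
The arithmetic above can be reproduced with a short sketch. The per-token rates below are an assumption, not a quoted price list: $3 per million input tokens and $15 per million output tokens is simply one pair of rates that yields the $0.063 figure. The names `TokenPricing`, `cost_per_study`, and `annual_cost` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """ASSUMED prices in US dollars per million tokens (not a vendor quote)."""
    input_per_million: float = 3.00
    output_per_million: float = 15.00


def cost_per_study(input_tokens: int, output_tokens: int, pricing: TokenPricing) -> float:
    """Language-model cost of one study under the assumed pricing."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def annual_cost(studies_per_year: int, per_study: float) -> float:
    """Annual bill for a given study volume."""
    return studies_per_year * per_study


pricing = TokenPricing()
per_study = cost_per_study(15_000, 1_200, pricing)  # the token budget from the text

for label, volume in [("mid-sized", 500_000), ("large", 5_000_000)]:
    print(f"{label}: ${annual_cost(volume, per_study):,.0f} per year")
# mid-sized: $31,500 per year
# large: $315,000 per year
```

Changing the two rates or the token budget shows how sensitive the annual figure is to each input.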

These numbers exclude the upstream imaging AI itself (the classification and detection models, which are usually domain-specific computer vision rather than language models), the PACS, and the dictation infrastructure. The language-model layer on top is the line item we're examining.

Why radiology is structurally a local-SLM case

The properties that favor local inference on a small language model (SLM) are all present, and several are stronger here than in any other workload we've covered.

The work is narrow within a department. A model fine-tuned on a hospital's own radiology reports, by specialty, will outperform a general medical model on that hospital's reports.

The work is repetitive. Imaging findings cluster into a finite set of patterns. Report structures follow predictable templates. Specialization compounds across hundreds of thousands of studies.

The privacy story is HIPAA-mandated and structurally restrictive. Medical images contain not just diagnostic information but protected health information (PHI) in metadata, often in burned-in annotations, and increasingly in the images themselves. The business associate agreement (BAA) constraints on cloud LLM use in healthcare apply with particular force here.

The latency story matters acutely. In emergency radiology workflows, the AI summary or comparison needs to land within seconds — often while the patient is still in the imaging suite or being transported.

The audit trail matters for both quality and malpractice. Radiology decisions are subject to retrospective review, internal QA, and external litigation. A local model that writes structured logs of what it considered produces an audit trail useful in all three contexts.
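
As a rough illustration of what such a structured log could look like, here is a minimal sketch that appends one JSON line per draft. The field names, the choice to store a hash of the draft rather than its text, and the helper names `audit_record` and `append_record` are all assumptions made for the example, not a description of any particular product.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(study_id: str, model_version: str, inputs: list, draft_text: str) -> dict:
    """Build one audit entry describing what the model considered for a draft."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "study_id": study_id,
        "model_version": model_version,
        # Names of the inputs the model was given, not their contents.
        "inputs_considered": sorted(inputs),
        # Hash only, so the log itself does not duplicate report text (assumed policy).
        "draft_sha256": hashlib.sha256(draft_text.encode("utf-8")).hexdigest(),
    }


def append_record(path: str, record: dict) -> None:
    """Append the record as a single JSON line to a local log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, sort_keys=True) + "\n")


record = audit_record(
    study_id="study-001",          # hypothetical identifier
    model_version="chest-ft-v1",   # hypothetical model name
    inputs=["structured_findings", "prior_report"],
    draft_text="No acute findings.",
)
print(json.dumps(record, indent=2))
```

A line-per-record format keeps the log append-only and easy to filter by study or model version during QA review.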

The integration realities

Radiology has unusual deployment realities compared to other healthcare AI workloads.

PACS deployments are typically on-premises or in a tightly controlled private cloud. Adding AI capabilities means working within an architecture that is already local-first by default. Cloud-LLM integrations have to bridge through this architecture, which often creates friction, latency, and compliance review work.

The radiology workflow is already optimized for fast turnaround. A radiologist reading studies in a tertiary care center reads hundreds per shift. The AI layer needs to keep up; anything that adds latency to the read changes the workflow shape.

The integration with the broader EHR is well-defined but rigid. Radiology departments have specific data exchange standards (HL7, FHIR, DICOM) that the AI layer has to respect.
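
To make "respecting the standards" concrete, the sketch below wraps a model-drafted impression in a FHIR `DiagnosticReport` resource with status `preliminary`, using only core fields (`resourceType`, `status`, `code`, `subject`, `conclusion`) as named in FHIR R4. The function name and the sample values are hypothetical, and a real integration would follow the institution's own FHIR profiles and coding systems rather than free-text codes.

```python
import json


def draft_diagnostic_report(patient_ref: str, study_description: str, impression: str) -> dict:
    """Wrap a drafted impression in a minimal FHIR DiagnosticReport (R4 field names)."""
    return {
        "resourceType": "DiagnosticReport",
        # "preliminary" marks the report as unsigned; the radiologist's sign-off
        # would move it to "final".
        "status": "preliminary",
        # Free-text code only; a coded value would come from the site's terminology.
        "code": {"text": study_description},
        "subject": {"reference": patient_ref},
        "conclusion": impression,
    }


report = draft_diagnostic_report(
    patient_ref="Patient/example",  # hypothetical reference
    study_description="Chest X-ray, 2 views",
    impression="No acute cardiopulmonary findings.",
)
print(json.dumps(report, indent=2))
```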

These realities tilt the architecture toward local inference more strongly than in some other healthcare workloads. The radiology environment is already structured to keep data local; adding a local SLM to that environment is a natural extension, while adding cloud LLM integration is structurally awkward.

What changes with local inference

A radiology AI workflow on a local SLM looks like this.

A model is fine-tuned on the department's report corpus, by specialty (chest radiology, neuroradiology, musculoskeletal, breast imaging, and so on). The fine-tune captures the specific language conventions of each subspecialty.
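
One way to picture per-subspecialty fine-tunes is a small registry that routes each study to the matching model. This is a sketch under assumptions: the model names are made up, and `make_stub` stands in for whatever call would actually run a local fine-tuned model.

```python
from typing import Callable, Dict

DraftFn = Callable[[str], str]


def make_stub(model_name: str) -> DraftFn:
    """Stand-in for a local fine-tuned model; a real one would run inference."""
    def draft(findings: str) -> str:
        return f"[{model_name}] DRAFT: {findings}"
    return draft


# Hypothetical registry: one fine-tune per subspecialty named in the text.
MODELS: Dict[str, DraftFn] = {
    "chest": make_stub("chest-ft"),
    "neuro": make_stub("neuro-ft"),
    "msk": make_stub("msk-ft"),
    "breast": make_stub("breast-ft"),
}


def draft_report(subspecialty: str, findings: str) -> str:
    """Route structured findings to the fine-tune for the given subspecialty."""
    try:
        model = MODELS[subspecialty]
    except KeyError:
        raise ValueError(f"no fine-tuned model registered for {subspecialty!r}") from None
    return model(findings)


print(draft_report("chest", "No focal consolidation."))
# [chest-ft] DRAFT: No focal consolidation.
```

Failing loudly on an unregistered subspecialty, rather than falling back to a general model, keeps it obvious which fine-tune produced each draft.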

The model runs on infrastructure the radiology department controls — typically integrated with the PACS, often on the same servers that run the imaging AI. The deployment is documented and meets the HIPAA requirements of the institution.

Studies flow through the imaging pipeline. The local SLM produces draft reports, history summaries, comparison narratives, and patient communications. The radiologist reads, edits, and signs. Everything happens inside the institution's security boundary.

The cost structure flips from a per-study charge to a largely fixed infrastructure cost, so study volume can grow without the bill spiking.
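
A simple break-even comparison shows where a fixed local cost overtakes a per-study cloud charge. The $0.063 per-study figure comes from the estimate earlier in this post; the $150,000 annual local cost is a purely hypothetical placeholder, and the sketch treats the marginal cost of local inference as zero, which is a simplification.

```python
CLOUD_COST_PER_STUDY = 0.063         # per-study estimate from earlier in the post
LOCAL_FIXED_ANNUAL_COST = 150_000.0  # HYPOTHETICAL placeholder, not a measured figure


def break_even_studies(fixed_annual_cost: float, cloud_cost_per_study: float) -> float:
    """Annual study volume at which the cloud bill equals the fixed local cost."""
    if cloud_cost_per_study <= 0:
        raise ValueError("cloud_cost_per_study must be positive")
    return fixed_annual_cost / cloud_cost_per_study


def cheaper_option(studies_per_year: int, fixed_annual_cost: float,
                   cloud_cost_per_study: float) -> str:
    """Return 'local' or 'cloud', whichever is cheaper at this volume."""
    cloud_bill = studies_per_year * cloud_cost_per_study
    return "local" if fixed_annual_cost < cloud_bill else "cloud"


volume = break_even_studies(LOCAL_FIXED_ANNUAL_COST, CLOUD_COST_PER_STUDY)
print(f"break-even at about {volume:,.0f} studies per year")

for studies in (500_000, 5_000_000):
    choice = cheaper_option(studies, LOCAL_FIXED_ANNUAL_COST, CLOUD_COST_PER_STUDY)
    print(f"{studies:,} studies/year: {choice} is cheaper on cost alone")
```

Under these placeholder numbers the cost argument only favors local inference at large volumes; for smaller systems the privacy, latency, and integration arguments carry more of the weight.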

The audit trail is local, structured, and reviewable in the same way the rest of the radiology QA process is.

Where the cloud LLM is still relevant

A few cases.

For research workflows operating on de-identified images and reports. Some radiology research can be designed to remove PHI before any data leaves the controlled environment.
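
As a minimal sketch of removing PHI from metadata before export, the function below drops a handful of identifying DICOM attributes from a header represented as a plain dictionary keyed by DICOM keyword. This is an illustration only: the keyword list is deliberately short, the dictionary representation is an assumption (no DICOM library is used), and it does nothing about burned-in annotations or identifying pixel data, so it is not a complete de-identification procedure.

```python
# A small, NON-EXHAUSTIVE set of identifying DICOM attribute keywords.
PHI_KEYWORDS = frozenset({
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "AccessionNumber",
    "InstitutionName",
    "ReferringPhysicianName",
})


def strip_phi(metadata: dict) -> dict:
    """Return a copy of the metadata without the listed identifying attributes."""
    return {key: value for key, value in metadata.items() if key not in PHI_KEYWORDS}


header = {
    "PatientName": "DOE^JANE",   # made-up sample values
    "PatientID": "12345",
    "Modality": "CR",
    "BodyPartExamined": "CHEST",
}
print(strip_phi(header))
# {'Modality': 'CR', 'BodyPartExamined': 'CHEST'}
```

Returning a copy leaves the original header untouched inside the controlled environment, and only the stripped version is handed to the export step.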

For specialized analytical tasks that draw on broader medical knowledge than the institution's own corpus contains — say, rare disease pattern recognition that benefits from the broader training of a frontier model.

For pilot deployments and early validation work where the volume doesn't justify the infrastructure investment.

For routine high-volume radiology workflows, the cloud LLM is hard to reconcile with the privacy, latency, and integration constraints described above, and the local SLM is the architecture that fits.

The pattern, in healthcare's most AI-mature specialty

Avery NXR is a Next.js scaffolding tool. It is not a radiology tool. But the architectural pattern behind it repeats here, and the privacy and integration realities make the case unusually strong.

Radiology AI is a narrow, repetitive, high-volume, HIPAA-protected, latency-critical, audit-relevant workload. The economics that favor a specialized local model for code scaffolding favor a specialized local model for radiology reports. The privacy framework forecloses most cloud LLM architectures. The integration with existing PACS-based infrastructure favors local deployment.

The radiology AI vendors that build on local infrastructure — with appropriate fine-tuning by subspecialty, PACS-native deployment, and HIPAA evidence packages — will own the institutional radiology market. The cloud-LLM-default products are operating against the grain of how radiology actually works at scale.

The pattern continues. Radiology is one of the workflows where every dimension of the local-SLM argument is at maximum strength simultaneously — cost, privacy, latency, audit, and integration. We expect the architectural shift here to be relatively complete within the next two to three years.