Avery.Software — Native Execution Runtime

Banking back-office operations: AI on the most regulated workflows in finance

2026-06-01 · Avery NXR

Banking back-office operations are where the regulated machinery of finance actually runs. KYC (know your customer) onboarding. AML (anti-money laundering) monitoring. Trade reconciliation. Settlement processing. Customer due diligence updates. Sanctions screening. Document verification. Account opening. Wire transfer review.

Every one of these workflows has been thoroughly AI-augmented in the past three years. The volume at any major bank is enormous. The data is among the most heavily regulated in finance. And the case for moving inference onto infrastructure the bank controls is one of the cleanest in any operational category.

The work

Banking back-office AI workloads include several distinct categories.

KYC and customer onboarding: collecting and verifying customer information, classifying customer risk, generating customer profiles, drafting onboarding decisions. Volume scales with new account flow.

AML monitoring: analyzing transaction patterns for suspicious activity, drafting Suspicious Activity Reports (SARs) when warranted, escalating cases to investigators, maintaining the audit trail required by regulators.

Trade reconciliation and settlement: matching trades across counterparties, identifying breaks, drafting break investigations, producing settlement documentation.

Customer due diligence updates: periodically re-evaluating existing customers, drafting updated risk assessments, identifying changes that require enhanced due diligence.

Sanctions screening: comparing transactions and counterparties against sanctions lists, flagging matches, drafting investigation notes, producing the documentation regulators expect.

Customer service back-office: processing inbound requests that require investigation — disputes, fraud claims, account modifications, complex inquiries that can't be resolved at the first line.

Regulatory reporting: drafting and submitting the reports that regulators expect at specified cadences, including Call Reports, FR Y-9 filings, FATCA and CRS reporting, and SARs.
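To make one of these categories concrete, here is a minimal sketch of the name-matching step in sanctions screening. It uses Python's standard-library `difflib` for fuzzy comparison. The watchlist entries, the 0.85 threshold, and the `screen_name` function are all illustrative assumptions, not a production screening method or any vendor's API.

```python
from difflib import SequenceMatcher

# Hypothetical watchlist entries, for illustration only.
WATCHLIST = ["Ivan Petrov", "Acme Trading LLC", "Maria Gonzales"]


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def screen_name(name: str, watchlist=WATCHLIST, threshold: float = 0.85):
    """Return (entry, score) pairs whose similarity meets the threshold, best first."""
    candidate = normalize(name)
    hits = []
    for entry in watchlist:
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, candidate, normalize(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 3)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A real screening pipeline would add transliteration, aliases, dates of birth, and tuned thresholds. The point of the sketch is only that each flagged match comes with a score an investigator can review.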

The math

A representative midsize bank (say, a regional with $20 billion in assets) generates a substantial AI workload across these functions.

KYC and account opening: tens of thousands of new accounts per month, each requiring multiple AI operations. AML monitoring: millions of transactions per day flowing through monitoring systems, with the AI-augmented investigation layer touching tens of thousands per month. Sanctions screening: every payment, in real time. Trade ops: tens of thousands of trades per day at any institution with capital markets activity.

Aggregate AI workload at a midsize bank: in the high tens of millions to low hundreds of millions of tokens per month. At frontier pricing, the bill is in the low to mid six figures per year.
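An aggregate figure like this can be built bottom-up from per-workflow counts. The sketch below shows the arithmetic only. The `Workflow` fields and every number in `EXAMPLE` are placeholder assumptions, not measurements from any bank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    items_per_month: int   # e.g. new accounts or investigated cases
    ai_ops_per_item: int   # model calls made per item
    tokens_per_op: int     # prompt plus completion tokens per call


def monthly_tokens(workflows) -> int:
    """Sum items x operations x tokens across all workflows."""
    return sum(
        w.items_per_month * w.ai_ops_per_item * w.tokens_per_op
        for w in workflows
    )


# Placeholder inputs, chosen only to illustrate the calculation.
EXAMPLE = [
    Workflow("kyc_onboarding", 20_000, 3, 800),
    Workflow("aml_investigation", 20_000, 1, 2_000),
]

print(f"{monthly_tokens(EXAMPLE):,} tokens per month")
```

The total is most sensitive to tokens per operation, which varies widely between a short classification call and a long drafted investigation note.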

For large multinational banks, the numbers scale dramatically. Major global banks run this workload at a scale orders of magnitude larger, with bills approaching or exceeding seven figures per year. For the largest banks, especially those with substantial wealth management and capital markets operations, the AI back-office bill becomes a material operating expense.

Why banking is structurally a local-SLM case

The properties that favor local inference on a small language model (SLM) are all present, with several at the extreme of any operational category.

The work is narrow within the institution. Each bank has its specific products, customer base, transaction patterns, and risk frameworks. A model fine-tuned on the bank's own corpus outperforms a general model on the bank's specific work.

The work is repetitive. KYC profiles follow predictable structures. AML investigations follow predictable patterns. Reconciliation breaks cluster into a small number of categories. Specialization compounds.
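The claim that reconciliation breaks fall into a small number of categories can be illustrated with a toy matcher. This is a sketch under stated assumptions: trades are keyed by a shared trade ID, only quantity and price are compared, and the category labels are hypothetical.

```python
from collections import Counter


def classify_breaks(ours: dict, theirs: dict) -> dict:
    """Compare two trade books keyed by trade ID and label each mismatch."""
    breaks = {}
    for trade_id in sorted(set(ours) | set(theirs)):
        a, b = ours.get(trade_id), theirs.get(trade_id)
        if a is None:
            breaks[trade_id] = "missing_on_our_side"
        elif b is None:
            breaks[trade_id] = "missing_at_counterparty"
        elif a["quantity"] != b["quantity"]:
            breaks[trade_id] = "quantity_mismatch"
        elif a["price"] != b["price"]:
            breaks[trade_id] = "price_mismatch"
    return breaks


# Prices are kept as strings here to avoid float comparison issues.
ours = {
    "T1": {"quantity": 100, "price": "10.50"},
    "T2": {"quantity": 50, "price": "99.00"},
    "T3": {"quantity": 10, "price": "5.00"},
}
theirs = {
    "T1": {"quantity": 100, "price": "10.50"},
    "T2": {"quantity": 55, "price": "99.00"},
    "T4": {"quantity": 7, "price": "1.25"},
}

breaks = classify_breaks(ours, theirs)
category_counts = Counter(breaks.values())
```

Counting breaks by category, as `category_counts` does, is the kind of structure a specialized model can exploit: each category tends to call for the same investigation steps.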

The volume is enormous and grows with the bank's business activity.

The privacy and regulatory framework is the strictest of any operational domain. In most jurisdictions banking is governed by several frameworks at once: banking regulators (OCC, Fed, and FDIC in the US, with equivalents globally), financial crime authorities and standard-setters (FinCEN, FATF), data protection frameworks (GLBA, GDPR), and sectoral frameworks for specific products. Sending banking data to third-party cloud LLMs creates a compliance exposure that the bank's regulators and chief compliance officer take very seriously.

The audit trail is mandatory. Banking operations are subject to constant audit — internal audit, external audit, regulatory examination, third-party validation. The audit trail of how AI makes decisions becomes part of the evidence record for every audit.

Latency matters in real-time workflows. Sanctions screening, fraud detection, and customer-facing transactions need sub-second responses, and a network round trip to a cloud LLM often does not fit that budget.

What changes with local inference

A banking AI workflow on a local SLM looks like this.

A model is fine-tuned on the bank's operational corpus — customer profiles, transaction histories, AML cases, regulatory submissions, internal investigation files. The fine-tuning happens in a compliance-controlled environment that respects the data sensitivity.

The model deploys on infrastructure the bank controls: on-premises in its own data centers, or in regulated private clouds that meet bank-specific compliance frameworks. The deployment is documented, validated, and audited.

Banking operations flow through the inference pipeline within the bank's controlled environment. KYC profiles, AML cases, reconciliation breaks, sanctions matches, regulatory drafts — all produced by the local model.

The cost flips from per-operation to fixed. Within the capacity of the deployed hardware, account growth, transaction growth, and business activity can scale without the AI bill scaling with them.
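The shift from metered to fixed cost implies a break-even volume. A minimal sketch of that arithmetic follows. The function names and parameters are assumptions for illustration, and the fixed monthly figure would need to include hardware, operations, and validation costs that this sketch treats as a single number.

```python
def api_cost_per_month(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """Metered cost: grows linearly with token volume."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


def break_even_tokens(fixed_usd_per_month: float, usd_per_million_tokens: float) -> float:
    """Monthly token volume above which the fixed-cost deployment is cheaper."""
    if usd_per_million_tokens <= 0:
        raise ValueError("usd_per_million_tokens must be positive")
    return fixed_usd_per_month / usd_per_million_tokens * 1_000_000
```

Below the break-even volume the metered API is cheaper on cost alone. Above it, each additional token on the local deployment is effectively free until the hardware reaches capacity.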

The regulatory conversation gets easier. The bank can demonstrate to examiners and to its own compliance functions that customer and transaction data stays inside the bank's controlled environment, that the model is governed by the bank's own controls, and that the audit trail is local and reviewable.

What the regulator is watching

Banking regulators are actively focused on AI use in regulated workflows. The questions being asked in supervisory examinations and in industry guidance map cleanly onto the cloud-vs-local distinction.

How is model risk being managed? Banks operate under model risk management frameworks (SR 11-7 in the US, equivalent guidance elsewhere) that require specific controls over how models are developed, validated, monitored, and governed. Local deployment supports these controls in ways cloud LLM use often does not.

How is customer data being protected? GLBA in the US and equivalent frameworks elsewhere create specific obligations about customer data handling. Cloud LLM use raises questions about third-party data sharing, retention, and access that local inference largely avoids.

How is the AI's decision-making auditable? Regulators want to see the evidence of how AI is being used in regulated decisions. Local models with structured audit logs produce evidence that maps onto what examiners want to see.
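One way to picture a structured audit log is an append-only list in which each record commits to the hash of the one before it, so later edits are detectable. The sketch below is an illustration under assumptions: the field names are hypothetical, and nothing here is a format that regulators or SR 11-7 prescribe.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, model_version: str, case_id: str, decision: str) -> dict:
    """Append a record that commits to the previous record's hash."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "case_id": case_id,
        "decision": decision,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record["hash"] = _digest(record)  # computed before the hash field exists
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash and check each link to detect in-place edits."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True
```

An in-memory list is only for illustration. In practice the log would be written to storage the bank controls, and the latest hash would be anchored somewhere independent so that truncating the end of the log is also detectable.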

Where the cloud LLM is still defensible

A narrow set of cases.

For workflows operating on fully aggregated or de-identified data without customer-level information. Some risk analytics and reporting can be designed this way.

For internal training and knowledge management content that doesn't touch customer or transaction data.

For specific cloud LLM services that have explicit regulatory authorization and have been validated for specific banking use cases (this exists in narrow forms today and may expand over time).

For the bulk of banking back-office work (KYC, AML, reconciliation, sanctions, regulatory reporting, customer due diligence), a cloud LLM is structurally difficult to justify, and the local-SLM architecture is the only one that fits the regulatory environment cleanly.
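The split described above can be expressed as a default-deny routing rule: work goes to the local model unless it clearly falls into one of the narrow cloud-eligible cases. The data classes, the `CLOUD_ELIGIBLE` set, and the `route` function below are hypothetical, and a real policy would come from the bank's compliance function rather than from code like this.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"        # training and knowledge content
    AGGREGATED = "aggregated"    # fully aggregated or de-identified
    CUSTOMER = "customer"
    TRANSACTION = "transaction"


# Assumed policy: only these classes may leave the bank's environment by default.
CLOUD_ELIGIBLE = {DataClass.PUBLIC, DataClass.INTERNAL, DataClass.AGGREGATED}


def route(data_classes: set, validated_for_use_case: bool = False) -> str:
    """Return "cloud" only for eligible data or an explicitly validated service."""
    if not data_classes:
        return "local"  # unclassified work fails closed
    if data_classes <= CLOUD_ELIGIBLE or validated_for_use_case:
        return "cloud"
    return "local"
```

For example, `route({DataClass.AGGREGATED})` returns `"cloud"`, while adding `DataClass.CUSTOMER` to the same set sends the work to the local model unless the service has been validated for that specific use case.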

The pattern, in banking

Avery NXR is not a banking tool; it scaffolds Next.js applications. But the same architectural pattern repeats here, with the regulatory and audit dimensions making the case unusually strong.

Banking back-office AI is a narrow, repetitive, extreme-volume, extreme-regulation, extreme-audit workload. The cost case is real. The privacy case is mandated. The regulatory case is intensifying. The audit case is mandatory.

The banking AI vendors that build on local infrastructure — with appropriate fine-tuning across banking functions, deployment models that fit existing bank compliance frameworks, and evidence packages that satisfy banking regulators — will own the institutional banking AI market. The cloud-LLM-default products will hold some segments but face structural friction with bank regulation as supervisory expectations tighten.

The pattern continues. Banking is one of the workflows where the architectural shift to local inference is being driven primarily by the regulatory framework, reinforced by the cost and audit dimensions. Banks that move first will be ahead on regulatory standing and on operational cost simultaneously.