Manufacturing quality control and work instructions: AI on the factory floor
Avery NXR
Manufacturing has adopted AI more cautiously than most other industries, but the adoption is real, and it looks different from AI deployments in office workflows. The work happens in factories. The data is operational. The latency requirements are tight. The network is often unreliable. The privacy constraints concern competitive operational intelligence, not personal information.
For all these reasons, manufacturing is one of the categories where the local-SLM case is unusually clean. The cloud LLM is, in many factory contexts, simply the wrong architecture — not because of cost or privacy in the standard senses, but because the operational realities of a factory don't match the cloud-first deployment assumptions.
The work
Manufacturing AI workflows include several distinct categories.
Quality control: classifying defects from images or sensor data. Identifying root causes in production deviations. Generating quality reports for downstream supply chain customers.
Work instruction generation: producing assembly instructions, standard operating procedures, and training documentation for workers. Translating these into the languages of the workforce. Personalizing them for skill level.
Maintenance log analysis: reading work orders, failure reports, and maintenance histories. Identifying patterns in equipment failures. Suggesting preventive maintenance schedules.
Supplier quality: analyzing supplier quality reports, certificates of conformance, and inspection records. Flagging deviations and root causes.
Internal Q&A on operational knowledge: a worker on the floor asking a question about a specific machine, a specific procedure, or a specific past incident, and getting an answer drawn from the plant's accumulated operational knowledge.
Each category has different volume profiles and different constraints, but they share some common properties that distinguish manufacturing from the office workflows we've covered in other posts.
What's different about manufacturing
The factory environment changes the AI deployment math in several ways.
Network reliability is often poor. Many factories have intermittent network connectivity, particularly on the floor itself and in older facilities. A cloud-LLM workflow that depends on a reliable round trip to a data center will fail whenever the uplink drops. A local-SLM workflow that runs on the device or on a server in the building works regardless of upstream connectivity.
Latency is critical in operational contexts. A worker at a station who scans a part barcode and asks the AI a question needs an answer in seconds, not minutes. A cloud round trip with poor connectivity may take a minute or longer; local inference takes hundreds of milliseconds.
The data is operational IP. Manufacturing data — defect patterns, maintenance histories, production parameters, supplier quality data — is among the most competitively valuable data a manufacturer owns. Sending it to a third-party cloud LLM exposes the kind of operational know-how that companies spend decades building and aggressively protect.
The deployment patterns are different. Factories often have edge servers already deployed for MES (manufacturing execution systems) and SCADA (supervisory control and data acquisition). Adding a local-SLM workload to existing edge infrastructure is operationally simpler than provisioning new cloud accounts.
The compliance frameworks are sector-specific. Aerospace (AS9100), automotive (IATF 16949), medical devices (ISO 13485), defense, food and beverage: each sector has its own quality and traceability frameworks that constrain how data is handled. The cloud-LLM architecture often fits poorly into these frameworks.
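The network and latency points above can be made concrete with a small sketch. It enforces a latency budget on a station query and shows a slow backend missing it. Everything here is a stand-in: `local_backend` and `flaky_cloud_backend` are hypothetical placeholders, not real model calls, and the sleep times are scaled-down illustrations, not measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Optional


def answer_within_budget(
    backend: Callable[[str], str], question: str, budget_s: float
) -> Optional[str]:
    """Return the backend's answer, or None if it misses the latency budget."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(backend, question)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        return None
    finally:
        # Do not keep the worker at the station waiting on a slow call.
        pool.shutdown(wait=False)


def local_backend(question: str) -> str:
    """Hypothetical stand-in for an in-plant model call (assumed fast)."""
    time.sleep(0.01)
    return f"local answer to: {question}"


def flaky_cloud_backend(question: str) -> str:
    """Hypothetical stand-in for a cloud call over a degraded uplink."""
    time.sleep(0.5)
    return f"cloud answer to: {question}"


if __name__ == "__main__":
    question = "torque spec for station 4?"
    print(answer_within_budget(local_backend, question, budget_s=0.2))
    print(answer_within_budget(flaky_cloud_backend, question, budget_s=0.2))
```

The design choice worth noting is that the budget belongs to the station, not the backend: whatever serves the answer has to fit inside it, which is the constraint an unreliable uplink makes hard to meet.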
The math
The cost numbers in manufacturing are usually smaller than in pure-document categories: the per-operation token counts are often lower, and the number of operations is bounded by physical production cycles. But the math still adds up at scale.
A representative mid-sized manufacturer with ten production lines and three hundred workers might run about 3,000 AI operations per day: quality classifications, work instruction lookups, maintenance log entries, supplier reports. At a representative cost of $0.015 per operation, that's about $45 per day, or roughly $16,000 per year, for one plant.
For a multi-plant manufacturer or a contract manufacturer running dozens of lines, the bill scales to the low six figures per year. For very large manufacturing operations — aerospace primes, automotive OEMs, electronics contract manufacturers — the bill is in the high six figures per year.
These numbers are not enormous, but they grow with production volume, and they exclude the upstream sensor and vision systems that generate the input data. The AI augmentation layer on top is the line item we're examining.
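The per-plant arithmetic is easy to rerun with a plant's own numbers. The following is a minimal sketch, assuming a flat per-operation price and a 365-day production year, both of which are simplifications. The $0.015 rate is the representative figure from the text; the daily volumes in the loop are illustrative assumptions.

```python
def cloud_cost(
    ops_per_day: float, cost_per_op: float, days_per_year: int = 365
) -> tuple[float, float]:
    """Return (daily, annual) spend for a per-operation-priced service."""
    daily = ops_per_day * cost_per_op
    return daily, daily * days_per_year


if __name__ == "__main__":
    # Representative rate from the text; volumes are illustrative assumptions.
    rate = 0.015
    for ops in (1_000, 2_000, 3_000, 5_000):
        daily, annual = cloud_cost(ops, rate)
        print(f"{ops:>5} ops/day -> ${daily:,.2f}/day, ${annual:,.0f}/year")
```

Because the bill is linear in volume, doubling the operations per day doubles the annual figure, which is why the totals grow with production.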
Why this is structurally a local-SLM case
The properties that favor a local SLM are all present, with the operational properties unusually strong.
The work is narrow. Each manufacturing operation has its own equipment, its own processes, its own defect patterns. A model trained on the plant's operational corpus can substantially outperform a general model on that plant's tasks.
The work is repetitive. The same kinds of defects, the same kinds of work instructions, the same kinds of maintenance events, repeated across millions of production cycles.
The latency is structurally critical in operational contexts. A cloud round trip is a poor fit for the network reality of many factories.
The privacy is competitive-intelligence-grade. The data is operational IP that manufacturers protect carefully.
The deployment fits existing edge infrastructure better than cloud-only architectures.
What changes with local inference
A manufacturing AI workflow on a local SLM looks like this.
A model is fine-tuned on the plant's operational corpus — historical defect classifications, work instructions, maintenance logs, supplier quality reports. The fine-tune captures the plant's specific operational patterns.
The model is deployed on edge infrastructure inside the plant — typically the same edge servers that already run MES, SCADA, and vision systems. Network connectivity to the cloud is needed for updates but not for routine operation.
Operations flow through the inference pipeline at the edge. Quality images get classified in milliseconds. Worker questions get answered from local knowledge. Maintenance logs get analyzed locally. The factory operates whether or not the WAN is up.
The cost flips from per-operation to fixed. Production volume can scale without the bill scaling.
The operational IP stays inside the plant. The accumulated knowledge of how this specific manufacturing operation works does not get exposed to a third-party AI provider.
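The shift from per-operation to fixed cost described above implies a break-even volume: the daily operation count at which a flat annual cost for local inference equals the per-operation bill. The sketch below computes it. The $12,000 annual fixed cost is a purely hypothetical placeholder (hardware amortization, power, maintenance and fine-tuning effort vary widely by plant), and the $0.015 rate is the representative figure used earlier.

```python
import math


def breakeven_ops_per_day(
    annual_fixed_cost: float, cost_per_op: float, days_per_year: int = 365
) -> int:
    """Smallest daily volume at which per-operation spend reaches the fixed cost."""
    if annual_fixed_cost < 0 or cost_per_op <= 0 or days_per_year <= 0:
        raise ValueError("costs and days_per_year must be positive")
    return math.ceil(annual_fixed_cost / (cost_per_op * days_per_year))


if __name__ == "__main__":
    # Hypothetical fixed cost for illustration only; substitute real figures.
    hypothetical_fixed_cost = 12_000.0
    ops = breakeven_ops_per_day(hypothetical_fixed_cost, cost_per_op=0.015)
    print(f"Break-even at about {ops} operations per day")
```

Above the break-even volume, each additional operation adds to the per-operation bill but not to the fixed one, which is the sense in which production volume can scale without the bill scaling.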
When the cloud LLM is still acceptable
A few cases.
For very small manufacturers without the edge infrastructure or ML investment to deploy locally, the cost-benefit may not pay back at small volumes.
For research-and-development workflows where the data is exploratory and not yet operational IP, cloud may be acceptable in early-stage process development.
For supplier collaboration workflows that explicitly span organizational boundaries, such as analyzing shared data across multiple manufacturers in a supply chain, the boundary-crossing nature of the work complicates the local-only model.
For the bulk of in-plant manufacturing AI — quality control, work instructions, maintenance, internal Q&A — the local-SLM case is strong, and often the only architecture that works in factory conditions.
The pattern, in industrial contexts
Avery NXR is a Next.js scaffolding tool. It is not a manufacturing tool. But the architectural pattern behind it repeats here, in an unusual variant.
Manufacturing AI is a narrow, repetitive, latency-critical, IP-sensitive, edge-deployed workload. The economics that favor a specialized local model for code scaffolding favor a specialized local model for manufacturing — but with the additional argument that the operational realities of a factory make cloud-LLM architectures structurally difficult, separate from the cost and privacy arguments.
The Industry 4.0 vendors that build excellent local-inference tools — with appropriate fine-tuning, edge deployment models that match existing factory infrastructure, and compliance frameworks that align with the sector-specific quality systems — will own the manufacturing AI category. The cloud-LLM-default products are operating against the grain of the factory environment.
The pattern continues. Manufacturing is one of the workflows where the local-SLM case is foundational because the cloud architecture doesn't fit the operational context, not just because it's expensive. Plants that adopt local-first AI will have more reliable operations, better operational quality, and protected competitive IP simultaneously.