Unified Manufacturing Data Architecture

Unlocking Industrial Data.
Enabling Scalable AI.

The UMDA Framework

The Unified Manufacturing Data Architecture (UMDA) combines six interoperating layers into one vendor-neutral structure: Edge Intelligence Hubs, domain-specific Common Data Models, a real-time Unified Namespace, a governed Unified Data Layer, a Feedback Data Layer, and an LLM Routing & AI Agents layer. From millisecond sensor events to enterprise-wide KPIs and closed-loop AI learning, each layer owns a clear contract, governance model, and security posture. Together they deliver traceable, compliant, and scalable manufacturing intelligence across every plant, partner, and cloud.

Edge Intelligence Hub (EIH)

The Edge Intelligence Hub is UMDA’s on-premises nerve center. It ingests raw signals, enriches them with site context, enforces local data contracts, and can host lightweight AI models that detect anomalies or predict failures locally, without off-site processing.

  • Multi-Protocol Ingestion — native drivers for OPC UA, MQTT, Sparkplug B, Modbus, REST, and legacy PLC protocols.
  • Low-Latency Buffer — sub-second caching and store-and-forward to survive network drops without data loss.
  • Inline Contract Validation — checks payload shape, units, and timestamp precision against CDM JSON Schemas.
  • Context Tagging — attaches BatchID, OrderID, and EquipmentID via ISA-95 object maps before publish.
  • Publish to UNS — streams enriched events into Site/Line/Equipment/Event topics for zero-copy routing.
  • Edge ML Inference — optional GPU/TPU slot runs lightweight vibration or vision models locally.
  • Secure Zero-Trust Gateway — TLS, mutual-cert auth, local secrets vault, and firewall whitelisting.
  • High Availability — active/standby nodes with automatic fail-over and heartbeat monitoring.
  • Containerized Lifecycle — signed images managed by edge orchestration (K3s / Greengrass / IoT Hub) for safe patching.
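
The validation, tagging, and publish steps listed above can be pictured with a small sketch. It is only an illustration under stated assumptions: the contract fields, the allowed units, the `Context` class, and the helper names are hypothetical, and a real hub would validate against the CDM JSON Schemas (including timestamp precision, which is omitted here) rather than a hand-written dictionary.

```python
from dataclasses import dataclass

# Hypothetical data contract: required fields and accepted Python types.
CONTRACT = {"value": (int, float), "unit": str, "ts": str}
ALLOWED_UNITS = {"degC", "bar", "mm/s"}  # assumed canonical units


@dataclass(frozen=True)
class Context:
    """Site context an edge hub might hold for one piece of equipment."""
    site: str
    line: str
    equipment_id: str
    batch_id: str
    order_id: str


def validate(payload):
    """Return a list of contract violations; an empty list means the payload passes."""
    errors = []
    for field, expected in CONTRACT.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"wrong type for field: {field}")
    unit = payload.get("unit")
    if isinstance(unit, str) and unit not in ALLOWED_UNITS:
        errors.append(f"unknown unit: {unit}")
    return errors


def enrich(payload, ctx):
    """Attach BatchID, OrderID, and EquipmentID; the raw payload is left untouched."""
    return {
        **payload,
        "BatchID": ctx.batch_id,
        "OrderID": ctx.order_id,
        "EquipmentID": ctx.equipment_id,
    }


def topic_for(ctx, event):
    """Build a Site/Line/Equipment/Event topic string."""
    return f"{ctx.site}/{ctx.line}/{ctx.equipment_id}/{event}"


ctx = Context("PlantA", "Line2", "Mixer7", "B-1001", "PO-55")
raw = {"value": 71.5, "unit": "degC", "ts": "2024-01-01T00:00:00.000Z"}

problems = validate(raw)
if problems:
    print("rejected:", problems)
else:
    print(topic_for(ctx, "Temperature"), enrich(raw, ctx))
```

Returning a list of violations instead of raising on the first one lets the hub report every contract problem for a payload in a single pass.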

Common Data Models (CDM)

In UMDA, Common Data Models are more than site data schemas. Each manufacturing domain (Quality, Maintenance, Production, Supply Chain, etc.) owns a CDM that standardizes its vocabulary, units, and business rules. These domain CDMs act as the semantic foundation that connects real-time edge events to enterprise analytics in the Unified Data Layer (UDL).

  • Domain Vocabulary — harmonizes terminology (e.g. Deviation vs Exception) across CMOs and internal sites.
  • JSON Data Contracts — each CDM field is enforced via machine-readable contracts for schema, units, and latency SLAs.
  • ISA-95 / ISA-88 Mapping — aligns equipment, recipes, lots, and order objects to industry standards.
  • UNS Topic Naming — CDM IDs drive topic hierarchies in the Unified Namespace for zero-copy pub/sub.
  • Governance Anchor — ownership, lineage, and stewardship policies are tied to CDM tables.
  • Master & Reference Data Binding — links ERP master data (materials, suppliers, specs) to live production signals.
  • Unit Conversion Rules — stores canonical units plus rules for on-the-fly conversion at the edge or UDL.
  • AI Feature Store — versioned CDM tables feed feature engineering pipelines for predictive models.
  • Cross-Site KPI Joins — consistent keys let the UDL aggregate OEE, Cycle Time, and Quality metrics enterprise-wide.
  • Version & Lifecycle Control — CDMs are versioned so historical data remains queryable even after schema evolution.
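
As a rough illustration of the unit conversion rules above, the sketch below stores one canonical unit per field and a table of linear conversions. The field names, the rule table, and `to_canonical` are assumptions made for this example, not part of any UMDA specification.

```python
# Hypothetical canonical unit for each CDM field.
CANONICAL = {"temperature": "degC", "pressure": "bar"}

# (from_unit, to_unit) -> (scale, offset), applied as value * scale + offset.
RULES = {
    ("degF", "degC"): (5 / 9, -160 / 9),   # (F - 32) * 5/9
    ("K", "degC"): (1.0, -273.15),
    ("psi", "bar"): (0.0689476, 0.0),
    ("kPa", "bar"): (0.01, 0.0),
}


def to_canonical(field, value, unit):
    """Convert a reading to the field's canonical unit, or raise if no rule exists."""
    target = CANONICAL[field]
    if unit == target:
        return value, target
    try:
        scale, offset = RULES[(unit, target)]
    except KeyError:
        raise ValueError(f"no conversion rule from {unit} to {target}") from None
    return value * scale + offset, target


print(to_canonical("temperature", 212.0, "degF"))
print(to_canonical("pressure", 250.0, "kPa"))
```

Keeping the rules as data rather than code means the same table could be shipped to the edge and to the UDL, so both convert readings identically.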

Unified Namespace (UNS)

The UNS is UMDA’s real-time event facilitator. All Production CDM-tagged messages are published to a hierarchical topic structure (e.g. Site/Line/Equipment/Event), enabling zero-copy routing from edge to cloud.

  • CDM-Aligned Topics — topic names embed BatchID, StageID, and EquipmentID so consumers always inherit context.
  • Zero-Copy Fan-Out — one publish serves MES, historian bridges, AI agents, and dashboards simultaneously.
  • Retained Messages — last-known value caching ensures late subscribers receive the most recent state instantly.
  • Schema Registry Link — topic-to-CDM mapping is stored in the governance catalog for lineage.
  • Security Envelope — TLS, mutual certs, and per-topic ACLs enforce zero-trust at the edge.
  • Backpressure & Buffering — EIH store-and-forward protects UNS brokers during network loss.
  • Cloud Bridge — optional connectors replicate selected topics to public-cloud IoT hubs for analytics.
  • Metadata Topics — health, heartbeat, and contract-version topics support observability.
  • Stateless Consumers — downstream services can restart anywhere and resubscribe without losing sequence numbers.
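
To make fan-out and retained messages concrete, here is a toy in-memory broker. It is a sketch only: `TinyBroker` is a hypothetical class written for this example, it borrows the MQTT-style `+` (one level) and `#` (remaining levels) wildcards in simplified form, and it ignores security, persistence, and networking entirely.

```python
from collections import defaultdict


class TinyBroker:
    """Toy broker showing fan-out and last-known-value (retained) delivery."""

    def __init__(self):
        self._subs = defaultdict(list)  # topic filter -> list of callbacks
        self._retained = {}             # topic -> last retained payload

    @staticmethod
    def matches(topic_filter, topic):
        """Simplified MQTT-style match: '+' is one level, '#' is the remainder."""
        f_parts, t_parts = topic_filter.split("/"), topic.split("/")
        for i, part in enumerate(f_parts):
            if part == "#":
                return True
            if i >= len(t_parts) or (part != "+" and part != t_parts[i]):
                return False
        return len(f_parts) == len(t_parts)

    def publish(self, topic, payload, retain=False):
        """Deliver one publish to every matching subscriber."""
        if retain:
            self._retained[topic] = payload
        for topic_filter, callbacks in list(self._subs.items()):
            if self.matches(topic_filter, topic):
                for callback in list(callbacks):
                    callback(topic, payload)

    def subscribe(self, topic_filter, callback):
        """Register a subscriber and replay retained values it has missed."""
        self._subs[topic_filter].append(callback)
        for topic, payload in list(self._retained.items()):
            if self.matches(topic_filter, topic):
                callback(topic, payload)


broker = TinyBroker()
broker.publish("PlantA/Line2/Mixer7/State", {"state": "RUNNING"}, retain=True)

# A dashboard that subscribes late still receives the retained state.
broker.subscribe("PlantA/+/Mixer7/State", lambda t, p: print("dashboard", t, p))
# A historian bridge listens to everything under the site.
broker.subscribe("PlantA/#", lambda t, p: print("historian", t, p))

broker.publish("PlantA/Line2/Mixer7/Temperature", {"value": 71.5, "unit": "degC"})
```

The publisher never needs to know who is listening, which is what allows one publish to serve several consumers at once.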

Unified Data Layer (UDL)

The UDL is UMDA’s enterprise backbone. It consolidates domain-level Common Data Models, preserves full lineage, and exposes governed analytics to every plant, partner, and AI service.

  • Cross-Site KPI Warehouse — aggregates OEE, Cycle Time, Yield, and Quality metrics across all factories and CMOs.
  • Time-Series & Relational Joins — combines historian signals with CDM tables for “digital thread” queries.
  • Governed Data Lakehouse — CDM tables land in open formats (Parquet/Delta/Iceberg) with role-based access and row-level lineage.
  • Contract Enforcement — validates incoming payloads against JSON Schemas and SLA timers defined at the CDM layer.
  • Semantic Layer Registry — publishes logical models (dimensions, measures) for BI and self-service analytics.
  • AI Feature Store — versioned, point-in-time snapshots feed training, inference, and drift monitoring pipelines.
  • Streaming & Batch APIs — supports ANSI SQL, GraphQL, and pub/sub endpoints for real-time or retrospective analysis.
  • Feedback Loop Sink — captures outcomes from the Feedback Data Layer to close the learning cycle and update KPIs.
  • Retention & Archival Policies — tiered storage aligns with GMP/GxP retention requirements (e.g., 5, 10, 30 years).
  • Interoperability — exports curated views to UNS topics or external data sharing zones without breaking lineage.
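
A cross-site KPI such as OEE can be sketched with the commonly used definition OEE = Availability × Performance × Quality. The record layout and field names below are assumptions for illustration; the point is that consistent keys let the inputs be summed across sites before the ratios are taken, instead of averaging per-site percentages.

```python
from dataclasses import dataclass


@dataclass
class ShiftRecord:
    """Hypothetical CDM-aligned production record for one shift."""
    site: str
    planned_min: float      # planned production time
    run_min: float          # actual run time
    ideal_cycle_min: float  # ideal minutes per unit
    total_count: int
    good_count: int


def oee(records):
    """Compute availability, performance, quality, and OEE over a set of records."""
    planned = sum(r.planned_min for r in records)
    run = sum(r.run_min for r in records)
    ideal = sum(r.ideal_cycle_min * r.total_count for r in records)
    total = sum(r.total_count for r in records)
    good = sum(r.good_count for r in records)
    if planned == 0 or run == 0 or total == 0:
        return None  # not enough data to form the ratios
    availability = run / planned
    performance = ideal / run
    quality = good / total
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }


records = [
    ShiftRecord("PlantA", 480, 400, 1.0, 360, 342),
    ShiftRecord("PlantB", 480, 450, 1.5, 280, 266),
]

by_site = {}
for r in records:
    by_site.setdefault(r.site, []).append(r)

for site, rows in by_site.items():
    print(site, oee(rows))
print("enterprise", oee(records))
```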

Feedback Data Layer (FDL)

The FDL is UMDA’s closed-loop memory bank. It captures every AI inference, operator response, and real-world outcome, creating a trusted dataset for continuous improvement and regulatory audit.

  • Inference Registry — logs model ID, version, feature vector hash, confidence score, and timestamp for every prediction.
  • Human-in-the-Loop Feedback — records operator accept / override actions with reason codes and electronic signatures.
  • Outcome Metrics — stores actual process results (e.g. batch dispositions, downtime minutes) for accuracy back-testing.
  • Drift Monitoring Snapshots — captures feature distribution deltas to flag model drift or data quality issues.
  • Model Lineage & Versioning — ties predictions to Git commits or MLflow registry entries for full traceability.
  • Retraining Dataset Builder — auto-curates labelled records for periodic model retraining and A/B tests.
  • Contract Evolution Insights — surfaces recurring schema mismatches to CDM owners for contract updates.
  • Query & API Layer — GraphQL / REST endpoints deliver feedback statistics to BI dashboards and MLOps pipelines.
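
The registry, feedback, and outcome items above might be modeled roughly as follows. This is a minimal sketch under assumptions: `InferenceRecord`, `log_inference`, and `accuracy` are hypothetical names, the in-memory list stands in for a governed store, and electronic signatures are left out.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def feature_hash(features):
    """Hash a feature vector deterministically, independent of key order."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class InferenceRecord:
    model_id: str
    model_version: str
    features_sha256: str
    prediction: str
    confidence: float
    timestamp: str
    operator_action: Optional[str] = None  # "accept" or "override"
    reason_code: Optional[str] = None
    outcome: Optional[str] = None          # actual result, filled in later


REGISTRY = []  # stand-in for a persistent, governed store


def log_inference(model_id, model_version, features, prediction, confidence):
    """Create and store one registry entry for a prediction."""
    record = InferenceRecord(
        model_id=model_id,
        model_version=model_version,
        features_sha256=feature_hash(features),
        prediction=prediction,
        confidence=confidence,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    REGISTRY.append(record)
    return record


def accuracy(records):
    """Back-test: share of records with a known outcome that the model got right."""
    scored = [r for r in records if r.outcome is not None]
    if not scored:
        return None
    return sum(r.prediction == r.outcome for r in scored) / len(scored)


rec = log_inference("vibration-anomaly", "1.4.0",
                    {"rms": 0.82, "kurtosis": 4.1}, "bearing_fault", 0.91)
rec.operator_action, rec.reason_code = "override", "SENSOR_LOOSE"
rec.outcome = "no_fault"
print(rec.features_sha256[:12], accuracy(REGISTRY))
```

Storing a hash of the feature vector rather than the vector itself keeps each entry small while still allowing a prediction to be matched to the exact inputs held elsewhere.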

LLM Routing & AI Agents

The LLM Router orchestrates tasks across lightweight, domain-specific, and general purpose language models, while autonomous AI agents use CDM context to detect anomalies, recommend actions, and write results back to the Feedback Data Layer.

  • Dynamic Model Selection — chooses the optimal LLM based on task complexity, domain, latency budget, and cost.
  • Context Injection — pulls batch, equipment, and parameter context from CDM tables before prompt assembly.
  • Horizontal Agent Collaboration — quality, maintenance, and supply-chain agents share intermediate results for cross-functional insights.
  • Vertical Escalation — if an edge agent cannot resolve an issue, the router escalates to enterprise models in the UDL.
  • Anomaly Detection & RCA — pattern-matching agents flag deviations, then root-cause analysis (RCA) agents correlate critical process parameters (CPPs), material lots, and work orders.
  • CAPA Generation — auto-drafts Corrective & Preventive Action steps, routed to human approvers via QMS integration.
  • Latency Tiers — sub-second local inferences at the EIH, multi-second domain LLMs reached via the UNS, and deeper analysis in the cloud.
  • Human-in-the-Loop Fallback — tasks with low confidence are routed to SMEs for review, with decisions logged in the FDL.
  • Outcome Write-Back — agent conclusions, user overrides, and effectiveness metrics are stored in the Feedback Data Layer for retraining.
  • Policy & Guardrails — prompt filtering, PII masking, and token-level audit logs ensure secure, compliant generation.
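
One way to picture dynamic model selection with a human fallback is the sketch below. Everything in it is assumed for illustration: the model catalog, the 1 to 5 complexity scale, the latency and cost figures, and the confidence floor are invented values, not measurements or recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    tier: str            # "edge", "site", or "cloud"
    max_complexity: int  # highest task complexity (1-5) the model is trusted with
    latency_s: float     # assumed typical response time
    cost: float          # assumed relative cost per call


# Hypothetical catalog covering the three latency tiers.
CATALOG = [
    ModelOption("edge-small", "edge", 2, 0.2, 0.0),
    ModelOption("domain-quality", "site", 4, 3.0, 1.0),
    ModelOption("general-large", "cloud", 5, 12.0, 8.0),
]

CONFIDENCE_FLOOR = 0.7  # assumed threshold below which a human reviews the result


def route(complexity, latency_budget_s, catalog=CATALOG):
    """Pick the cheapest model that can handle the task within the latency budget."""
    candidates = [
        m for m in catalog
        if m.max_complexity >= complexity and m.latency_s <= latency_budget_s
    ]
    if not candidates:
        return None  # nothing fits: the caller should escalate to a person
    return min(candidates, key=lambda m: (m.cost, m.latency_s))


def needs_human_review(model, confidence):
    """Fallback rule: no suitable model, or a low-confidence answer, goes to an SME."""
    return model is None or confidence < CONFIDENCE_FLOOR


for complexity, budget in [(1, 1.0), (4, 5.0), (5, 1.0)]:
    chosen = route(complexity, budget)
    print(complexity, budget, chosen.name if chosen else "escalate to human")
```

Filtering on hard constraints first and only then minimizing cost keeps the routing decision easy to audit, since every rejected model has an explicit reason.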

Ready to Plan Your Rollout?