The algorithms are not the problem. The data is.
Why is predictive maintenance so hard? Why do AI pilots die on the shopfloor? Because we keep reinventing the wheel for every data project. It's time for platform thinking.
Every data project starts from zero
Every data project becomes a massive undertaking because you need to untangle the data spaghetti over and over again. Finding data, integrating it, cleaning it, adding context — that cycle eats 60% to 80% of every project's time before anyone gets to the actual analysis.
[Chart: most projects stall before completion; 60–80% of effort goes to finding & cleaning data; results grow only linearly with more budget]
For the 30% that succeed, learnings and results typically stay locked within the project scope. And the data engineers who glue everything together? They've become the bottleneck for every report, trend, and calculation being requested.
The Good, The Bad, and The Ugly
The problem isn't starting — it's scaling. Making something work once, then making it work again somewhere else, and again, and again. Most organisations we work with have launched multiple data initiatives. The question is: which line does your organisation follow?
The Scaling Effect
The ideal scenario. Early progress moves slowly because you're building a proper foundation. But as time passes, each subsequent project becomes easier. The fifth project launches faster than the fourth. The tenth faster still.
Data connections get established once and serve multiple use cases. Quality improvements made in one dataset benefit everyone. Knowledge compounds — lessons learned on Line A accelerate work on Line B. What took three months initially now takes three weeks. The organisation crosses a tipping point where digital capabilities spread naturally.
Stalling
The more common scenario. Early wins come through shortcuts: custom scripts, manual exports, temporary workarounds that become permanent. Someone writes a PowerShell script to export data via FTP. Another builds a Power BI dashboard querying the MES database directly, unknowingly slowing down the MES for everyone else. A third creates Excel macros so complex that nobody else understands them.
None of these solutions outlives its original creator. When that person leaves, the solution becomes a mystery box nobody dares touch but everyone depends on. The line flattens because each new project fights accumulated complexity rather than building on solid foundations. At a certain point, this way of working always ends in gridlock.
Pilot Purgatory
Brief spikes of activity followed by flatlines. A pilot launches with a tight scope and delivers early success — only to die quietly because nobody owns ongoing maintenance and the solution was never designed to last. The data science model that predicted equipment failures brilliantly for three months, then stopped working because someone changed some setpoints or simply because winter turned into summer.
These are "innovation theatre" projects: impressive in presentation decks, invisible in operational reality. The result is a cemetery of abandoned pilots — each one technically successful but none of them surviving operational reality. Speed without sustainability produces exactly this.
The Industrial Data Platform Capability Map
The only way to truly scale is to break free from the cycle of rebuilding custom data pipelines. We developed a Capability Map that captures the essential building blocks of any Industrial Data Platform. It's not a product — it's a framework for the right conversation.
Every platform begins with getting the data in. In industrial environments, that means a secure, scalable connectivity layer speaking the many languages of industrial systems — old and new. A brand-new IIoT sensor streaming over MQTT next to a 30-year-old PLC on Modbus RTU. High-throughput for defect detection, redundancy for critical processes, buffering during connectivity loss.
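Buffering during connectivity loss is the store-and-forward pattern: keep readings locally while the broker is unreachable and flush them in order once it returns. A minimal sketch, assuming a generic transport callable (in practice this would be something like a paho-mqtt `publish`; the class and topic names here are hypothetical):

```python
import collections
import json
import time

class BufferedPublisher:
    """Store-and-forward sketch: buffer readings while the broker is
    unreachable, flush them in order once connectivity returns."""

    def __init__(self, send, maxlen=10_000):
        self.send = send  # transport callable, e.g. an MQTT publish
        # Bounded buffer: oldest readings are dropped on overflow
        self.buffer = collections.deque(maxlen=maxlen)

    def publish(self, topic, value):
        payload = json.dumps({"v": value, "ts": time.time()})
        self.buffer.append((topic, payload))
        self.flush()

    def flush(self):
        # Send in arrival order; stop at the first failure so ordering holds
        while self.buffer:
            topic, payload = self.buffer[0]
            try:
                self.send(topic, payload)
            except ConnectionError:
                return  # broker still down: keep buffering
            self.buffer.popleft()
```

A real connectivity layer adds per-protocol drivers (OPC UA, Modbus RTU, MQTT) and persistence to disk, but the buffering contract stays the same.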
Raw data alone is not enough. Contextualisation links raw values to real-world meaning: asset hierarchies, production batches, maintenance events. ISA-95/88 aren't academic exercises — they're proven structures capturing decades of manufacturing wisdom. Context is what transforms L15.T01A.PV into "the baking temperature during batch 2025-W03."
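The tag-to-meaning mapping can be as simple as a registry keyed on the raw tag name. A sketch, assuming a hypothetical ISA-95-style hierarchy (the site, line, and asset names are made up for illustration):

```python
from dataclasses import dataclass

@dataclass
class TagContext:
    site: str
    area: str
    line: str
    asset: str
    description: str
    unit: str

# Hypothetical registry mapping raw historian tags to their context
TAG_REGISTRY = {
    "L15.T01A.PV": TagContext(
        site="Site A", area="Bakery", line="Line 15",
        asset="Oven T01A", description="baking temperature", unit="°C"),
}

def contextualise(tag, value, batch_id):
    """Turn an opaque tag reading into a human-meaningful statement."""
    ctx = TAG_REGISTRY[tag]
    return (f"{ctx.description} on {ctx.line} ({ctx.asset}) "
            f"during batch {batch_id}: {value} {ctx.unit}")
```

In production this registry would live in an asset model or a dedicated contextualisation service, not in code, but the lookup is conceptually the same.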
Sensor data is notoriously messy: outliers, flatlines, NULL values, calibration drift, missing metadata. These issues are often invisible until you're deep into analysis. Data quality should be tracked, scored, and exposed as part of the platform — not addressed on an ad-hoc basis. Without it, your AI models become "The Oracle" that operators abandon the moment nobody is watching.
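"Tracked and scored" can start very small: a per-signal score that penalises exactly the failure modes named above. A minimal sketch (the scoring weights and the 3-sigma outlier rule are illustrative assumptions, not a standard):

```python
import math
import statistics

def quality_score(values):
    """Score a window of sensor readings in [0, 1]: penalise NULLs,
    flatlined signals, and gross outliers (3-sigma rule)."""
    n = len(values)
    if n == 0:
        return 0.0

    def is_null(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    nulls = sum(1 for v in values if is_null(v))
    clean = [v for v in values if not is_null(v)]

    flat = outliers = 0
    if len(clean) >= 2:
        # Consecutive identical readings suggest a stuck sensor
        flat = sum(1 for a, b in zip(clean, clean[1:]) if a == b)
        mu = statistics.mean(clean)
        sd = statistics.pstdev(clean)
        if sd > 0:
            outliers = sum(1 for v in clean if abs(v - mu) > 3 * sd)

    # Only count flatlining when it dominates the window
    bad = nulls + outliers + (flat if flat > 0.5 * len(clean) else 0)
    return max(0.0, 1.0 - bad / n)
```

Exposing a score like this per tag, continuously, is what turns quality from an ad-hoc cleanup step into a platform capability.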
Keeping your data — and keeping it accessible. A high-performance time-series store at the core, plus event/alarm storage and publish-subscribe via MQTT. Layered storage following the medallion pattern: Bronze (raw), Silver (cleaned), Gold (analysis-ready). The broker connects all the dots in real-time — it's not a historian replacement.
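The medallion layers can be sketched as three tiny transforms: Bronze keeps what arrived, Silver validates and types it, Gold aggregates it for analysis. A toy illustration (function names and the rolling-mean aggregate are assumptions, not a prescribed API):

```python
import math

def bronze(record):
    """Bronze: persist exactly as received, no mutation."""
    return dict(record)

def silver(record):
    """Silver: cleaned and typed; drop readings that fail validation."""
    v = record.get("value")
    if v is None or (isinstance(v, float) and math.isnan(v)):
        return None
    return {**record, "value": float(v)}

def gold(records, window=5):
    """Gold: analysis-ready aggregate, here a mean per fixed window."""
    vals = [r["value"] for r in records]
    return [sum(vals[i:i + window]) / len(vals[i:i + window])
            for i in range(0, len(vals), window)]
```

The point of the layering is that Bronze never loses information, so Silver and Gold can always be recomputed when cleaning rules or aggregates change.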
Sometimes data is most valuable processed close to the source. Edge analytics can be as simple as computing statistical summaries or as advanced as running ML models on video feeds in real-time. Virtual tags, batch analytics, anomaly detection, predictive maintenance — all become possible here. Preprocessing at the edge also reduces load and minimises network traffic.
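A statistical summary at the edge is exactly this trade: keep a rolling window locally, ship one "virtual tag" summary instead of every raw sample. A minimal sketch (class name, window size, and the 3-sigma anomaly threshold are illustrative assumptions):

```python
from collections import deque

class EdgeWindow:
    """Edge-analytics sketch: rolling window of recent samples, emitting
    summaries (virtual tags) instead of raw data to cut network traffic."""

    def __init__(self, size=60):
        self.buf = deque(maxlen=size)

    def add(self, value):
        self.buf.append(value)

    def summary(self):
        """One summary record replaces `size` raw samples on the wire."""
        vals = list(self.buf)
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        return {"mean": mean, "min": min(vals),
                "max": max(vals), "std": var ** 0.5}

    def is_anomaly(self, value, k=3.0):
        """Flag a reading more than k standard deviations from the mean."""
        s = self.summary()
        return s["std"] > 0 and abs(value - s["mean"]) > k * s["std"]
```

The same structure scales up: replace the summary with an ML model scoring each window and you have edge anomaly detection on the same preprocessing footprint.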
If data isn't presented in a meaningful way, does it matter? Most users will never touch connectivity or data models — but they will use dashboards and reports. Whether they're process engineers monitoring KPIs or operators reviewing last week's performance, their experience needs to be seamless. Rich visualisations, easy sharing, and collaborative tools that encourage discovery.
No platform exists in a vacuum. Exposing REST APIs for data science tools, sending curated datasets to the corporate data warehouse, enabling digital twin connections — sharing is critical to scale. This is also where MCP (Model Context Protocol) enters: a standardised way for AI agents to query maintenance history, pull real-time sensor data, or update work orders. Experimental, powerful — but guardrails are non-negotiable.
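Exposing data over REST can start as a read-only endpoint in front of the time-series store. A minimal sketch using Python's standard library (the route shape, tag name, and in-memory store are hypothetical stand-ins for the platform's real storage layer):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse

# Hypothetical in-memory store standing in for the time-series database
LATEST = {"L15.T01A.PV": {"value": 182.4, "unit": "C",
                          "ts": "2025-01-16T10:00:00Z"}}

class TagAPI(BaseHTTPRequestHandler):
    """Read-only REST sketch: GET /api/tags/<tag> returns the latest value."""

    def do_GET(self):
        parts = urlparse(self.path).path.strip("/").split("/")
        if len(parts) == 3 and parts[:2] == ["api", "tags"] and parts[2] in LATEST:
            body = json.dumps(LATEST[parts[2]]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), TagAPI).serve_forever()
```

The same guardrails point applies here as with MCP: read-only by default, authentication in front, and write access only where it is explicitly designed in.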
See the concepts in action
This is not a technology problem alone
We have enough technology available. The vendors in our DataOps Vendor Database are ready to help you implement. But technology without cooperation is just expensive shelfware.
The real challenge is organisational: getting IT and OT to speak the same language, agreeing on who owns the data, building a team that understands both the shopfloor and the cloud. If your organisation operates in silos, your platform will end up fragmented too — that's Conway's Law in action.
That's exactly why we built the ITOT.Academy — a 6-week live online course where IT and OT practitioners learn the frameworks, vocabulary, and cooperation models to push past "just a POC." Built by practitioners, for practitioners. Short, to the point, and brutally real. Our first 40 students scored it 9 out of 10.
Explore the ITOT.Academy