Willem (00:00)
Welcome, you’re listening to the ITOT Insider Podcast. I’m your host Willem and in this special series on industrial data ops, I’m joined by my co-author David. Subscribe to this podcast and our blog to get the latest insights shaping the world of industrial data and AI. So welcome David.
David (00:18)
Thank you, Willem. It’s good to be on the interviewee side for a change. I’m looking forward to this talk.
Willem (00:23)
It’s a change of roles. Well, as you know, we have a very interesting series of interviews coming up over the next five to six weeks, with a lot of different companies and interesting people.
David (00:40)
Yeah, absolutely, absolutely. So after we published our industrial data platform capability map at the end of last year, we got tons of input: people who read it, people who suggested, okay, maybe you should consider doing this or doing that. But I would say the most important thing we heard from our readers was:
Can we get some more insights? Can we hear you guys talking to vendors who are active in the data domain? So that’s exactly what we are going to do in this series in the next couple of weeks. We selected a couple of very interesting vendors to talk about their products, to talk about data. And in the next episode, we will publish the very first interview.
But before we do that, we thought it would make sense to have this short chat about this thing which we call an industrial data platform capability map.
Willem (01:48)
Yeah,
it’s a very long name. So just taking a step back, why did you do that? Why did you start working on this and what’s it used for?
David (01:57)
Yeah.
The thing is that, you know, I’ve been going to customer meetings over the last years, and the question is typically always the same: okay, we need to do something with data. We need a solution, either because they want to build a dashboard, or they have an idea about a certain use case, they want to predict something, they want to bring their data to the cloud, they heard that
cloud provider such-and-such is the next best thing, they want to do AI. There are these, I would say, vague requirements, and all of those requirements require data somehow. So when stepping into these conversations, I learned that I should step away from going, okay, here is software solution A or B and that solves your problem,
and really step into: okay, what are the different capabilities, the different things that need to happen with that data until it becomes useful for you. And then also, I would say, the other way around: if you look at the websites or brochures, or you go to the trade shows of all those different players active in the data domain, which range from, I would say, really the traditional OT vendors, to
startups and scale-ups which are active in the IT/OT domain, to, I would say, the pure IT vendors who are now also stepping into the operational manufacturing industries. Well, they typically all say the same: buy our product and all your problems are solved. So we needed something to use as a starting point to start a discussion.
Willem (03:49)
Is it a bit like what you call in IT a reference architecture?
David (03:53)
Yeah, absolutely. And it’s also a bit evolving, I think. So last year we published the first map based on the input we had at that point in time. And, well, I actually hope that after this series of interviews I can do an update of the architecture, of these capabilities, based on all the input we receive, and then send that back out to all our subscribers.
Willem (04:15)
Hehe.
David (04:21)
So yeah, maybe it’s a bit of a reference architecture, or hopefully it can become a reference architecture. That’s actually even better.
Willem (04:29)
A bit of a hybrid, where you can start placing different concepts and ideas within the bigger picture before you get lost. Can you walk us through all that’s on that map? Because I see colors, I see squares, but I need a story.
David (04:35)
Yeah, absolutely. Yeah.
Absolutely. So this is kind of a podcast slash video. For the ones who are just listening, I’m going to try to describe it; the ones who are watching can obviously also open the article. I will also add the link to the initial capability map in the show notes, so make sure to take a look and provide us with your input and your comments. They are always very much appreciated. So first things first. If you talk about, let’s
say, a data domain, on the one hand we start with data sources. Data sources in the operational world are typically SCADA systems and PLCs. But it can also be text files, engineering data, IoT sensors which maybe live in the cloud; it can also be IT data, right? ERP data, et cetera. So you have data sources.
The thing they have in common is that they typically live in a variety of databases, in a variety of systems. You might be standardized on one certain vendor, but every site maybe runs a different version of your software. So this is the first thing: you have your data sources. On the other side, you have your data consumers. You want to build applications.
You want to have your trending tool. You maybe have data science profiles who want to build some cool stuff in their Python scripts. So they want to use that data. Now, I think we all know that when working with data, if you need to extract it from all these different source systems, align the data, normalize it, clean it, contextualize it manually,
well, basically that’s the largest portion of your data project. So a couple of years ago, when we were actually still working together, and I don’t know if you know this story, Willem, I did this small test. I asked a couple of people who were working on data projects full-time, who were data scientists: can you somehow give me an indication of the time
it takes you in a project? How much time do you spend on the, I wouldn’t say non-value-add part, but at least the part where you’re working to get the data into the right formats, and how much time do you spend actually solving the business problem? I did that myself as well at the beginning of my career. And it’s about 70%. So 70% of the time a data
person spends actually goes to integrating, cleaning, and contextualizing data. And even worse, that’s typically a one-off effort, right? Because you do that for one use case, and then for the next use case you typically have to start over again. So to change that,
we want to go to a platform where we can really gain some scalability, where we can really speed up our data use cases, and where we can actually also reduce the risk. Because if we spend 70% of our time just in the preparation phase, that’s, from a money perspective and a time perspective, a big risk. So we want to reduce that.
Willem (08:20)
Yeah, I think you
were mentioning data profiles, but in production, people are doing analysis all the time. Those highly paid engineers are also spending a lot of time on one-off efforts, and they do them rarely just because it takes so much time. So you don’t only have the cost of people spending their time on menial jobs; you also have lost opportunity, because it’s so much work they don’t have time for it. So you’re not going to implement
David (08:28)
Yeah.
Willem (08:50)
the same analysis for each batch, for each production line.
David (08:54)
I know, or you get these crazy big Excel sheets which someone with some VBA knowledge programmed at a certain point in time in their career, and…
Willem (09:02)
Yeah, yeah, yeah, I
know. And then 15 years later: I always run the script, I don’t know why, but it has been handed down generation by generation and I still keep on doing it.
David (09:09)
Yeah.
Press the red button, that’s all I needed to do. Yeah, and the thing indeed is that that approach basically stalls your data projects, full stop. So I think, and now especially with AI becoming more and more important, and at least getting on the radar of many management teams, people see the need to change that approach and go to a platform-first approach.
Willem (09:14)
Yeah, my predecessor told me.
David (09:42)
And a platform-first approach means that we’re basically integrating these diverse data sources, these data sets, into one single platform, I would say the single source of truth somehow. And then our applications are built on top of that single source of truth. And that’s…
Willem (10:00)
Is that like
in contrast to what, for example? You would say you have a platform approach, but what would be the contrast? What’s the opposite of a platform approach?
David (10:09)
The, I would say, the spaghetti. Yeah, data spaghetti or interface spaghetti. And then ending up with hundreds, and for the bigger companies even thousands, of interfaces. Which is an interesting thing, because you might say, okay, this interface solves my problem; but it also introduces technical debt, in that you now also need to
Willem (10:12)
The ad hoc approach, I have a use case here, a use case there, a use case somewhere else.
David (10:38)
maintain that interface over its lifecycle. And the more of these custom interfaces you get, the more complex it becomes to, one, build new interfaces, but more importantly, to make changes to your systems, because they all depend on some kind of custom-built interface.
Willem (10:57)
It also doesn’t scale. I
think in IT, number one, your applications need to scale. If you do it for one site, it should be something you can roll out to every site.
David (11:08)
Yeah, absolutely. So I would say, given that context, it definitely makes sense to aim for a central data platform somehow. Should that be an IT data platform or an OT data platform? That’s actually a totally different discussion. But I would say, to start working, to start taking some concrete steps:
instead of going, we’re going for technology A, B, or C, we’re going for a data lake, we’re going for a data warehouse, we’re going for a historian or whatever, what we do is talk about capabilities. And that’s maybe where I’d like to start. So if you want to build that central data platform, that central data core, you need, we now say, seven main capabilities. And those capabilities
can be part of one technical solution, but will probably be part of several technical solutions working in harmony somehow. So let’s go through them one by one. Number one: connectivity. Connectivity means that your data platform needs to be able to talk all kinds of different protocols and handle structured data, unstructured data, time series data, text files,
Willem (12:15)
Number one, number one.
David (12:34)
JSON files, using OPC, using MQTT, using Modbus, you know, all of this: both the newer technologies we see, let’s say, but also the legacy OT protocols. Because yeah, that PLC is still sitting there. It has been sitting there for 10 years and it will be sitting there for another 10 years. So yeah, you absolutely have to deal with it.
Willem (12:58)
At least, if it works, it works.
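To make the connectivity capability a bit more concrete, here is a minimal sketch of an MQTT subscriber in Python using the paho-mqtt client. The broker address, topic structure, and JSON payloads are hypothetical; a real platform would run many such connectors side by side, including OPC UA and Modbus ones.

```python
import json
import paho.mqtt.client as mqtt

# Called for every message arriving on the subscribed topics
def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)        # assumes JSON payloads
    print(msg.topic, payload)

# paho-mqtt 2.x requires an explicit callback API version
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.example.com", 1883)   # hypothetical broker
client.subscribe("acme/ghent/bakery/#")      # everything below one site
client.loop_forever()
```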
David (13:02)
So you need to be able to make this local connection in a secure way, in a reliable way, in a fast way. Many of these source systems also provide data at scale. For example, when we are integrating video data or vibration data, we might have kilohertz or megahertz data flowing in. So you need the connectivity side covered. And I would say this is
typically where the OT companies excel. They understand those protocols very well. So that’s number one. Number two: you might say, okay, now we need to send that data into the platform, right? No, not yet. Because number two is the contextualization and data management requirements. And…
Contextualization and data management is something which is so important, but also overlooked most of the time. First of all, what is data management? Data management can be everything from applying a certain structure, a certain standard like the ISA-95 structure, where you have your sites and your lines and your assets, et cetera. So, some logical structure which depicts
your facility. It can also be master data management, you know, kilograms and pounds and that type of thing. And then contextualization, that is something which is really, really interesting. Contextualization in a manufacturing context means that somehow we are able to align the sensor data which flows in with what is actually happening on the shop floor,
which means, the example on the blog is typically the example of a cookie factory, that I know I’ve been producing this type of cookie from then to then on that asset. Or, for example, if it’s a quality system, that you’ve taken a quality sample and you’re actually able to link that quality sample to the actual batch, and then link that batch to the actual production period.
You know, it sounds straightforward, but it’s not that easy. It also requires you to make a lot of design choices. Do I store the pre-contextualized information or do I only store the raw information and will I do the contextualization only later on in my process? But it is a crucial factor because having this context available will actually prove
crucial for the data users at the end to really start making sense of your data, because you want to help them select things like: okay, I now want to compare all my ovens of this type, for example.
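As an illustration of the contextualization David describes, here is a minimal sketch in Python with pandas: raw tag readings get an ISA-95-style asset path, and each reading is linked to the batch that was running at that moment. All tag names, paths, and values are hypothetical.

```python
import pandas as pd

# Raw sensor readings, with tag names as they come from the PLC
readings = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-05-01 08:05", "2024-05-01 09:20",
                                 "2024-05-01 10:45"]),
    "tag": ["TT_4711", "TT_4711", "TT_4711"],
    "value": [180.2, 182.1, 179.8],
})

# Batch execution records: which product started on the asset, and when
batches = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-05-01 08:00", "2024-05-01 10:30"]),
    "batch_id": ["B-1001", "B-1002"],
    "product": ["choco-cookie", "oat-cookie"],
}).sort_values("timestamp")

# ISA-95-style context: map the raw tag to a logical asset path
asset_model = {"TT_4711": "acme/ghent/bakery/line1/oven2/temperature"}
readings["asset_path"] = readings["tag"].map(asset_model)

# Attach the batch that was running at each reading's timestamp
contextualized = pd.merge_asof(readings.sort_values("timestamp"),
                               batches, on="timestamp", direction="backward")
print(contextualized)  # the 10:45 reading now belongs to batch B-1002
```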
Data quality is then the third capability. What is data quality? We actually had an article about that a year or so ago; let’s make sure that’s also linked in the show notes. Data quality is basically: can I trust my data? It’s partly about monitoring the data which flows in: does it contain spikes? Does it contain flat lines?
Is it noisy? Is it undersampled? Is it oversampled? All these fun things you can come across. But also: the data I’m storing, will I be storing all my data? Will I be storing clean data? Or will I just be flagging data which is not what I expected? So data quality is important, and it becomes more and more important
the more we automate. Because if I make a report later on, at the end, so that’s the final capability, right? If I make a report with just the trends, and I look at the trends and I see a clear spike, then my first reaction will be: yeah, the data is not correct, I’m going to filter out the spike or I’m going to ignore it. But once I start automating my reports, once I start automating my KPI calculations,
or maybe even start feeding information back into the process in an automatic way, well, now I don’t have this visual overview anymore. So I need to be able to trust my data.
Willem (17:47)
If you’re starting to scale this up, you cannot rely on human eyes and reason anymore to filter the noise from your signal. And yeah, spikes like that could disrupt your algorithm or your dashboard.
David (18:02)
Absolutely. And it can just be, for example, an OEE calculation or whatever, you know: the data which flows in is, I don’t know, two, two, two, two, two, 50 million, two, two, two, two, two. Yeah, if you’re not able to filter out that 50 million in an easy way, then every report which comes down the line will be wrong. So it’s an important one. It’s also a rather new one. Data quality is gaining some momentum right now.
Willem (18:23)
Will be wrong. Okay.
David (18:32)
It wasn’t really part of a typical data platform until now. So that’s a new one.
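A minimal sketch of the spike and flat-line checks mentioned here, using pandas rolling statistics; the window sizes and thresholds are illustrative assumptions, not a prescribed method.

```python
import pandas as pd

def flag_spikes(values: pd.Series, window: int = 5, threshold: float = 3.0) -> pd.Series:
    """Flag points that deviate strongly from the rolling median."""
    median = values.rolling(window, center=True, min_periods=1).median()
    # Median absolute deviation as a robust estimate of spread
    mad = (values - median).abs().rolling(window, center=True, min_periods=1).median()
    return (values - median).abs() > threshold * 1.4826 * mad.clip(lower=1e-9)

def flag_flatlines(values: pd.Series, min_repeats: int = 10) -> pd.Series:
    """Flag runs where the sensor keeps reporting the exact same value."""
    run_id = (values != values.shift()).cumsum()
    return values.groupby(run_id).transform("size") >= min_repeats

data = pd.Series([2, 2, 2, 2, 2, 50_000_000, 2, 2, 2, 2, 2], dtype=float)
print(data[flag_spikes(data)])       # flags only the 50 million point
print(flag_flatlines(data).any())    # no run of 10+ identical values here
```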
Willem (18:39)
Okay,
so we have connectivity, data coming in. We’re contextualizing it, we’re giving it some meaning: it’s not just a data point, it fits somewhere. We ensure that it fits certain criteria in terms of quality, so we can start using it at scale, so we can start trusting the data we’re working with when we’re building solutions. Can we then store it? Okay.
David (18:41)
Yep, it’s good.
Yeah. Yes. Okay.
Finally, yeah. So our fourth capability is the data broker and data store. This is actually an interesting one, because a data broker, that’s something we hear about, for example, when we talk about MQTT brokers. They are some kind of platform, but they typically just hold the current value of the data.
So I’m able to retrieve the current value of the data, but not the historical data. That means we either need a broker and a store separately, or a broker and a store which are integrated. And here there is also, I would say, a very interesting trade-off to be made. If we take a look at the more OT-centric or OT-focused solution providers,
then in this store you will typically store your raw data as it flows in, and those time series stores will be optimized to work with time series data. So they will be highly optimized, highly compressed. It’s easy, Willem, to store data in a database, but it’s much, much harder to also retrieve data at scale out of those systems.
And this is also a bit where a lot of the more traditional cloud providers, well, they always say: okay, you can store all the data you want in my system. But if it becomes super expensive to have your data there, and even more expensive to extract your data every time you need it, then the business case can become very negative. So in this store, when you’re storing raw data, you need, I would say, an optimized way
Willem (20:33)
You can.
David (20:51)
to work with manufacturing data. That’s critical. More and more, we see concepts from the data lake or Delta Lake world flowing into our world, where they talk about bronze, silver, and gold stores. A bronze store typically holds the raw data. A silver store typically holds some kind of curated, validated, cleansed data, on which, I would say,
you can start building your reports. And then a gold store typically holds our prepared datasets, like, for example, a prepared KPI dataset or something like that. Why is that changing? In the old OT world, we only had the bronze store, the raw store. That’s fine. But if I have a BI report…
A BI report needs a tabular format somehow. So you need to have a tabular format to request that data and then ingest it into the report. So you also somehow need to prepare datasets.
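A small sketch of that bronze/silver/gold idea in pandas, assuming a single temperature tag: the bronze frame holds raw readings as they arrived, the silver frame filters out implausible values, and the gold frame is an hourly, tabular KPI dataset of the kind a BI report could ingest directly.

```python
import pandas as pd

# Bronze: raw readings exactly as they arrived from the source
bronze = pd.DataFrame({
    "timestamp": pd.date_range("2024-05-01 08:00", periods=240, freq="30s"),
    "asset": "line1/oven2",
    "temperature": 180.0,
})
bronze.loc[17, "temperature"] = 5e7          # a bad reading slips in

# Silver: cleansed, validated copy that reports are built on
silver = bronze[bronze["temperature"].between(-50, 500)].copy()

# Gold: prepared hourly dataset for BI tools
gold = (silver.set_index("timestamp")
              .groupby("asset")["temperature"]
              .resample("1h")
              .agg(["mean", "min", "max", "count"])
              .reset_index())
print(gold)
```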
Willem (22:04)
I think also, very concretely, you’re talking about reports: sometimes you would need some hourly data, for example. You could start sending raw sensor data and let that system do all the calculations needed to make the hourly summary they’re using for their report or calculations. Or you could have a system, maybe part of your platform, that’s specialized in this and delivers that hourly dataset.
David (22:32)
Yeah, absolutely. And do you want to go for a single source of truth? Do you want to copy the data or not? Those are all decisions you need to make, but they should be use-case-driven and not really technology-driven, in my opinion. So this is the store. This is really what’s central; that’s the thing all applications interact with. The tangible thing, yeah.
Willem (22:33)
Probably it’s easier on a platform. Wait for that.
The tangible thing, it’s very tangible. It’s where the data sits.
David (23:00)
It’s also, especially when it’s on servers, that you can see the servers holding your data, or you get the bills if it’s consumption-based. But there are a couple of things to add here. One very specific capability we need to add is the possibility to do edge calculations, edge analytics. Why is that?
Willem (23:06)
Especially in the cloud. If it’s cloud, you just see the bills.
David (23:28)
Obviously, you could build a report or a calculation somewhere further down your pipeline running on one of the hyperscalers. But there are many, many applications where you need your calculations to be running very close to the data sources. An example could be indeed the video feed I mentioned.
You might not want to store your high-frequency video data. Maybe you just want to analyze it using some computer vision algorithms at the edge, and you only want to store your calculated results. For example, if it’s something to detect an anomaly, then you maybe want to store the anomaly information and not the raw video feed.
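A minimal sketch of that edge pattern in plain Python: a high-rate feed is processed locally against rolling statistics, and only small anomaly events leave the device, while the raw stream is discarded. The simulated signal, window size, and threshold are all illustrative assumptions.

```python
import json
import random
import statistics

# Simulated high-rate feed (a stand-in for video or vibration data)
samples = [random.gauss(1.0, 0.05) for _ in range(10_000)]
samples[7_000] = 9.9                          # inject one anomaly

WINDOW, window, events = 500, [], []
for i, sample in enumerate(samples):
    window.append(sample)
    if len(window) > WINDOW:
        window.pop(0)
    if len(window) == WINDOW:
        mean = statistics.fmean(window)
        stdev = statistics.pstdev(window)
        if stdev > 0 and abs(sample - mean) > 4 * stdev:
            # Only this tiny event record leaves the edge device;
            # the raw stream itself is never stored or transmitted.
            events.append(json.dumps({"sample_no": i, "value": round(sample, 3)}))

print(f"{len(events)} event(s) kept out of {len(samples)} raw samples")
```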
Or maybe you also want to feed that calculation back to the control layer immediately; then it definitely needs to run on the edge. But there are so many other examples as well. So having these close-to-the-edge analytics capabilities available is our fifth capability. And then, on the application side, the user side, we have two more:
data sharing and data visualization. Let’s start with data visualization, our sixth one. Data visualization just means, you know, being able to quickly see the data you need to see, either in a trend or in a report. Making sure that it’s available for everybody in the organization, not just the data teams. You don’t want your data teams to be the bottleneck if your operators just want to see, I don’t know,
the temperature profile of the last 24 hours or something like that. And then our seventh and last capability is data sharing. Data sharing basically means open APIs and that type of thing, where all the other applications which interact with the platform are able to read and write whatever they need to read and write.
They can be on-prem, they can be in the cloud, but you need to manage data sharing as well, because from a platform point of view, the data sharing capability should also be part of the platform itself and of the team that is managing it.
Willem (26:03)
If you’re just keeping the data on the system and only using that system to create a couple of reports, I think you’re going to miss out on a lot, because there are tons of applications that want that data.
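As one possible shape for such a data sharing capability, here is a minimal read endpoint sketched with FastAPI; the asset paths, the in-memory store, and the API layout are hypothetical stand-ins for the platform’s real store and governance layer.

```python
from datetime import datetime
from fastapi import FastAPI, HTTPException

app = FastAPI(title="Data sharing API (sketch)")

# Stand-in for the platform's time series store
STORE = {
    "acme/ghent/bakery/line1/oven2/temperature": [
        {"timestamp": "2024-05-01T08:00:00", "value": 180.2},
        {"timestamp": "2024-05-01T09:00:00", "value": 181.0},
    ],
}

@app.get("/timeseries/{asset_path:path}")
def read_timeseries(asset_path: str, start: datetime, end: datetime):
    """Return the stored samples for one asset within a time window."""
    series = STORE.get(asset_path)
    if series is None:
        raise HTTPException(status_code=404, detail="unknown asset path")
    return [s for s in series
            if start <= datetime.fromisoformat(s["timestamp"]) <= end]
```

Any on-prem or cloud application could then pull data over plain HTTP, for example with `uvicorn sharing_api:app` serving the file above (assuming it is saved as sharing_api.py).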
David (26:10)
Yeah. Yeah.
So those are the seven capabilities, Willem.
Willem (26:17)
Okay,
so I’m going to pick a couple of those out to go a bit deeper, or to ask a few questions about them. The first one is number two, contextualization and data management. Is this what people mean when they’re talking about UNS? Or is it number four?
David (26:36)
Yeah, UNS, unified namespace. Very good point. So number four was the data broker, number two was data contextualization, number one was connectivity. Well, if we talk about a UNS, you’re talking about those three. A unified namespace is a concept, a very important one. It’s not a product. UNS is also not a capability on its own.
I deliberately didn’t add unified namespace as a capability to this map because it’s a concept. It’s a concept where you integrate different data sources into a central broker and you contextualize it. Can you do that, I don’t know, with a certain product? Yeah, you can do that with different products. But I think it is more about, well…
It’s not really about technology, actually. It’s about data management.
Willem (27:36)
The core
in essence, I mean, is the connectivity and the data brokerage, the sharing part of the data, I’d say. But at its core, it’s also about that namespace: giving a name to your tanks, giving context to them.
David (27:39)
Yeah.
Yeah.
Yep.
Yeah, and that’s why data management is so important, because I think there is enough technology available right now to connect to data sources. But it starts, for example, with: where does my single source of truth live? Will I be trying to change my tag names, to update my data, close to the source? I would say preferably, but…
The thing is, if you buy machines, those machines will typically come with a PLC. The PLC will be pre-programmed, and basically you just have to deal with whatever you received, right? So it’s not always possible to change tag names, or, if it were MQTT, to define your topic close to the source.
So somehow you still need that data management capability in the middle. And then the question becomes: who will be managing that? Who owns the data structure? Is it owned by local operations? Is it owned by some kind of central governance group? Will that create IT/OT-type conflicts? Or convergence; maybe it’s a catalyst to create convergence. Yeah, that’s the positive side.
Willem (29:05)
Or convergence? Maybe working together? Wait, working together?
David (29:12)
The positive way of thinking. But it is a big challenge. So that means that for me, unified namespace, again, it’s about organizing data. It’s about having this broker slash platform centrally. It’s about making sure that data management is, again, a very important topic, which is part of the discussion. But it is indeed not just one capability.
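Since a UNS is a convention rather than a product, even a few lines of code can make the naming discipline tangible. Here is a hypothetical sketch of an ISA-95-style topic builder in Python; the enterprise/site/area/line/cell levels and the names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssetPath:
    """An ISA-95-style location: the path itself carries the context."""
    enterprise: str
    site: str
    area: str
    line: str
    cell: str

    def topic(self, measurement: str) -> str:
        # One governed convention for every publisher and consumer
        return "/".join([self.enterprise, self.site, self.area,
                         self.line, self.cell, measurement])

oven2 = AssetPath("acme", "ghent", "bakery", "line1", "oven2")
print(oven2.topic("temperature"))  # acme/ghent/bakery/line1/oven2/temperature
```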
Willem (29:43)
For me, let’s say, I’m an engineer by background. It’s not as if I’ve worked for 20 years in tech and IT, working all the time with data. I struggled a lot with data management as a concept on its own. I mean, what happens if you don’t manage it? Do you maybe have a way to make it a bit more concrete for those who are struggling with the same thing? They hear the words, but you know…
It’s too virtual.
David (30:13)
Yeah, I would say, let’s be honest, it’s something a lot of people struggle with, lots of organizations struggle with. It’s also, you know: is data management data governance, or is data governance just part of data management? Do we now have people who are responsible to, I don’t know, fill in Excel sheets?
Willem (30:37)
I just know that very
often what happens is you have data teams coming in saying: hey, we’re going to do data management, I’m giving you new responsibilities. And afterwards, usually, not very much happens.
David (30:40)
Yeah.
Yeah.
No. And there is this interesting analogy I’d like to share here, Willem. Probably already 10 years or so ago, I was at a very big chemical facility, and my contact person over there took me to…
literally, to a shed. Not just a room; there was this shed somewhere outside. He said: David, I’m going to show you something cool. And so we entered that room, and there was this, I would even dare to say, beautiful scale model of the entire chemical plant, with all the pipes and the reactors
and the stairs: a perfectly crafted scale model. Dusty, I have to say, dusty because of the shed, but still perfectly crafted. And the size of the model was crazy. It was a really, really big shed. If you needed to walk around it, it would probably take a minute or so just to physically walk around the
Willem (31:49)
Walk around the thing.
David (32:12)
scale model. The interesting thing here is that when that plant was built, probably somewhere in the 60s or so, this was the way they worked. They made a scale model, and, to refer back to data management, what they did is they made sure that the scale model was a perfect representation of the plant as it was built at that moment.
But more importantly, somebody was also responsible for maintaining the scale model over time. Because, of course, you would make changes to the installation. Sometimes very small changes, I would say, while the plant was just running; sometimes really big changes during major shutdowns. And you know, at that point in time, the scale model…
it was basically the centerpiece of this plant. For operators to be trained, it’s a bit… the master CAD file. If you needed to make a change, you could go to the scale model and figure out: if I lift a certain piece of equipment, if I pick it up, does it actually fit through the installation, or can I drive with my crane
Willem (33:13)
It was your single source of truth, basically. It’s like the master CAD file.
David (33:37)
to a certain point, et cetera. So it was, yeah, basically an analog twin. Now, at a certain point in time, the person who was responsible for maintaining the analog twin probably retired or something like that. And at that point, the maintaining part, the keeping-it-up-to-date part, stopped.
So if I or somebody else were to go to that scale model today, we would see something which is totally useless. Because from the point in time where you decide not to update your data model, or in this case the analog twin, it becomes utterly useless: you don’t know whether you can trust it anymore. The trust factor is gone.
And if I compare that to today’s data management initiatives, it’s basically the same. You need people who take ownership of the data. You need to make sure that it always represents the actual state, because otherwise people won’t trust it. And finally: it never stops. It never ends.
Willem (35:00)
It’s not like a one-off: you build your IT solution, your data platform, you build your model of your plant. It’s also: how do I make sure that that thing stays up to date, so that it keeps its value?
David (35:16)
And you cannot have two different competing models. You might have different data models for different use cases, but they can’t compete with each other. It’s also very tricky: how do I make sure that there is this single source of truth? Because if you start having competing models, where model A contains different information than model B,
then, at a certain point in time, people will again start losing their trust in what they see. And that’s a big problem.
Willem (35:57)
Sticking to that data part, because that’s the essence of a data platform: I’m hearing a lot about data ops. How is that different from data management, or is it just a fancy new term?
David (36:09)
Yeah, it’s a fancy new term. I also use it from time to time. Good question. Maybe a short answer: data ops definitely comes from the data world; we’re now also adopting it, starting to use it in the OT world. Data ops comes from DevOps, which brings development and operations people together in IT. Data ops is more about bringing data scientists and data engineers together, working on a joint problem. It’s…
Willem (36:11)
Okay, next question.
David (36:38)
We prefer to use the term data platform because it’s a broader term. That’s it.
Willem (36:46)
Now, besides all those points: of course, when you’re going to set up your data platform, you want to make sure it works. What are the preconditions to start on something like that? Can you start from zero? Do you need to have some basic capabilities in place? And if you do, which ones specifically?
David (37:09)
Yep.
Yeah, absolutely. So you have some supporting capabilities, as we would call them. You know, you need to have a way to deal with cybersecurity. Super important. It needs to happen, period, full stop. Cybersecurity needs to be owned by one organization. There used to be a time, also when I was active in cybersecurity, where you had the OT variant and the IT variant.
Willem (37:30)
IT and OT, both.
David (37:44)
We’re way past that time. You need to have one clear owner for security, because hackers unfortunately do not stop at, I would say, the boundaries of the silo. So that means cybersecurity needs to be owned by IT, full stop. But obviously you need OT expertise there as well, because yes, there are still a lot of things that are very different in the OT world; but the ownership clearly sits with the chief security officer, for example.
Tightly linked to that is user management. You want to be able to have federated user management across these applications. I think that makes sense, but still. And then, really, really importantly, and maybe a big change for a lot of typical OT engineers, is your lifecycle management. We can learn so, so much from the people on the IT side about lifecycle management.
It’s a pain in the… you know what I mean. Lifecycle management means easy deployment: easy to deploy, easy to version, easy to monitor whatever I’m doing, easy to make updates. If there needs to be a software update, it preferably goes automatically, instead of going through these…
Willem (38:56)
yes.
David (39:09)
tremendously long cycles, while still taking into account some specific things which we need to adhere to in the OT world. Because yes, there are industries where regulations are extremely strict. Yes, you sometimes need to be able to work offline if your solution is deployed on a boat or something like that. There are some very,
I would say, specific and stringent things we need to account for. But let’s make sure that we bring over the IT way of thinking about deploying, versioning, merging changes, and all these types of fun IT things.
Willem (39:48)
I think a
big part there, and it’s also a technical thing, is making upgrades easier, smaller, more frequent. But on the other side, I think it’s also a mindset shift. Within IT, they’re used to it, they live through it: an application, a service, is constant work. It has a lifecycle, it needs to be constantly managed. Within operations, very often,
David (39:55)
Yep. Yep.
Willem (40:16)
the attitude is more like: I have installed it and I won’t touch it until it breaks or it’s completely run down. And then, when they come to the concept, they find out that it’s suddenly actually a lot of work. It’s complex. They cannot migrate anymore. So it’s also a mindset shift, not only a technical one.
David (40:30)
Yeah.
Yeah, I 100% agree. It’s also a feeling of being in control, because, well, I think we’re all raised in the OT world, in the operations world, with this idea that when we can physically touch something, when we can see it, when we can, I don’t know, change the Windows registry key ourselves, we are in control.
But it is so time-consuming. And that’s the same as with the 70% of time I mentioned at the beginning, the time you need to clean and integrate, et cetera. It’s the same here: if you have infrastructure which is easy to deploy, easy to update, and easy to change, then also from an infrastructure perspective you can create scale.
And then maybe one final note, because I can go on for hours and hours on this topic, sorry for that. One final note is the hybrid situation between working really at the edge, on-prem (and with on-prem I mean more of a central data-center type of thing), and in the cloud, be it public cloud or private cloud. And then especially also
Willem (41:37)
Okay.
David (42:01)
hybrid situations where you can basically combine all these different use cases into one single platform. I think that’s also really important.
Willem (42:10)
Okay, well, I’m going to stop you before you go on for hours and we lose all our listeners. So I’ll call it a wrap. This was another episode of the ITOT Insider podcast. We’ve been going through the industrial data platform capability map with you, David. Thanks a lot for sharing your insights and for the explanations. I’m looking forward to the series
David (42:13)
Sorry.
Willem (42:39)
based on this capability map. So, dear listener, if you enjoyed this conversation, don’t forget to subscribe to our website, itotinsider.com, to this podcast, or to our YouTube channel. And we’ll see each other very soon with the series of interviews. Okay, David, thanks a lot and see you soon. Bye.
David (42:39)
Yeah, yeah, absolutely.
Thank you, bye bye.