David (00:00)
Welcome, you’re listening to the IT/OT insider podcast. I’m David and I’m joined today by Anupam Gupta. He’s a co-founder and head of strategy at Celebal Technologies. Welcome Anupam.
Anupam Gupta (00:13)
Thanks David.
David (00:15)
So Anupam started as a developer and I’m sure he has many interesting stories to tell us about those years. And then he co-founded Celebal in 2015. They are, amongst others, an SAP, Microsoft and Databricks partner. And they use that knowledge to bridge the gap from IT into manufacturing. And that’s super interesting because I would say typically on this podcast, we talk about the other way around. We start from the shop floor and we look, I would say, towards IT. So a very interesting episode in front of us. And as usual, why don’t you start with your personal background story and that of Celebal.
Anupam Gupta (00:59)
Sure. So David, I did my engineering in India at IIT Bombay, which is one of the premier engineering institutes in India. I studied electrical engineering, which covers electronics and communications as well. I actually happen to be a second-generation electrical engineer. My dad is still a practicing engineer at this age. And he designs and builds power transformers, power and distribution transformers. So I’ve always had that legacy. I started my professional career as a developer at SAP, started working with SAP Labs in India, had some brief stints in Germany and the US as well during my days at SAP, and then moved on to consulting.
David (01:36)
Mm-hmm.
Anupam Gupta (01:50)
I worked for Deloitte Consulting for some time in the Asia Pacific Japan region and then moved on to the US. I’ve been in the US since 2003 and I co-founded a typical SAP consulting firm. I was part of that before meeting my current co-founder. Back in 2015 and 2016, we formally started Celebal Technologies.
David (02:20)
You hear that often, right? People who have an engineering or consulting background, and at a certain point in time they probably think there is something they can do better or in a different way. Was that also your story?
Anupam Gupta (02:38)
Yeah, absolutely. So when you’re part of building products, you also want to have that customer-facing experience. How do we implement these enterprise solutions for our customers? And that’s a different learning altogether. You get to learn about their business and how they are using this enterprise software and so on. And SAP is such a wide business software system. I call it the brain of the organization.
David (03:07)
Yeah.
Anupam Gupta (03:07)
Right?
Or any large ERP. Of course, SAP leads the pack there globally. And I tell my colleagues and customers that, you can replace every part of your body, but you cannot transplant the brain. Right? So in that sense, ERP is the heart and center of everything.
David (03:22)
If we look at it from a control systems perspective, we often say the same for a SCADA or DCS environment, which is the heart and the brain of operations. So what makes an ERP system the heart of an enterprise?
Anupam Gupta (03:51)
You see, SAP started in the early 70s. It was the first comprehensive large piece of software that could combine all aspects of an enterprise, right? Finance and internal accounting, controlling, all the logistics, sales and distribution and inventory aspects, the human resource aspect, then they added CRM later on and everything. And they all talked to each other seamlessly, right? So that’s the system for any large organization, particularly one which has manufacturing and logistics processes, right? To automate all their processes as well as serve as the core system of record, right? So the only thing that was outside of ERP in those days was the actual manufacturing, right? Which is the world that you come from, of manufacturing execution and then
David (04:46)
Yeah.
Anupam Gupta (04:49)
everything, but that also kind of relates to your ERP, right? Your plant maintenance and your maintenance orders and everything go into financial postings. And so that was the big win for ERP systems, and then you saw the whole mass implementation of ERPs across the globe, right? In almost 20-plus different industries, but it was always heavy on the logistics and manufacturing and product costing side. So you would find ERP and SAP more on the manufacturing side, all kinds of manufacturing, discrete and process. You’ll find them in oil and gas. You’ll find them in energy utilities, power and utilities. And when I say manufacturing, part of CPG and pharmaceuticals is also manufacturing. While we may classify them as separate industries, they are also core manufacturing for the most part.
So ERP is very relevant for those industries as well.
David (05:50)
In a typical stack, I would say we have a control system, and we have a manufacturing execution system environment on top of that in some way or form, less or more complex. And then I would say typically we see the MES system also as the gateway to the ERP environment. Is this also what you have seen in your past? Are there also MES-like features available in ERP systems, or should they be split?
Anupam Gupta (06:28)
So there have been offerings from the ERP vendors for MES capabilities, et cetera. But I have more often seen specialized MES systems outside of ERP, which would obviously integrate with ERP. It also deals with the fact that a lot of times you would acquire small subsidiaries or plants, and if you’re an oil and gas company, you would buy wells and things like that, right? So typically you would have separate software doing the MES, right? Which you may or may not consolidate. But if there are small ERPs, typically I’ve seen large companies consolidating those into one central ERP or at least the same software, you know, be it SAP or something else.
David (07:13)
Now when you co-founded your company about 10 years ago, was stepping into the manufacturing domain, was that already by design or was that more a bit of a logical consequence of the stuff you were doing on the ERP side?
Anupam Gupta (07:32)
So I was doing a lot of SAP consulting and even there my specialty was in some of the data and mobility concepts, right? So bringing data from ERP systems, be it SAP or other systems, and consolidating that into a single data warehouse, which used to be SAP BW for the most part. Occasionally we would have the likes of Teradata, et cetera, also. And one of the gaps that I saw in the industry is, you know, there is ERP, but there is a lot of non-ERP data also that is getting relevant. And that’s the phase around 2015, 2016, where enterprise cloud was getting popular. Although the journey started more from the infrastructure side of things, we were more focused on data and AI from day one. And I met my co-founder, his name is Aniruddh and he’s my co-founder and CEO as well. He came from the data science background. So I came from the traditional enterprise background and then…
David (08:31)
Yeah.
Anupam Gupta (08:32)
We married our skills. I knew about data, but I was not so much on the machine learning and data science side. And that’s the skill and experience he brought in. And we collaborated with that. And there was a gap. So when we talk about Celebal, we say we are at the right intersection of traditional enterprise and modern cloud innovation. So you have a lot of companies who have deep strength in the traditional enterprise systems like ERP, people who implement ERP.
And then you have a lot of companies who are good at data and AI and cloud and all the automation stuff. We are unique in a way that we bring these two worlds together. We can talk the ERP language, the industry language from energy, utilities, oil and gas, manufacturing, and IoT. But we also know how to use the cloud innovation, how to use a platform like Databricks to marry these two worlds.
David (09:25)
Yeah.
That triggers, I would say, an immediate follow-up question here. So obviously we talk a lot about data and what industrial data is, also on the blog and in other podcasts. We haven’t touched so much on the transactional side of the world yet. So from your perspective, what is unique in the data you see in the manufacturing industry for these types of applications compared to other industries, other verticals, or maybe even compared to, I would say, the more financial order type of data?
Anupam Gupta (09:59)
Great question. So as you probably saw, the data modernization for financial services and retail customers started earlier, and they have been on an accelerated path already. And the key difference between these industries and the core heavy industry or manufacturing is that the dematerialization of data has already happened in financial services or retail. You have the data. If you have an e-commerce website, even for the retailers, the data is already there, most of it. For financial services, the data is already there for most of it because they have started from a digital-first world, at least for the last couple of decades. So I use this phrase, that the dematerialization of data has already happened.
David (10:39)
Yeah?
Yeah. Yeah.
Anupam Gupta (11:00)
Whereas when you talk about industries like manufacturing and energy, you have access to some data at a transaction level, right? But consolidation of data into a central lakehouse platform like Databricks has just started to happen, right? So the data would be locked in some of your, you know, IoT systems or historians and so on, right? And then it would be locked in some legacy ERP systems. And you have systems that could report on ERP data, but that’s not good enough, right? You want to combine it with the systems that you mentioned, the manufacturing execution systems, different kinds of IoT sources, automation, et cetera, right? So you want to marry these two. And then there are other data points like your third-party logistics and then different data for demand sensing, et cetera, right? Now, modern cloud-based data platforms like Databricks give you that opportunity to consolidate all this data from all these different sources, right? And then get insights on top of it, do automation on top of it, right? There are also a few, I would say, geopolitical or economic factors that add to it, right? I see that most of the manufacturing shifted to either Asia or Latin America or some of the so-called developing countries. And the primary reason for that is cheap labor, low-cost labor. That was the main factor. Now, I see a trend where a lot of this manufacturing will come back to the Western world or to the developed world. If I talk about America, and here’s my thesis on this, you need a few key factors for manufacturing.
You need cheap land, you need low-cost labor, you need technology in today’s times, you need cheap energy, and you also need access to capital. Now the majority of capital markets are already there in the developed economies, particularly America. And you have all these factors, you have a lot of land, you have cheap energy, and obviously America is the leader in technology R&D. The factor that is missing is labor: the high cost of labor makes it a challenge to manufacture in America or Western European countries for that matter. Now my theory is, if maybe 50 to 60% of this labor gets replaced by automation and robotics, which we see happening already. You have companies like Tesla, then you have a lot of semiconductor manufacturing
David (13:36)
Yeah.
Anupam Gupta (13:52)
coming back to America, right? Riding on this innovation of robotics automation, right? And data is the central piece of it, if you look at it, right? Data which was historically locked in all these systems, right? Now that is coming out, right? Sometimes real time and sometimes, you know, you have batch data, et cetera, combining together to achieve all this automation, right? So these are the trends that are driving innovation
David (13:58)
Mm-hmm.
Yes.
Anupam Gupta (14:21)
and data modernization in these industries, some of which already happened in other industries. Another industry is the energy industry. Obviously, most of it cannot be outsourced. You have to have your power utilities in the country where you’re consuming the power. You have to have power generation. You also have to have the oil and gas exploration and refinery, et cetera, in these countries. So that will force.
David (14:36)
Yeah.
Anupam Gupta (14:50)
a lot of this automation, if only to save cost. And that’s driving this whole data modernization.
David (15:00)
That’s an interesting take, also about, I would say, these regional differences. That’s very interesting. Maybe there’s another topic which is related to this dematerialization of data, and that is data governance and data management. Something we also talk about quite often. We do that on purpose because I truly, truly, truly believe that we need to educate the world much more on data governance and data management in manufacturing. The example I always give, and this will definitely resonate with you, is that there is not a single ERP system without data management. Because if there is no data management on your ERP layer, then everybody would just be, I would say, booking materials and sending invoices, you know, however they like. However, on the shop floor, there is hardly any data management. People just create data points as they like. So how do you see this?
Let’s call it the convergence of ideas from the IT world and the way manufacturing shop floor systems work. What do you see happening in this domain? Do you see steps being taken? Do you see systems enabling this? Do you see companies changing the way they think about manufacturing data?
Anupam Gupta (16:42)
I see a lot of conversation on this topic. I have not seen systems of unified governance as yet. We are doing some of our own innovation to fill this gap. So yes, one is at a conceptual level. You must be familiar with this unified namespace concept. Right now, if you have your OT assets, you have that definition or hierarchy maintained separately in your ERP, and then you have it maintained in your MES systems or shop floor systems. You have it in your historians, and then you have it in your cloud data warehouse. Is it standardized? Probably not. Now, there are a lot of governance systems for IT assets. So today, if I build a data warehouse, and Celebal specializes in building
David (17:23)
Everywhere. No.
Anupam Gupta (17:34)
ERP, lakehouse and data warehouse on cloud, right? Getting data from SAP and other systems, right? We are one of the top partners for Databricks and Microsoft in that space, right? And probably one of the reasons Databricks invested in us, right? Which is also very unique for them, to invest in a services player, right? There is reasonably good governance for IT assets, right? All my fields, my tables, my views, the definitions, things like that, right?
Sometimes they are not as well synced with ERP. So Celebal has built those capabilities to sync SAP or ERP security credentials, or sometimes the glossary, et cetera, into Databricks Unity Catalog. What’s missing is the governance for OT assets. For historical reasons, first of all, the OT data was not even moving to most of the consolidated data warehouses. So that is changing now.
David (18:25)
Yeah.
Anupam Gupta (18:32)
So as that data starts coming, what Celebal is building on top of Databricks Unity Catalog and other systems is capabilities to have some kind of governance and categorization and cataloging of those assets as well. Now, as you rightly said, in ERP, of course, there is governance at the transactional level. You log into ERP, of course, there is role-based security, et cetera.
David (18:49)
Yeah.
Anupam Gupta (18:58)
But sometimes it may get lost as you bring the data out. So we are making sure that that thing gets replicated. But also that there is similar capability established for all the OT assets. You’re getting data from a set of sensors, from a plant: who has access to it, who should be looking at it, how do you marry it to the relevant ERP data? All that has to be governed as well.
David (19:05)
Mm-hmm.
Anupam Gupta (19:24)
And I won’t say this is an established practice, but I would say we are moving in that direction.
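To make the governance idea above concrete, here is a minimal Python sketch of a catalog that records each OT data point, the roles allowed to read it, and the ERP equipment record it belongs to. Everything in it (the `OtAsset` and `AssetCatalog` names, the tag paths, the `EQ-1007` equipment ID, the role names) is a hypothetical illustration, not Celebal's implementation and not the Databricks Unity Catalog API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OtAsset:
    """One governed OT data point (hypothetical model, for illustration only)."""
    tag_path: str                           # hierarchical name: site/line/equipment/signal
    erp_equipment_id: str                   # assumed link to the matching ERP equipment record
    allowed_roles: frozenset = frozenset()  # roles permitted to read this data point


class AssetCatalog:
    """In-memory catalog that registers assets and answers access questions."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        # Reject duplicates so each tag path has exactly one definition.
        if asset.tag_path in self._assets:
            raise ValueError(f"duplicate tag path: {asset.tag_path}")
        self._assets[asset.tag_path] = asset

    def can_read(self, role, tag_path):
        # Unknown assets are never readable.
        asset = self._assets.get(tag_path)
        return asset is not None and role in asset.allowed_roles

    def tags_for_equipment(self, erp_equipment_id):
        # All OT tags linked to one ERP equipment record, in sorted order.
        return sorted(
            a.tag_path
            for a in self._assets.values()
            if a.erp_equipment_id == erp_equipment_id
        )


catalog = AssetCatalog()
catalog.register(OtAsset(
    "plant1/line2/pump7/temperature", "EQ-1007",
    frozenset({"maintenance", "process_engineer"}),
))
catalog.register(OtAsset(
    "plant1/line2/pump7/vibration", "EQ-1007",
    frozenset({"maintenance"}),
))

print(catalog.can_read("maintenance", "plant1/line2/pump7/vibration"))
print(catalog.can_read("finance", "plant1/line2/pump7/vibration"))
print(catalog.tags_for_equipment("EQ-1007"))
```

The point of the sketch is only that "who has access" and "which ERP record does this sensor belong to" become explicit, queryable attributes of each OT asset rather than tribal knowledge.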
David (19:33)
Would that then be the single source of truth? It’s not really a watchdog, but would it be a single source of truth sitting on top of all these different systems?
Anupam Gupta (19:49)
Yeah, actually, that’s the idea, at least from a data insights point of view. You may still log into ERP for different things. But as far as your consolidated lakehouse is concerned, this could be your single source of truth when it comes to getting business insights, maybe driving some automation or your data science and machine learning experimentation, building models, and also consuming all kinds of analytics data. Yes.
We hope to see this as the single source. Of course, this is not the system of origin. I’m not saying this is replacing ERP or the OT systems, but this becomes the truth for most of your business users to consume information and to get insights.
David (20:22)
Yeah. Mm-hmm.
Hmm.
Yeah, but it’s interesting because there is no way we’re gonna replace these systems. It’s as you said in the beginning, you’re not gonna replace the brains of your enterprise and you’re also not gonna replace the brains of your shop floor just because you want to have some kind of an integrated data model. You need to, I would say, work with what you have and somehow extend it. I have some other questions about, I would say, the machine learning stuff you mentioned, and I’ll get back to those, but you also mentioned a lakehouse. So maybe it’s worthwhile to define it a bit more. I think most people are now aware of a data lake and a data warehouse. But a lakehouse, while maybe known already in the data world, is not yet that known among, I would say, people who are working in the IT/OT domain. So what is a lakehouse exactly?
Anupam Gupta (21:42)
So see, I think everybody’s familiar with the concept of a data warehouse, where you get data from most of your transactional systems, or definitely from the ERP systems, and then do basic analytics on top of it. Typical data warehouses in the past have lacked capabilities for different varieties of data, or velocity, all the V’s of the world, right? And the data lake solved that problem, right? You have an agile data store that lets you dump data from all different sources without, on day one, forcing on you the governance and discipline that is more driven by IT, right? So it gave a lot of flexibility. At the cost that, hey, the data is not easily usable. And people started using words like data swamp and things like that, right?
Of course, the performance was lacking, so you can’t really run high performance analytics and so on. But you could use it as a general purpose data platform to get data and dump it from all different sources that anybody in the organization can consume and do it in a distributed manner and a relatively low cost because you’re doing it on commodity hardware and things like that. And now you’re doing it on cloud as well. The concept of Lakehouse was very interesting. It combined these two worlds.
David (22:58)
Yep.
Anupam Gupta (23:05)
So you would still have sort of schema-on-read kind of capabilities where, to simplify, without a lot of processing of data with the help of IT, you could still start accessing the data in a more agile fashion. So it combined the best of both worlds. And then the data lends itself to analytics and machine learning. So it also has capabilities for building the models, deploying them, all the MLOps capabilities, in addition to DataOps, et cetera. And now you can get the streaming data as well, the OT data, sometimes unstructured data, right? You come from the manufacturing world, so there are so many of these safety manuals and other unstructured documents, contracts, et cetera, that are dumped, but you can never get insights from some of it, right? So all that can be stored in a lakehouse. And now you have these wonderful GenAI capabilities to
David (23:57)
Mm-hmm.
Anupam Gupta (24:05)
mine that wealth of information and combine all these data points.
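One way to picture the "land the raw data first, apply structure when you read it" idea from this answer is the small Python sketch below. It is only an illustration using the standard library: the sample records, field names and the `read_with_schema` helper are made up for this example, and a real lakehouse platform would do this with its own table formats and query engine rather than a list of JSON strings.

```python
import json

# Raw records landed as-is from different sources, as a data lake would
# store them (made-up sample data).
RAW_LINES = [
    '{"source": "erp", "order_id": "4500012", "amount": "1250.50"}',
    '{"source": "historian", "tag": "pump7/temperature", "value": 71.3}',
    '{"source": "erp", "order_id": "4500013", "amount": "99"}',
]

# The schema is applied at read time: field name -> converter function.
ERP_ORDER_SCHEMA = {"order_id": str, "amount": float}


def read_with_schema(lines, source, schema):
    """Parse raw lines, keep one source, and coerce its fields to the schema."""
    rows = []
    for line in lines:
        record = json.loads(line)
        if record.get("source") != source:
            continue  # other sources stay untouched in the raw store
        rows.append({name: cast(record[name]) for name, cast in schema.items()})
    return rows


orders = read_with_schema(RAW_LINES, "erp", ERP_ORDER_SCHEMA)
total = sum(row["amount"] for row in orders)
print(orders)
print(total)
```

The historian record sits in the same store without anyone having modeled it up front, and the ERP orders still come out typed and ready for analytics, which is the combination the lakehouse concept is after.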
David (24:09)
I was going to ask the GenAI question, but you beat me to it. What are the GenAI applications you are seeing getting, I would say, developed and piloted right now?
Anupam Gupta (24:25)
You see, since we are on the GenAI topic, I know there is a lot of hype, and there are genuine use cases. A couple of use cases that the early phase of GenAI was serving are: you have chatbots that you can query, you have platforms within Databricks like Genie, et cetera, but also things like document mining. And we had been doing these use cases earlier as well, because we have access to all the business glossary and we have insights into what ERP data means. So obviously we are in a good position to convert that into a business semantic layer and get these questions answered on top of that. But these use cases were already being done. Of course with GenAI, the capability and the intelligence of data mining and then asking these questions, the ability to convert an English question into a SQL statement, have gotten much better.
David (25:23)
Mm-hmm.
Anupam Gupta (25:24)
So that’s one part of it. Of course, there are innovations in vision AI, which have been enhanced by some GenAI techniques. Now the concept of agents is the next wave of GenAI, where we are looking for hyperautomation, where you may or may not have a human in the loop, but you could have complex processes, like a 20-step RPA process, which we used to have. You could have much more intelligence within that process, right? I see a lot of interest, and those are the kind of use cases that we are doing. I also see some resistance to taking the human out of the loop, right? As large enterprises, we are not comfortable having machines take actions on our behalf, right? So maybe it’s a combination of, you know, having this automation and having some humans in the loop as well. So you would get the notification, and it’s up to you to approve or sort of reject
David (26:05)
Yeah.
Anupam Gupta (26:23)
a request, right? So let’s say you did the analysis: okay, do I need to approve this purchase requisition or not? Right? After the whole AI process, the agentic AI process, you then make a decision, right? That, okay, I’m the one making the final call.
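The pattern described here, where automation prepares a recommendation and a person makes the final call, can be sketched in a few lines of Python. This is a hedged illustration only: `analyze_request` is a stand-in for whatever automated or agentic analysis runs upstream (here just a budget check), and the request ID, amounts and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    request_id: str
    action: str      # "approve" or "reject", as proposed by the automated analysis
    rationale: str


def analyze_request(request_id, amount, budget_left):
    """Stand-in for the automated analysis; here it is only a budget check."""
    if amount <= budget_left:
        return Recommendation(request_id, "approve", "within remaining budget")
    return Recommendation(request_id, "reject", "exceeds remaining budget")


def finalize(recommendation, human_decision):
    """Record the outcome. The human decision always wins over the recommendation."""
    if human_decision not in ("approve", "reject"):
        raise ValueError("human_decision must be 'approve' or 'reject'")
    return {
        "request_id": recommendation.request_id,
        "recommended": recommendation.action,
        "final": human_decision,
        "overridden": human_decision != recommendation.action,
    }


rec = analyze_request("PR-001", amount=800.0, budget_left=1000.0)
result = finalize(rec, human_decision="reject")
print(rec)
print(result)
```

Keeping the recommendation and the final decision as separate fields also leaves an audit trail of how often people override the automation, which is one input for deciding later whether a step can run without a human.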
David (26:23)
Mm-hmm. Mm-hmm.
Isn’t that also linked to trust? Trust in results, especially for humans, is typically linked to having an understanding of what has happened. So if there is a certain problem and there is a black box and you get a certain answer, then humans have the tendency to go like, yeah, maybe it is right, maybe for this one time, but I don’t know what is happening in the black box, so I don’t trust it. And then you also have, of course, the industries where validation is a point, but that’s yet something else. How do you do that, especially for these agents? Because my understanding is that an agent has, I would say, a specific, smaller, contained task. And I would say they talk to each other. So they progress from agent to agent. How do you create trust? How do you make sure that people like the ones you just mentioned go like, yeah, we can take the human out of the loop here?
Anupam Gupta (28:03)
See, are getting better, right? And even if it’s outside of AI agents, if you look at large organizations, Sometimes an individual is not empowered to take the final call.
David (28:18)
Mm-hmm. Yeah.
Anupam Gupta (28:18)
It gets to the manager and the manager. So there are two points to it. One, you mentioned the black box nature of AI, but I think that’s not so much a factor. It’s the concern of losing control and the concern that AI may not have factored in a certain scenario. So we are not comfortable taking the human out of the loop completely. It may or may not happen
David (28:34)
Mm-hmm.
Yep.
Mm-hmm.
Anupam Gupta (28:45)
in the future. It may happen in the future, is what I feel. It’s like a self-driving car, where it takes time for you to take the human out of the loop. We have seen the journey of fully autonomous cars. And there are those stages of partial autonomy or full autonomy. So I think we are in that stage. The other thing I’m a bit critical about with applied
David (28:47)
Mm-hmm.
Yeah.
Anupam Gupta (29:14)
GenAI is that we sometimes tend to convert structured data into unstructured data just to apply GenAI. I find that stupid. Because see, structured data or any kind of machine job, it’s already accurate. It’s already pretty accurate at a transaction level. One other thing I joke about is, GenAI is about getting intelligent humans to build smart models so that they can talk to dumb humans. Because if you look at everything which is structured, it’s already driven by some API integration and things like that. It’s not very abstract. Now, where it applies very well is, of course, when you have this text mining or when you have a conversational scenario where you just want the comfort of
David (29:56)
Yeah.
Anupam Gupta (30:12)
talking in English and executing a command. Obviously, GenAI plays a very, I’m talking about the large language model part of GenAI. And of course, there are much better applications in the consumer space, right? But yeah, it does help in automation, simplification, because enterprise users also don’t want to talk to too many systems, right? So if you put that
David (30:37)
Mm-hmm. Mm-hmm. Yeah.
Anupam Gupta (30:39)
LLM layer on top of it and the agentic framework on top of it, right? You could abstract a lot of that, right? Let’s say there are multiple ERPs, there are multiple CRM, logistic systems, whatever, right? And if you have an interface that talks to all of this and abstracts it for the user, right? Just like search, when consumer search came in the form of Google and all these tools, right? That abstracted this whole, you know, so I call it like the transition from Yahoo directory
David (31:02)
Yeah.
Anupam Gupta (31:08)
to Google. I don’t know if you are old enough to remember Yahoo! directories, right? Where you are going through this step by step until you get to what you’re looking for, right? As opposed to just asking a question, right?
David (31:10)
Yes, yes, yes, yes.
Yeah.
I even, though I was still very young, was the moderator or admin for one of these directory pages. Think about that now, with the world we’re living in right now. I think I was the moderator of a page called Internets or something like that, or developer or something like that. So I listed the websites where you could find developer resources. It’s totally unimaginable that these types of directories would still exist today.
Anupam Gupta (32:01)
Yeah, so now you have this search that you can bypass all those instructions. And my hope is, and it’s a loose analogy, that these agentic frameworks would bypass the complexity of all these multiple transactional systems, multiple steps. And if you’re familiar with ERP, there are certain processes where there is a 20-step process before you get to doing something. Can that be automated? It was partially being automated.
David (32:06)
Yeah.
Anupam Gupta (32:30)
in RPA. Now that automation gets more powerful. It can also abstract certain jobs which may not be super structured. For example, if part of it requires you to go and check the profile of a potential customer before approving a credit or something, that’s where agentic AI really adds value.
David (32:52)
We’re getting close to the end of this episode, but there is one thing I’d like to ask. So we talked about regional differences. We didn’t touch on, I would say, the difference between smaller companies and multinationals. One of the complaints I hear quite often is people from a smaller company going like, yeah, but they have 25 data scientists and 20 data engineers. Of course they can build such a thing. What do you think should happen to make these types of technologies available for everybody?
Anupam Gupta (33:36)
That’s a great question. See, as it is, we are seeing some democratization of this whole data scientist role, right? So you have citizen data scientists, right? You have to take it in the right context, right? If you’re trying to use certain common language models, right, which are kind of abstracting SQL, asking a question, et cetera, right? Those things are getting more and more automated. Like you see how Genie is available as part of the Databricks lakehouse and so on, right? But yeah, if you’re looking to build more custom, complex models in the context of your organization, there is some knowledge and expertise needed, right? That’s where it may be a challenge for an organization that doesn’t have a lot of technology expertise in-house. And that’s where partners like us sometimes come in. We work with mid-size companies as well as large enterprises. So yeah, absolutely.
David (34:40)
Select the right partner, and probably also really focus on domain knowledge. I think even the mid-sized companies, if they combine the right domain knowledge with the right partner, then still magical things can happen.
Anupam Gupta (34:55)
Absolutely. Absolutely.
David (34:57)
Alright, alright, super. Thank you so much for joining me. I think this is a perfect time to wrap up this episode of the IT/OT insider podcast. We also have a separate episode with Databricks. So I also advise our listeners, if you haven’t done that yet, to look that up on your favorite channel. Yeah, thank you for joining me.
Anupam Gupta (35:24)
Thanks for having me and I look forward to seeing you at the Hannover Messe event.
David (35:29)
Absolutely, absolutely.
And also to our listeners, thank you for tuning in again. If you enjoyed the conversation, don’t forget to subscribe at itotinsider.com, leave us a rating and see you next time when we bring you more insights on bridging IT and OT. Until then, take care. Bye bye.