Transcript: Storage as the Data Foundation for Enterprise AI | Garima Kapoor, Co-founder & Co-CEO of MinIO
In this episode of The Startup Project, host Nataraj Sindam interviews Garima Kapoor, Co-founder and Co-CEO of MinIO. They discuss MinIO's founding thesis that the bulk of data would be produced outside public cloud, how the company grew virally through open source, why storage is now the biggest bottleneck for GPU utilization in AI workloads, how MinIO mandated AI tools across its engineering team, and why Garima believes the value in software has permanently shifted from code to customers and market access.
2026-05-05
Host: Hello everyone, welcome to Startup Project. Today my guest is Garima Kapoor. She's the co-founder and co-CEO of MinIO. A lot of people listening might not be familiar with MinIO if they're not developers or aren't building applications on object storage or unstructured storage in general, but most people who know the S3 protocol, Amazon S3, or Azure Blob will probably be familiar with MinIO.
But this is going to be an interesting conversation about the infrastructure layer, about storage, about how storage is being used in AI. So hopefully it will give you some value out of it. Before that, make sure that you subscribe to Startup Project on YouTube or wherever you're listening to it. With that, Garima, welcome to the show.
Guest: Thank you for having me, Nataraj.
Host: So I want to talk about MinIO and how it started, because a lot of people don't go and start a competitive product to Amazon and Azure just like that. So some thought must have gone into deciding, we should start an S3 storage product. What was the initial motivation for that?
Guest: Ha, boy. You know, founding a company is never easy. And one of the things is that it really needs to come from a deeply personal place if you're starting something from scratch. So for me personally, I wanted to do a startup. And my background is not an engineering background. So you can imagine in terms of, you know, coming into the tech area overall.
That's one of the good things about being in the Bay Area, that you're always plugged into technology. So you're never out of it. So we started the company at the end of 2014. And it just started with a simple idea: we knew that data is something that is going to grow exponentially. And that's the problem that we needed to go solve. And at a very, very high level, when it comes to data, you can only do two things with it. You can either store it at a massive scale, or you can derive insights from it from a compute standpoint. So for us, we knew that AWS had convinced the world that S3 object storage was the right platform to bring all your unstructured or semi-structured data to at scale, but there was nothing outside of AWS that was really catering to the demand. And we knew that the bulk of the data is going to be produced outside of public cloud. So AWS started with "bring all the data to us," and we were like, no, we will go wherever the data is produced and we will give them the cloud environment that they need, what AWS was giving them. So that was the foundation and starting point for MinIO. It was just based on the simple reasoning that the bulk of the data is going to be produced outside of public cloud and there's going to be a requirement that we need to be closer to the data.
Host: So today, where do you stand? Is it most of the business on cloud or on premises? What does the business look like today?
Guest: So we are software-defined from a technology standpoint. You can deploy us wherever you need to be: within public clouds, within NeoClouds, within private cloud environments, edge cloud environments. It's a single static binary that you can download onto the underlying infrastructure. So the technology is simple. Now from the customer perspective, where they find our value is when they deploy us on a massive scale of data. And that is usually in a hybrid cloud environment where they've built massive data lakes or massive AI-scale infrastructure and they're bringing all their applications on top of us. So that is where they usually get the most value out of us. And that's where the bulk of our large-scale deployments are as well.
Host: Does MinIO also play a role in migrating on-premises workloads to the cloud?
Guest: So data movement is... let me just take a couple of steps back. As infrastructure becomes more composable and microservices-based, you need to keep everything super portable from an application standpoint. And our belief has always been that the underlying infrastructure needs to look the same regardless of where you are, whether that's public cloud or private cloud. Because if you look at why public cloud became so successful, it was because enterprise IT never looked like public cloud, right? Enterprises were stuck in the legacy world, and developers needed to build and iterate on their applications faster. And cloud gave them that environment when it came to microservices, quick iteration in building their applications, and quick access to resources. So that's where the productivity gain was so high. That's where public cloud really took off.
But as you know, the bulk of the data is still outside of public cloud, right? And as enterprises evolve, they need to make their private cloud environments look closer to public cloud environments and make sure the applications remain portable no matter where they persist at the end of the day. That is the end goal that they are working towards. So data migration is an inherent part of it. What we recommend to our customers is: don't move the data, because data has gravity. It is extremely hard to move petabytes and petabytes of data from one place to another. You lose a lot of money there. It's more important to make your applications portable and build them the right way so that they can be wherever they need to be, whether on private cloud or on public cloud. That's why we really promote industry standards and open formats so that applications are not stuck anywhere. Because if you free the applications, then you also free yourself from the data movement problem overall.
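The portability point above can be sketched in code. This is only a minimal illustration under stated assumptions, not MinIO's or AWS's SDK: the environment variable names (`OBJECT_STORE_ENDPOINT`, `OBJECT_STORE_REGION`), the `resolve_object_store` and `object_url` helpers, and the example hostname are all hypothetical. The idea is that the application reads a single S3-compatible endpoint from configuration, so the same code can address a public cloud or a private deployment without any change to the application itself.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ObjectStoreConfig:
    """Connection settings for any S3-compatible object store (illustrative)."""
    endpoint_url: Optional[str]  # None means "fall back to the public AWS endpoint"
    region: str
    bucket: str


def resolve_object_store(env: Mapping[str, str], bucket: str) -> ObjectStoreConfig:
    """Build storage settings from the environment.

    The variable names are hypothetical; any configuration source would do.
    """
    endpoint = env.get("OBJECT_STORE_ENDPOINT") or None
    region = env.get("OBJECT_STORE_REGION", "us-east-1")
    return ObjectStoreConfig(endpoint_url=endpoint, region=region, bucket=bucket)


def object_url(cfg: ObjectStoreConfig, key: str) -> str:
    """Path-style address of an object; only the endpoint differs per environment."""
    base = cfg.endpoint_url or f"https://s3.{cfg.region}.amazonaws.com"
    return f"{base.rstrip('/')}/{cfg.bucket}/{key}"


# Same application code, two environments: only the configuration changes.
private = resolve_object_store(
    {"OBJECT_STORE_ENDPOINT": "https://storage.internal.example:9000"}, "datalake"
)
public = resolve_object_store({}, "datalake")

print(object_url(private, "logs/01.parquet"))
# https://storage.internal.example:9000/datalake/logs/01.parquet
print(object_url(public, "logs/01.parquet"))
# https://s3.us-east-1.amazonaws.com/datalake/logs/01.parquet
```

Because the endpoint is data rather than code, pointing the application at a different environment is a configuration change, which is the sense in which freeing the application avoids moving the data.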
Host: Can you talk a little bit about that initial journey of finding product-market fit? Who were the initial customers that sort of adopted MinIO and how did that happen?
Guest: So we started with open source and we grew virally in the developer community, whoever wanted to build S3 compatible applications. So as you already know, Object Store has a very horizontal use case when it comes to deployment. And that's how public cloud is also built. The foundation is Object Store and then all the application ecosystem gets built around it.
So even from a MinIO usage perspective, we were seeing ourselves in multiple use cases, from edge to private to public cloud. Developers were doing all kinds of things in terms of building and deploying applications. So for us, it was easier to find product-market fit just because we were an S3-compatible object store. That's how we started. From a customer journey standpoint, it was a learning process in terms of finding where our sweet spot really lies. Because in the enterprise world, object store has always been the cheap-and-deep kind of tier, which is not how public cloud treats it. Public cloud always treated object store as a first-class citizen, whereas in the enterprise world, the mindset was different. So there was a journey that enterprises had to take in terms of realizing that object store is the primary storage for building high-performance application workloads. So we had to be patient with the enterprises in that journey as they realized where MinIO can really help them become modern and more cloud native in their architecture and infrastructure. So that, I feel, was a journey, but in terms of developer adoption, the fit was right from the get-go because there was just so much demand for S3-compatible object storage from an application-building standpoint.
Host: How are you acquiring customers today? Is it more of a top-down sales team approach, or is it bottom-up because you still have a lot of open source and a lot of developer mind share? What does typical customer acquisition look like today?
Guest: Yeah, it's a mix because we had been open source for the longest time. So we do have that brand recognition and that credibility in the market. Developers love our product and they use it all day long. A couple of years back, we started investing in our outbound go-to-market strategy. That's where we're building a true enterprise sales team. And so we are trying to bridge both the bottom-up and the top-down approach. I would say you need to do both. You cannot just rely on one motion because, as a company, you need to do everything that you can to accelerate sales. So that's what we are doing.
Host: I mean, S3 as a protocol was primarily started by Amazon, I think. And then Amazon and Microsoft, Microsoft especially, are pretty well known for having large sales ecosystems, right? How do you compete in a space where these three big giants take all the mind share? Because they're not just pitching object storage, they're pitching different databases, platform services, AI services, everything, right? How do you compete with these three big giants?
Guest: Yeah, you know, it's very interesting because the market is much bigger if you look at it, and the part of the market that is growing fastest is on the storage side. And if you double-click on that, another piece which is growing fast is the private cloud environment side of things. So I would not say it is one or the other. Any enterprise that you talk to right now is looking at different clouds together. It's not like a customer only has AWS or only has Azure. They have multiple environments: AWS, Azure, as well as their private cloud environment. FSIs are a great example of that. So it really comes down to what the customer is trying to accomplish. What we always tell them is that AWS, Azure, and Google Cloud are great for burst workloads. Go give your developers those wings. But if you have persistent workloads that are going to require scale, that are going to require constant gets and puts, that is where the cost really starts adding up. And that's where you need to make conscious decisions about where those workloads need to be. And I think as long as the CIOs have that clarity, it's very easy to decide which workload will persist in which environment. And like I said, as long as there is thoughtfulness in making sure applications are portable across different environments, and standardization on industry-standard APIs and open formats, the IT teams should have the leverage to be wherever they want.
Host: How did AI change your point of view on what's going to happen with storage as a business? Once the whole AI and ChatGPT movement happened, what happened with storage as a market?
Guest: I think we couldn't have asked for better timing for AI. One of the most important things is just the extent of the data that is getting generated and the extent of demand that we are seeing to get these training and inferencing workloads off the ground. And I think that has been a huge accelerant. And if you've been following NVIDIA, they have been saying that since 2022 a lot of the focus was on the compute nodes: how many GPUs are getting sold, where the investments are happening on the NeoCloud front. But now, this year, they have made an announcement that the focus needs to come to storage because data is the biggest bottleneck to enabling the GPUs and the compute nodes to operate fully. So that is where the bulk of the investment is now coming back to storage: how to make storage best suited for AI.
MinIO is in a unique position because of the time we started. We don't have any kind of legacy code base. We don't have the legacy baggage on our shoulders. So we are able to move a lot faster. And because we built in a cloud-first way and AI was born in the cloud, we are a natural fit for all the training and inferencing workloads. And AI is all about unstructured data, so we couldn't have asked for a better time.
Host: And like, how's the usage penetration been from customers? I think there's different types of customers, right? There's like the big clouds or like the big labs which are doing inferencing and training. And then there's a whole category of inference companies which are hosting the models. And then there are the customers already in your object store who now realize they have all this unstructured data they can gain intelligence out of. Can you talk about that latter part — how are customers using the existing data that they have?
Guest: Yeah, I think that's a great question because you really need to view the customers from a little bit of a different lens. One is coming from major labs and NeoClouds, because they are doing inferencing at scale, training at scale is happening. So those are a little bit unique use cases because the scale is also very different from what you would see in enterprise space. Now, coming to enterprise customers, I think there is a lot of excitement in terms of adoption, but also there is confusion because enterprise AI applications are not ready yet. And that is where, you know, in the next wave of AI, that's what I think the focus is shifting towards — how to get the enterprise AI applications right so that they can really get to the value of it.
What we are seeing from customer standpoint is that enterprises usually have had bulk of structured data or log data. That's where the bulk of the data is concentrated. And that's where even back in the day when big data was born, that's where the bulk of the concentration of data actually came through. Now, from big data standpoint, they've transitioned over to modern data lakes, everything coming on disaggregation from object to application standpoint. And then now they are planning to use those same environments to store the models, store the checkpoints of the model, and enable the agents to autonomously come on top of the same data lake or hybrid lake infrastructure. So that's the journey that we are seeing from customer standpoint.
Host: How are NeoClouds changing the ecosystem? Because I think one phenomenon that I was seeing within storage is that a lot of NeoClouds are adopting VAST Data, right? For different purposes and different storage. I think there's some sort of reinvention happening. Because the three clouds have a way of doing things and companies like yours have been there for a while doing things in a certain format. Are NeoClouds just reinventing what was done before, or what is your experience with what NeoClouds are actually doing?
Guest: So NeoClouds are specific GPU clouds, if I might say. And if you see, there is also market making that has happened, which is what you would do if you were someone like NVIDIA and you had so much money. You would be investing in forward-looking things. And that is where a lot of the things from NeoClouds came into being. But at the end of the day, these are GPU clouds, and they're very specifically built for inferencing workloads. So that is what is going on in the market. Like I said, bulk of the focus till now had been on the compute side. And a lot of these NVIDIA reference architectures just get blindly adopted with these NeoClouds.
NVIDIA started with file systems first, and that is where those reference architectures got adopted, even with the NeoClouds: file systems were where they needed to have their storage environments. But as the scale of the data has increased so much, and there is a lot more education, you see that OpenAI and Anthropic are all built object first. Even NVIDIA recognizes that they need to go object first. And that's why this year, all the announcements that you would have seen at GTC were on object store for their STX Blueprint. Think of the STX Blueprint as the equivalent of DGX. DGX was for compute. STX is for storage. So NVIDIA is now also consolidating its reference architectures on object store to power these GPUs at maximum throughput, because object stores are the ones that can scale, and they do not have the complexity of file systems. And more importantly, most of the applications are built for object first, rather than retrofitted for file systems. So that's what we are seeing: NeoClouds are just beginning to adopt those reference architectures as NVIDIA starts to promote them. It's coming from NVIDIA; they are educating the NeoCloud market overall on the right architectures to adopt.
Host: Yeah, the investment in AI, I think there are different numbers depending on how you look at it. I think data center investment doubled. How much of that is translating into storage demand, like specifically year-over-year object storage demand? Like is that 50% year-over-year? What are you seeing in terms of how much that is translating down to the storage layer?
Guest: I think you have to see that the market is still in its infancy because applications are not mature, right? We are only now beginning to talk about physical AI. And think about it: a year from now, we have Optimus released and that changes the game altogether, right? And then Figure AI releases their stuff. Things are moving so fast. I would still say that all these numbers are minuscule compared to what you're going to see in the next couple of years. Because think about everything. Right now, your phone is an AI device. Your car, if you drive a Tesla, is a compute device sitting right there. But that is going to be everything around you; everything is going to be turned into an AI device at the end of the day. And think about the extent of the data that is going to be generated. All that data has to be stored. And all that data has to be processed. And that is where you will start seeing really massive numbers. So right now we are still in the infancy.
Host: AI coding has become a thing. When I was a developer a couple of years ago, I always used to work with a senior engineer. It was like, if you have a good relationship with a senior engineer, that's the easiest way to learn things. And you always had this senior most engineer in a team who you were really afraid of asking too many questions in a day. But now everyone has 10 senior engineers supporting them and that's sort of like really changed the game for what developers can do. My question is — did that change how fast you deliver products? Did it improve the pace of delivery? What is the reality versus hype? Because you're running a company, so how much of that is translating in your product delivery?
Guest: It is phenomenal. I mean, the pace has grown so much. One of the things that we have told everyone in the engineering team is that they have to use Claude. They have to use Anthropic. Use whatever AI tool you need to, whether it is Codex or Anthropic, whatever helps you get the work done. But you need to use AI in your daily workflow. There are no excuses for that. And we had to get over that hump of engineers signing up and being open to using AI, because early on a lot of senior engineers were of the mindset "I can do this better than AI." But I think now we are way past that phase, especially given how fast things have evolved in the last two or three months and how much better they have gotten. So I think everyone sees the value of it. And we are seeing it firsthand. The pace of innovation that can happen is unbelievable.
And it goes both ways, right? From customer support, because we are a deeply technical product, right? And we are only a software-defined solution. So we interact with underlying hardware, we interact with the application. So the bugs can come from anywhere and we are throwing the error. So from the customer standpoint, we need to help them solve the problems. So Claude has made life so easy, not only in terms of debugging, but also the real code that gets into the product itself. Now, I do believe like 50% of the people are mostly reviewing the code that Claude is producing. We have no handwritten code. Go with AI, help translate that, review that code, and then push it forward. So that's what our guidance is. And we are seeing phenomenal results. It's unbelievable, unbelievable pace.
Host: Are there any other areas in running the company where AI has drastically changed things? Coding is an obvious one from an engineering perspective, but are there any interesting solutions or products that you created internally, deploying the power of AI to run the organization better?
Guest: No, it's across the board. Of course, engineering is the most obvious one, as you said, from coding perspective. But even if you see the kind of dashboards that can be created to just manage your data — whether it is your marketing data or whether it is managing your Salesforce environment — it's quite amazing how much work can get accomplished in such a small time. So we have our own dashboards built out, now very customized to our own marketing data. That's how we are running the teams now. And that goes across the board — sales, marketing, product, everywhere.
Host: Does it reduce your company's dependence on other SaaS tools? Because there's obviously this talk that we no longer need some of the SaaS tools. I think SaaS is a very broad category. Even MinIO is SaaS, but Workday is SaaS, and Monday.com is SaaS, and they require really different levels of engineering depth. I would call MinIO or Azure Blob infrastructure SaaS. It's hard for one person or even 10 people to build, and it's hard to maintain even if you do build it.
Guest: 100%, 100%, 100%. I mean, infrastructure is, of course, a little bit unique. And storage is even more unique because nobody wants to mess with the data or lose their data and so on and so forth. Extremely critical part of your stack. So the AI disruption from that standpoint, I don't think we will see it in the short term when it comes to data stores. But definitely when it comes to SaaS. And like you mentioned, SaaS is a much broader thing, right? You have HR tools, you have finance tools, they're all categorized as SaaS, and they're highly critical. I don't think we are going to move away from Oracle or NetSuite anytime soon. But that is also interesting — there will be new age startups that are going to get born to disrupt them.
So it's all about how much these existing players can now move faster because they have this two-year or three-year time window that they need to go and invest back and become AI-first in their thinking, as compared to someone else coming in the market. Because now the value is no longer in the code. The value has shifted back to people, back to your market access and customers, as simple as that.
It is quite interesting because something that used to take like 10 years to build — an enterprise product usually takes like a 10-year time frame to build and get production ready and enterprise ready. Now that cycle has shrunk so much, you can get a product out in a year's time frame or even lesser depending on what you're trying to solve. So the value has moved away from code for sure. And it's all about the customers at the end of the day and the value that you're delivering.
From software and SaaS companies' point of view, I think they really need to think about what they can do to invest back. If you are holding critical data, how can you hold it further down and build more around it rather than just not take any action? Because right now, it's quite extraordinary times that we are living in.
Host: When you're planning for the next couple of quarters or looking at your balance sheet, is the cost of SaaS not still a conversation? Or is it starting to become a conversation where you prioritize build versus buy, and maybe you're getting high ROI versus low ROI?
Guest: See, as a startup, I think we always have to question everything that we are doing internally to keep the processes efficient. So regardless of whether AI would have happened or not happened, as a startup, you really need to question. One of the things for startups is cash is king. So it's customer money in and then whatever you are spending. If you have cash in the bank, nobody can come and touch you. And startups really tend to run into trouble when their burn gets up high and the revenue is not that high. So that's why I always tell, keep an eye on cash always. And as a startup, you always need to be mindful that you are doing things in the most efficient manner as possible. And AI has, of course, given us now wings to do so much more and question every single thing to the foundational level. So of course, we are questioning every single tool that we have. Can we do it in-house? Can we do it outside? And once we go through that process, we make a decision.
Host: There's this interesting phrase I think Jensen said, like, a $500K engineer should spend $250K on tokens.
Guest: Yeah, you must be able to spend money on tokens.
Host: I mean, there are lots of funny memes on that. You know, your drug dealer wants you to buy more drugs, lots of interesting funny memes. To me, it looked like at some point we will start evaluating per employee, how many tokens are we using versus how much we deliver out of it. Like, do you see people using a crazy number of tokens? On a broad scale, are they using a lot or medium for their development work? What is your leadership point of view?
Guest: I keep hearing on Slack, "we're running out of tokens, we're running out of tokens." So that's real. You have to understand, it's always the opportunity cost. That's how you have to evaluate every decision that you make, because there's an opportunity cost to it. Now, instead of hiring three engineers, if you're putting, to take your example, $250K on tokens for one engineer, you're still saving money. You avoid the cost of managing two more engineers and everything associated with it, like onboarding, and AI just does the work that you ask it to do. People have their own minds. So management becomes a lot easier. So I always think in terms of the opportunity cost. Even if it becomes $250K or $500K, it's still cheaper to have AI agents run for you than to have people.
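The opportunity-cost arithmetic in this exchange can be worked through with the figures mentioned in the conversation. This is a rough sketch only: it takes the $500K-per-engineer and $250K-in-tokens numbers from the host's example, accepts the guest's premise that one engineer with AI covers the work of three, and ignores overhead such as onboarding and management. The `annual_cost` helper is hypothetical.

```python
def annual_cost(engineers: int, salary: float, tokens_per_engineer: float = 0.0) -> float:
    """Yearly spend for a team: salary plus token budget per engineer."""
    return engineers * (salary + tokens_per_engineer)


SALARY = 500_000  # per-engineer figure from the host's example
TOKENS = 250_000  # token budget from the same example

three_engineers = annual_cost(3, SALARY)     # hire three people, no AI budget
one_with_ai = annual_cost(1, SALARY, TOKENS)  # one person plus tokens
savings = three_engineers - one_with_ai

# Token spend at which the two options cost the same.
break_even_tokens = three_engineers - annual_cost(1, SALARY)

print(f"Three engineers:   ${three_engineers:,.0f}")    # $1,500,000
print(f"One engineer + AI: ${one_with_ai:,.0f}")        # $750,000
print(f"Savings:           ${savings:,.0f}")            # $750,000
print(f"Break-even tokens: ${break_even_tokens:,.0f}")  # $1,000,000
```

Under these assumptions, even a $500K token budget for the single engineer totals $1,000,000, which is still below the $1,500,000 cost of three engineers and consistent with the guest's claim.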
Host: Not good news for people.
Guest: These are interesting times, and I always say that it is not AI versus human. It's about which humans will be able to leverage AI to its maximum capability. And that's where the difference is really going to come from. The entry-level people, if they're not curious, if they don't take initiative themselves, if they don't have those basic qualities that make people unique, then it will be a hard time for them.
Host: What should an early software engineer or even an early employee in any category spend time learning? You can learn a lot about AI, but what I learned about AI four months ago and what happened in the last two months is very drastically different.
Guest: Yeah, it is. That's why I think you need to go back to the basics. You need to understand how things really work. You cannot fake your way in terms of this high-level thinking anymore because AI is going to throw so much information at you. And if you don't know how to distinguish right from wrong and what is hallucination, you will be so lost. Even if you have to do a code review of what Claude is throwing at you, you really need to understand what you're looking for. And to get to that understanding, you need to really get into the details of it. So for anyone, entry-level people, I think the biggest thing is just get into the details of every single thing. Get back to the fundamentals. Understand everything from the ground up. Because then you will be able to run — the initial time that you invest in learning those foundational things is going to help you run faster, much faster with AI, as compared to the other way around. If you don't have those foundational things clear, whether you're in marketing, whether you're in sales, wherever you are — AI is just a tool to help you get to your end destination faster. But you need to know where you're going. You need to have that clarity at the end of the day.
Host: I think one thing that also changed is that the process of getting those foundations is short-circuited now. The analogy I gave is like you work with your senior engineers to understand what it's like to operate at a senior level, right? But it's sort of like short-circuited this process. Now there's no incentive to spend a lot of time, neither does the senior engineer have time to do this mentoring. The problems that engineers face are completely different now. So it's sort of like short-circuited the process in a way that it's not easy for an early engineer to get the same skills if they spend two years. They might still deliver more work, but it's harder to encounter a problem that actually solidifies your fundamentals in some way.
Guest: I think there'll be newer problems. There are always problems.
Host: Yeah, there'll be new problems, so the fundamental aspect of what "fundamentals" even means is actually changing, right, in some form. So, where is MinIO going next? What do the next couple of years look like for MinIO as we wrap up the conversation?
Guest: Yeah, no, we are just heads down in terms of building more capability. The market, as I said, is growing at an exponential pace and it is so uniquely suited for MinIO to grab the maximum market share out there. So we are just heads down in terms of our product innovation and execution so that we don't miss this chance, because this is a once-in-a-lifetime opportunity, especially for startups. Because on one side you have legacy players which have just been bolting on additional code on their legacy and trying to be relevant. And this is actually the time — if I have to go back and give an analogy — this is the time when mammals come on the planet and dinosaurs go away. So I think it's such a unique, such a wonderful opportunity. It's all about execution and innovation and we are heads down on that one.
Host: That's a great analogy. I think the Shopify CEO said something like this — like in 2026, every business is now up for grabs. You can start brand new and go take a legacy company pretty fast. So pretty much every business is up for grabs in 2026.
Guest: So true, so true. I'm a fan of his. He's one of the most prolific CEOs of our time. And I really admire all the work that he has done.
Host: I think that's a good note to end the conversation on. Thanks very much. Thanks for coming on the show and sharing all about MinIO.
Guest: Thank you for having me and it was such a pleasure to speak with you, Nataraj.