Startup Project

Category: Podcast Episode Transcript

Full transcripts of the Startup Project podcasts.

Joseph Krause on Radical AI & the Future of Materials Discovery

AI is rapidly transforming every industry, and deep tech is no exception. In this episode, we sit down with Joseph Krause, co-founder and CEO of Radical AI, a company poised to revolutionize how we discover and develop new materials. Radical AI is building a self-driving lab where AI autonomously designs, tests, and discovers materials, accelerating R&D at an unprecedented scale. This groundbreaking approach has attracted a historic $55M seed round from major investors like NVIDIA and Raytheon. Joseph, a US Army National Guard veteran with a PhD in materials science, shares his unique journey from military service and academia to deep tech investing and finally, entrepreneurship. We dive into Radical AI’s “materials flywheel” concept, how they are increasing the speed of discovery by 370x, and why New York City is the perfect place to build a world-changing company.

→ Enjoy this conversation with Joseph Krause, on Spotify, Apple, or YouTube.
→ Subscribe to our newsletter and never miss an update.

Nataraj: You have a diverse background from materials science, working in the Army National Guard, and a little bit of venture capital at AlleyCorp, who I’m assuming is also an investor in Radical AI. How did these different experiences converge and lead to starting Radical AI?

Joseph Krause: Great question. I always love the Steve Jobs quote about how you can never connect the dots looking forward, only looking backward. For myself, I very much feel this is true today. Going back to when I was in graduate school, I was pursuing a materials science PhD at Rice University. I was working on a lot of what we call functional materials, or materials that can be used in real applications, and I was frustrated with my lack of ability to push things into industry. That actually isn’t poor performance from academia; it’s just not what academia is truly focused on. Academia’s job is to drive our fundamental understanding of science. For me, I wanted to take that fundamental understanding, work on applied applications and materials, and really put them into products in the world.

At a similar time as I was going through my PhD, I was separately serving in the US Army National Guard. It’s a part-time military service where you can get called up into active duty if needed. I’ve always been a big fan of the military and wanted to serve, so it was a very important thing for me. I had this opportunity come up from the Army Research Lab, which is the US Army’s corporate development research lab, where they were doing materials research but thinking about problems for the Army and how these things can impact the future warfighter. I got really interested in doing science that I care about the impact of. I’m separately serving in the National Guard and I care about the warfighter having the best technology. So I took a journeyman fellowship at Army Research Lab, working on neuromorphic computing.

Even here, it’s still fundamental because it’s at the core, cutting edge of research. I still felt this lack of ability to execute on translation and pushing things out of research into the market. So I took a leave of absence from my PhD, joined AlleyCorp, which is a VC based here in New York City, and I started investing in material science, semiconductors, 3D printing, and informatics to understand how to commercialize materials. While at AlleyCorp, me and one of my other co-founders, Jorge Colindres, came up with this inception for Radical AI. We found our third co-founder, Gerbrand Ceder, and we rolled out and started the company from there. Looking back, it’s a very linear trajectory, but at the time, I had no idea what the next step was.

Nataraj: Looking back, what were the traditional ways of doing things that were preventing this commercialization of new materials or discoveries?

Joseph Krause: I think two things. One is focus. The primary purpose of academic research is not to commercialize applications. It can be an outcome, which is why there are tech transfer offices, but it’s not the core driver. The driver is discovering new fundamental phenomena.

Number two is that the commercialization of science typically ends up in a product. If you think about semiconductor transistor technology, the materials research behind that goes into a chip which is then put in advanced packaging so we can use it in a smartphone or a computer. If you work on transistor technology, you’re still a couple of steps removed from the product that will actually utilize those materials. That disconnect is hard to bridge. Because you don’t have this focus and you’re not building the end product, you can’t really solve commercialized problems. The ones that do, like nuclear fusion, take that scientific invention and start using it in a real product, like a reactor. Bridging that gap, in my opinion, really requires private enterprise. That’s why you’ve seen companies from fusion to space with SpaceX and Blue Origin building private companies on really advanced scientific capabilities.

Nataraj: When you have a good founding team and some funds, what’s the next step? If it’s an Uber for something, I would go and create a new app. What was the product that Radical AI created first?

Joseph Krause: The most important thing to think about is technology, and then from that technology, what is the product you develop? We create new materials, and we want to commercialize those materials by actually selling them at scale. The area that we are working in is very much in aerospace, defense, and energy, where we use these structural metals to build components. Our product, the output of our technology, is novel materials that enable new applications.

The technology piece is how we do that. I love using SpaceX as an example. They focus on bringing things to orbit and eventually getting to Mars. The way they do that is with advanced rocket capabilities that are reusable and more cost-effective. That technology is the means of how they get things into space. If our end goal is to develop incredible materials that enable new applications, the way we do that is our materials flywheel. This is where our AI modeling, engine, and fully robotic self-driving lab come in. They allow us to do materials discovery 370 times faster than a human scientist, bring in different data sets like patents or scientific literature to make new hypotheses, and build a high-throughput capability where we test tens of thousands of materials per year, not just a few. When you bring all those together, you have this materials flywheel that allows you to very quickly discover novel materials and then go into applications that will directly use them for new advancements.

Nataraj: Who are the typical customers looking for new materials? And secondly, you mentioned a 370x speed advantage. Where in the flywheel does that come from?

Joseph Krause: Great questions. First, who buys materials? My favorite question to answer, because it’s every single company on earth. It doesn’t matter if you are in aerospace, automotive, manufacturing, defense, climate, energy, semiconductors, or even athletic apparel. Every single industry in the world is a direct result of novel material advancement. But all of them feel this problem of very long timelines—typically 10 to 20 years—and incredible cost, north of a hundred million for a single material system. So you arguably have one of the largest markets in the world, we just cannot execute on it fast enough.

On the second piece, digital research is challenging in materials, and we cannot only do digital-based research. So much of the know-how in making a material at scale is not just how to make it, but how to make it in large quantities with the properties you need—what we call the processing of a material. Most of the IP of the biggest material companies is trade secrets around processing. So if you never make it in a lab, see its properties for an application, and understand the conditions to scale it—temperature, pressure, oxidation, concentration—then you truly can’t capture the value. We do a lot on the digital side with machine learning and generative technology for inverse design, but at the end of the day, we still need to make that material in a lab. This gap is the hardest thing to understand and where all the value sits in the materials industry today.

Nataraj: How do you pick which direction to pursue? Are you only going in directions where you have pre-commitments from customers, or do you have your own ideas of what new material to find?

Joseph Krause: Great question, because focus is very important for an early-stage startup. It’s actually both. The classes of materials we work in are directly driven by customers. We do not want to randomly work on materials with no identified customer problem. That’s the first thing. The second reason it’s both is that it’s up to us to come up with the new material to solve those problems. A perfect example is our work in the hypersonic space. Hypersonic missiles have a problem where the alloys they use today cannot keep up with the composites inside. The problem is how to drive higher heat and better mechanical performance at high temperatures. That problem is from the end customer, but the solution we’ve come up with is a high entropy alloy. We are designing novel high entropy alloys for this application. We take the problem from an end customer, and then we think about what material system can solve that problem in a way that creates a huge market opportunity.

There’s a third option, which is something like a room-temperature superconductor. The industries that would benefit are endless. There’s not really a market for that today, with customers saying, “I want to solve this in the next six months.” But if you had one, every company would be interested. We call those enabling technologies. However, focus is key. Radical AI needs to show the world that we have the best scientific discovery flywheel and can drive real commercial value. Once we do that, then we can focus on more enabling things like a room-temperature superconductor.

Nataraj: A couple of years back, there was a viral paper about reproducing a superconductor at room temperature. Are you talking about a similar thing?

Joseph Krause: Yes, LK-99 specifically. And that’s a perfect example of where experimental validation of a material is very important. This paper came out claiming room-temperature superconducting ability. People ran simulations to try to confirm it, and some thought they had. But when it came to reproducibility and experimentation, we could not replicate that material. It’s a perfect example of where experimental work is equally as important as simulation work.

Nataraj: What type of people are you hiring? And what type of AI are you using? Are you building your own models?

Joseph Krause: The people we hire span five technology buckets: machine learning and AI, software engineering, automation/robotics, mechanical engineering, and materials science. Building an interdisciplinary team is imperative. On the AI front, we use a multitude of different AI technologies in a multimodal approach. We have an agentic system that sits at the top layer using LLM technology. We don’t build LLMs from scratch; we fine-tune models from the big AI labs. Those agents have access to different tools we’ve built, like graph neural nets and generative-based models using diffusion for inverse design. We also use older AI like computer vision in our lab to track experiments and Bayesian optimization in our design of experiments loop. It’s a mix of technology across the board, from cutting-edge LLMs to well-understood architectures.

Nataraj: DeepMind has AlphaFold for predicting protein structures. Is there a similar opportunity in materials for predicting crystal structures?

Joseph Krause: Absolutely, that’s a very real thing and it’s being deeply explored. We do that internally with our own models. Big Tech also has models: Microsoft has MatterGen, Google has GNoME, and Meta has models from their Open Catalyst Project. The field is growing rapidly. There’s been some pushback that some predictions are not novel or not valid—meaning we can’t actually make them in the lab. These constraints of novelty and validity are really important. A big problem with these models is a lack of experimental data; they are trained almost exclusively on computational data. You need the ground truth from the experimental lab. The problem is that 90% of the work a scientist does doesn’t work, and we don’t capture that negative data. It lives in a scientist’s head or lab notebook. That’s why we build self-driving labs. We capture every single data point and feed that experimental data back into our AI engine to build the most proprietary dataset in the world.

Nataraj: Why pick New York?

Joseph Krause: New York is one of the capitals of the world. We love it for two reasons. One, the talent density is very high and continues to grow. There are not many elite performers who would not want to move to New York City. Two, if you want to work on advanced scientific discovery and deep technology, there are not many places to do that in New York. We might be one of the only places where you can do AI and science. If you want to be in New York and work on really hard problems, Radical is one of the few places you can go. Those two reasons are linked, and that’s why New York is super important to us.

Nataraj: What is your best AI use case for you personally or for managing your company?

Joseph Krause: For me personally, it’s research and knowledge gain. I might talk with a customer who has a very specific materials problem in their parts, for example, in the automotive industry. I’m a material scientist, not an automotive expert. I will use AI to take what would be a three or four-month learning process and get up to speed very quickly on the history of materials used in that application, the research areas pursued, and the current state of the art. I can use AI to get knowledgeable enough in an hour. It’s for longer, extended research settings where I need quick, actionable information that I can learn today and use this afternoon.

Nataraj: And are you using ChatGPT, Deep Research, or something else?

Joseph Krause: To be honest, I use all of them because I’m still in the discovery phase of seeing who’s the best. I’ve used a lot of Grok from xAI; for science, they seem very good and pull references well. I use ChatGPT a lot for idea generation. I will also use Gemini in direct comparison. Sometimes I’ll have both open in a tab, paste the same prompt in both, and see what different information I can pull. I don’t have a single winner yet.

Nataraj: Every time I see Apple’s new demo, they talk about redesigning every material. Where does that material research happen?

Joseph Krause: Apple is an incredible materials science company. A majority of their advancements come from new materials. The battery life, the colors, the aluminum and titanium they use—it’s all a materials science problem, down to the semiconductor technology. They do research internally; they have a very good materials research team. They also work with outside parties. Corning with Gorilla Glass is the one most people know. That was a material discovered 20 years before it had an application, until Steve Jobs came knocking and needed a scratch-resistant glass for a touchscreen. Apple is very good at indexing information quickly and pushing material advancements.

Joseph Krause, co-founder and CEO of Radical AI, joins the Startup Project to discuss how his company is revolutionizing materials R&D. He shares how integrating AI, robotics, and engineering into a “materials flywheel” accelerates discovery by 370x, attracting a historic $55M seed round from investors like NVIDIA and Raytheon. Joseph details the journey from his PhD and military service to building a world-changing deep tech company in New York City.

Joseph’s vision for Radical AI highlights the immense potential of integrating AI and robotics into fundamental scientific research. By creating a closed-loop, autonomous system for materials discovery, his team is not just building a company—they are building a platform to solve some of the world’s most critical challenges across aerospace, defense, and energy.

→ If you enjoyed this conversation with Joseph Krause, listen to the full episode here on Spotify, Apple, or YouTube.
→ Subscribe to ourNewsletter and never miss an update.

January 4, 2026
Statsig Founder Vijaye Raji on Building a Data-Driven Platform

Introduction

After a decade at Microsoft and another at Facebook, where he served as VP and Head of Entertainment, Vijaye Raji took the leap from big tech executive to startup founder. In 2021, he launched Statsig, an all-in-one product development platform designed to empower teams with experimentation, feature management, and product analytics. Built on the principles he learned scaling products for billions of users, Statsig helps companies like OpenAI, Notion, and Whatnot make data-informed decisions and accelerate growth.

In this conversation with Nataraj, Vijaye shares his journey, the tactical lessons learned in hiring and scaling, and the cultural shifts required when transitioning from a corporate giant to a lean startup. He dives deep into how modern product teams are leveraging rapid iteration and experimentation, and offers his perspective on what the future of product development looks like in an AI-first world.

→ Enjoy this conversation with Vijaye Raje, on Spotify, Apple, or YouTube.

→ Subscribe to ournewsletter and never miss an update.

Lightly Edited Transcript

Nataraj: As I was researching this conversation, I found out that when you were considering leaving Facebook, Mark Zuckerberg tried to convince you to stay at the company. What was that moment like?

Vijaye Raji: This was not the first time. I tried to leave the company a couple of times before then, and every single time it was a conversation that convinced me to stay. There’s a lot to be done here, a startup is a small company, the impact you will have is not that big. For all the good reasons, I stayed back at Facebook. When it was 10 years, I knew something new had to be done for my own personal sake. I felt like I needed the startup in my life, so I left.

Nataraj: You’ve been in big tech companies for almost two decades at that point. What was the personal motivation? There’s always a personal calculus. I’m doing all this work for a company, can I own more equity? What was your thinking at that point?

Vijaye Raji: I started at Microsoft and spent about 10 years there as a software engineer. To be completely transparent, I loved Microsoft. I enjoyed it and learned a lot. Everything that I know about software engineering, Microsoft is the best place. This was back in the early 2000s, where you learn how to build software and predict something that’s going to happen two years down the line. It’s like a science, and there’s so much to learn from so many good people, so I had a lot of good time learning all of that stuff. At some point, I had thought about building something different. This is probably something that is very common nowadays, where you’re in a holding pattern for your green card. You can’t really leave or reset your green card, so you don’t really explore other options when you’re in that situation. I had to be like that for a little while. Once I got my green card, the first thing I did was look around, and luckily for me, Facebook was starting up an office in Seattle. That was my first jump from what I thought was a really good learning experience for a whole decade. I went into Facebook at that time, and Facebook was a startup. It was a late-stage startup, not quite ready for IPO, so I thought I was joining a very small company. Leaving behind a company that was 100,000 people to join a company that was only 1,000 or 1,200 people at that time was incredibly different and a good learning experience. I thought I was learning a lot at Microsoft, and then I went to Facebook. There was a completely new world. That’s how I went from one big company to what I thought was a startup, and then eventually, you know the story of Facebook. It grew so fast, and by the time I was there for years, it had grown to 65,000 people or something. That was a lot of good learning because when you’re in a company that is growing that fast, you learn a lot and you get exposed to a lot.

Nataraj: By the time you left, you were leading entertainment at Facebook and also leading the Facebook Seattle office.

Vijaye Raji: Yeah, one of the things that I generally do is every couple of years or so, I try something completely new. Even at Microsoft, I started with Microsoft TV, which was a set-top box, and then moved on to developer divisions doing Visual Studio, building compiler services. After that, I was working on SQL Server, building databases, and then Windows operating systems. Even within Microsoft, I did various little things. Then at Facebook, I started out as an engineer and worked on Messenger and some ads products, and then I worked on Marketplace, and then gaming and entertainment. Each one of them is pretty different. They don’t have much correlation or continuation, and that’s how I’ve always operated in my career. When I left, I was the head of entertainment, which included everything from videos, music, and movies, and also the head of Seattle, which when I joined was about a couple dozen people. When I left, we had about 6,500 people spread across 19 different buildings.

Nataraj: What were some of the interesting problems that you were working on as head of entertainment, and what was the scale of those problems?

Vijaye Raji: As Head of Entertainment, if you think about Facebook’s apps, there’s a social aspect to it—your friends and your community—and then there’s an entertainment aspect, which is you just want to spend time and be entertained. The kinds of stuff you do for entertainment could be watching videos, listening to music, watching music videos, playing games, or watching other people play games. You watch short clips from TV shows and so on. Another huge area is podcasts. Anything that is not pertaining to your social circle belongs to this entertainment category, and that was my purview. The problems we were trying to solve were about how to make the time people spend of high quality. What do they gain out of it, and how do they get high-quality entertainment? That includes everything from acquiring the right kind of content, understanding what people want, and then personalizing the content to them. It also includes removing content that is not great for the platform, anything violating policy. So you invest quite heavily in the integrity of the platform as well. On the engineering side, scale is a very important problem. When you’re delivering video at 4K, high quality, high bit rate to networks that may not be reliable, you have interesting engineering problems that you have to go solve. Those are all super exciting.

Nataraj: Were you primarily focused on the technology of getting the entertainment on the different Facebook platforms or also part of dealing with the business side of it, like licensing and acquiring content?

Vijaye Raji: It was part of that too. When you have a product that is observing what people watch, you know what people want. You then want to go and buy more of that content. We had a media procurement team, and you could go to them and say, this is the kind of content that people consume on Facebook, so let’s go get more of those. That plays into the decision of where the company would go invest.

Nataraj: So you were doing some exciting stuff at Facebook at scale and then you decided it’s time for you to leave and start your own company. Did you evaluate a different set of ideas, or was the idea for Statsig brewing in your mind while you were at Facebook?

Vijaye Raji: It’s a little bit of both. The first part of the journey was deciding to go start a company. The second part was, what do I go build? Deciding to start a company had been brewing for a long time. It was one of those things that I would regret if I didn’t do it. As for what to go build, because of my varied experience doing everything from gaming to ads to marketplaces to videos, I had lots of ideas. When you’re evaluating an idea, you want to take into account what the market size could be, what the propensity of a buyer is to pay a dollar for you, and what you are good at. Sometimes you’re going against a lot of competitors, so what are we really good at? And what could I bring that could be an advantage? Those are all the factors that go into it. If you think about it, your passion is driven by your heart, but this logical analysis is driven by your mind. If you’re entirely driven by passion, you may build something that may not be sellable. Those were the kinds of considerations that went into deciding to go build a developer platform that includes everything from decision-making and empowering everyone to make the right decisions using data.

Nataraj: So, once you decided on this particular experimentation developer platform, how did you go about getting those first couple of customers?

Vijaye Raji: It’s a good journey and a good lesson for everyone building startups. Usually, when you have a founder with an immense amount of faith and conviction that this is what I’m going to build, you are very mission-driven. While you’re building, you’re talking to a lot of people. This is the part where I made all kinds of mistakes. You go to someone you know who is willing to spend 30 minutes with you and say, ‘I’m going to build this developer platform, it’s going to be pretty awesome, it’s going to have all kinds of features.’ What are they going to say? Chances are, they’re going to say, ‘That sounds like a great idea, you should go do it.’ You talk to enough people, and you build this echo chamber where you are now even more convinced that everybody needs this platform. Then you go build it in a vacuum. We did this for about six months. At the end of six months, we went to the people I talked to before and asked if they were willing to start using this product. And you know what? You go talk to them and they say, ‘Let me think about it.’ ‘Let me think about it’ means they’re not really that interested. It’s much harder to have them integrate it into their existing product, and much harder to have them pay a single penny. You learn that lesson. This is one of those things where I was talking to one of my co-founders at that time, and a person said, ‘You’ve got to go read this book, The Mom Test.’ I went and read it and realized all the mistakes I was making when talking to customers. The point is, first, you need to understand what problems people are facing and if you have a solution for that problem. To even get to that stage, you need to know who your ideal customer profile is. Then you talk to them and make sure the product you’re selling actually solves the problem. Not only that, you have to be the industry best for somebody to even care about your product and then open up their wallet. Those are the kind of hard lessons that I learned over the course of the next few months.

Nataraj: What was the value proposition of Statsig at that point, and why was it different from what already existed in the market?

Vijaye Raji: The value proposition has not changed since the day we started. It has always been the same: the more you know about your product, the better decisions you’re going to make. What we’re doing is empowering product builders, whether you’re an engineer, data scientist, or product manager. The idea is to observe how people use your product, what they care about, and where they spend more time. All of those are important for you to know how your product is doing. Number two is what features are not working as intended. And number three is using those two insights to know what to go build next. That’s literally what we sell as the value from day one. The differentiation from existing products is that previous products were all point solutions. For feature flagging, you need a separate product. For analytics, a separate product. For experimentation, a separate product. We’re bringing them all together. The benefits are that it consolidates all data into one place, so you don’t have to fragment your data. Number two, because you’re not fragmenting data, it all becomes the source of truth. And number three, it opens up some really interesting scenarios. If you combine flagging with analytics, you can get impact analysis for every single feature. That’s something you can’t do if you have two different products.

Nataraj: Can you explain flagging for those who might not be aware of feature flags?

Vijaye Raji: Feature flags are ways for you to control where to launch, how to launch, and to whom. You can decouple code from features so that when you want to launch new code, you can send it to your app store and get it reviewed. Once it’s live, you can turn on features completely decoupled from code releases. It’s extremely powerful. Number one, it’s powerful to know when to launch your product. Number two, when something goes wrong in real-time, you can turn it off. There are lots of other reasons, like doing rollouts at scale. You don’t turn on a feature for 100% of users everywhere; you can do 2%, 4%, 8% just to know that all the metrics you care about are still sound.

Nataraj: Are there any specific trends across big and small companies on how they’re approaching experimentation or product validation?

Vijaye Raji: The trend that we are betting on is that more product development is going to be data-driven. That’s the reason why we’re here, and what we’re doing in the industry is accelerating that trend. The education we do for prospects and the industry is basically catalyzing and accelerating this. Product development used to be this siloed thing where a product manager comes up with an idea, engineers code it, testers test it, you ship it, and then you wait for two years for another release cycle. Now it’s a very iterative process, and people ship weekly, daily, and sometimes even hourly. To get to that level of speed, you need controls in place. To allow people to make distributed decisions, you need the ability to know how each line of code you wrote is performing. These tools are getting more adoption day by day, and people moving from the traditional way of development to this modern way all need it. Our bet is that AI is going to accelerate that movement because you’re going to have lots of rich software built from blocks that you’ve just assembled. You need to know if your hypothesis or your original idea actually turned out to be the product. You need these observability tools built into your product to be able to know. Generally, this trend is moving towards data-driven development.

Nataraj: What is the right time for a company to adopt Statsig? What is the ideal customer profile for you?

Vijaye Raji: You should start on day one. Every feature that you’re building… I remember the early days of us building software. The first thing we launched was our website, and I’m refreshing that page all the time. Whenever somebody visited the website, I was looking at them, seeing the session replay. I was literally spending all my time figuring out how people were using the product. That is an important step in the journey of your company. So start on day one. Integrate with feature flags, product analytics, session replay, and all of that stuff that will give you insights into how users are using your product. Eventually, you’ll get to the place where you want to run experiments. You don’t have to do experiments on day one. When you get there, you have the same tool with all the same historic data now capable of running experiments. There’s no early time; you just start right away.

Nataraj: I’ve used different experimentation tools, and one of the negative side effects I see is this idea that we’ll just A/B test everything. It can lead to a little bit of incrementalism and experimentation fatigue. Do you have a take on when to use experimentation versus when to use your gut and your product sense?

Vijaye Raji: You remember the famously misattributed quote from Henry Ford that said if I had asked people what they needed, they would have said faster horses, not a car. Experimentation is not a replacement for product intuition. You’re not going to get rid of product intuition. To make those leaps from a faster horse to a car, you cannot experiment your way there. You need people with lots of good intuition and drive to get to that kind of leap. But then once you had your first version of the car, from the Model T to where we are now, there have been thousands, if not millions, of improvements made. Those are all things you can experiment with to find better versions of what you currently have. My philosophy is that product intuition and experimentation go hand in hand. Some of these things, you have product intuition, you have conviction, you go do it. But when you’re about to launch it, just make sure that there are no side effects—things that you have not thought about. Products have gotten so complex nowadays, I don’t think anybody out there can meaningfully understand ChatGPT in its entirety. When you’re in that kind of situation, it’s only going to get harder for any one person to fully grok your product. In those cases, observability, instrumentation, and analytics are extremely valuable. Then you have experiments. They don’t have to be just testing two different variants. It could literally be, ‘I believe in this feature. I’m going to launch this feature.’ That is a hypothesis. Validate it. Put it out there and measure how much it’s actually improved in all of those dimensions that you thought about.

Nataraj: Let’s talk a little bit about growth. You started in 2021. Can you give a sense of the scale of the company right now?

Vijaye Raji: We have about a couple thousand users, most of them come self-serve. They pick up our product, we have all kinds of open-source SDKs. We have a few hundred enterprise customers that are using our product at scale. And we have massive scale in terms of data. If you think about the few hundred enterprise customers, they all have big data. We process about four petabytes of data every day, which is mind-boggling. Last September, we announced we were processing about a trillion events every single day. Now we’re processing about 1.6 trillion events every single day. To put that in perspective, that’s about 18 million events every second. That’s what our system is processing. That’s been our growth in terms of customers, scale, and infrastructure.

Nataraj: How are you positioning Statsig? Is it primarily a developer tool, and you’re using that to drive enterprise growth?

Vijaye Raji: We position ourselves as a product development platform. It caters to engineers, product managers, marketers, and data scientists. There are parts of the product for each individual. If someone comes to us to solve an experimentation problem, it’s usually a data scientist team. But once we’re in that company, our product can grow organically. We don’t charge for seats. The engineering team will adopt Statsig for feature flagging, and the product team will adopt it for analytics and visualization. This organic growth happens within a company, and this is how we have grown even within our existing customer base.

Nataraj: Where do you spend most of your marketing efforts for the highest ROI?

Vijaye Raji: There are different outcomes for our marketing efforts. Some are direct response, where we feed people into our website for self-serve sign-ups or talking to our sales team. We track that, and it’s a very seasonal thing. Then there are aspects of marketing that are more brand-related. We want to be out there and build brand awareness. We invest in things like that. One of the fun things we did last year was a billboard in San Francisco. That was really cool because we got a lot of brand awareness from that. We also partner with people like you and some podcasts that we work with who have great reach and brand awareness.

Nataraj: In terms of culture, you mentioned you’re a completely in-person company. Why does that matter?

Vijaye Raji: Our product teams are all in-person. Our sales team is spread out in the US, with some in the Bay Area, some in New York, and a couple of people in London. But everyone else—engineering, product, data science, design, marketing—we’re all in-person in Bellevue. It started out with eight of us on day one. There were so many decisions we had to make, all this whiteboarding. It would be very hard to do all of that over Zoom. We naturally gravitated towards doing this in person, and I saw how fast we were able to move by not having to document all the decisions. The clock speed was extremely high, and I was reluctant to ever lose that. It’s a trade-off. We’ve had so many really good people we had to say no to because they were not willing to do this in person. That’s a painful trade-off. But four years in, we’re still in person. We’re about 100 people showing up five days a week, and it’s a self-selection mechanism. There are a lot of positives. We’ve built a really good community, and I like it and want to keep going as long as I can.

Nataraj: You were in Seattle in 2021 when it was hard to hire talent. How were you figuring out how to hire great people?

Vijaye Raji: That’s a very good question. You want to hire great people and retain great people. After managing large orgs, the realization I came to is that it’s not the intensity that matters, but what proportion of the time you spend doing things you don’t enjoy. If you’re doing intense work but you love what you’re doing, you have autonomy, and no overhead or friction, people love that. They come in excited, leave exhausted, but they are gratified. As long as you can provide that environment, the intensity can be high, and you don’t have to worry about burnout. To me, it’s always been about how I can remove friction, overhead, and process. These are creative people; I want them to be in their element. Can I provide the best working environment for them? In a startup, if you’re doing a 10-to-5 job, it’s not going to work. People that come into Statsig are already self-selecting into our culture. We’re not doing anything crazy like six days a week, but we are a hardworking group of people, and I like to keep the talent density extremely high because it affects the people that are already here.

Nataraj: Let’s talk a little bit about AI. How are AI companies using Statsig?

Vijaye Raji: Absolutely. If you’re a consumer of LLMs, we have an SDK that you can use to consume these APIs, where we will automatically track your latency, your token length, and how your product is doing. We tie it all back to the combination of the prompt, the model, and the input parameters you used, quantifying the lift or regression. We also have prompt experiments in the product. There will be a lot of people building different kinds of prompts and wanting to validate how each one is performing. We have a very specialized experimentation product just for prompt experiments. The rest of it is just a very powerful platform you can use for anything. If you’re OpenAI running experiments on ChatGPT, that’s going through Statsig. Or if you’re Notion, a consumer of AI and LLMs, you pass it through Statsig. Statsig powers you to determine which models work, which combination of all the parameters yields better results. Then there’s how Statsig uses AI to make our customers’ experience better. There’s a lot we’re doing there that I’ll be announcing in the next few months.

Nataraj: In terms of Statsig, do you have a favorite failure or a deal that fell through that changed things?

Vijaye Raji: A lot. But one thing I want to call out is the humbling experience when you realize you will never be the first one to come up with the best ideas. Part of it is learning to acknowledge that’s a good idea, give them credit, and then quickly follow on. If it is a great idea and you believe it, without any ego, just go and build it. If you can build it better than anybody else, then you continue to live on for a couple more days. We famously didn’t take data warehouses seriously in the first year or so. We built everything on the cloud without really taking into consideration warehouses like Snowflake or Databricks. Then we started seeing customers come in with things like, ‘Hey, I have Snowflake. Could you operate on that?’ And we would be like, ‘Well, you can ingest data from there, but I can’t operate on top of it.’ You start to believe in your current products. Then you realize they start to leave, saying Statsig is not the right product for us. After a couple of those humbling moments, you realize your position is not right. So we spent the next three to four months building a warehouse-native product, and then we came back to the industry and started selling our product. That was a very good failure, realization, and then action from the realization.

Nataraj: What are you consuming right now? It can be anything—books, movies, or TV series.

Vijaye Raji: I’m a big Audible guy, so I listen to books on my way to work and back. Right now, it’s Brian Greene’s ‘The Elegant Universe.’ I think this is the second time I’m reading it. I just wanted to listen to it all over again, and every time I feel like I catch on to something new from that book.

Nataraj: What do you know about founding a startup that you wish you knew when you started?

Vijaye Raji: Thousands of things. I wish I had spent more time with my sales and marketing friends back at Facebook. They’re all good friends, and I’m still in touch with all of them. We used to sit in meetings every week, but I never once thought to drill down deeper. How is your team structured, how are they incentivized, what kind of commissions do they get, how do you think about the different types of marketing? I wish I had learned all of that stuff so I could have saved a lot of failures in the early days.

Nataraj: For you, I have a special question: what is a big company perk that you miss?

Vijaye Raji: The recruiting team.

Nataraj: What is it that you don’t miss?

Vijaye Raji: A lot of things. In a big company, you’re sitting in review after review after review. Those are not just product reviews; you’re looking at privacy reviews and security reviews, things that are important but end up being overhead. At startups, you can move extremely fast by bypassing a lot of that, or even if you have to take care of them, they are much smaller deals.

Conclusion

Vijaye Raji’s journey from scaling products at tech giants to building Statsig from the ground up offers a masterclass in modern product development. His insights underscore the power of combining deep product intuition with rigorous, data-driven validation to build products that customers love. For any founder or product leader, this conversation is a valuable guide to navigating the complexities of hiring, scaling, and maintaining velocity.

→ If you enjoyed this conversation with Vijaye Raje, listen to the full episode here on Spotify, Apple, or YouTube.

→ Subscribe to ourNewsletter and never miss an update.

November 26, 2025
Molham Aref on Building RelationalAI: An AI Coprocessor for Snowflake

In this episode of The Startup Project, host Nataraj sits down with Molham Aref, CEO of RelationalAI. With over three decades of experience in enterprise AI and machine learning, Molham shares his journey from pioneering neural networks in the ’90s to founding his latest venture. RelationalAI is tackling a fundamental challenge for modern enterprises: the sheer complexity of building intelligent applications. By creating an AI coprocessor for data clouds like Snowflake, RelationalAI unifies disparate analytics stacks—from predictive and prescriptive analytics to rule-based reasoning and graph analytics—into a single, cohesive platform. Molham discusses the evolution of the modern data stack, the practical applications of GenAI in the enterprise, and offers hard-won advice on founder-led sales for B2B startups. This conversation is a masterclass in building a deep-tech company that solves real-world business problems.

→ Enjoy this conversation with Molham Aref, on Spotify, Apple, or YouTube.
→ Subscribe to ournewsletter and never miss an update.

Nataraj: My guest today is Molham, the CEO of Relational AI. He was the CEO of Logicbox and PredictX before this. Relational AI recently raised a Series B of $75 million from Tiger Global, Madrona, and Menlo Ventures. They’re widely known for solving use cases like rule-based reasoning and predictive analysis for large enterprise customers. So this episode will try to focus on what the modern data stack looks like, what applications can be built, and a lot more interesting things around those topics. With that, Molham, welcome to the show.

Molham Aref: Thanks, Nataraj. Pleasure to be here. Looking forward to the discussion.

Nataraj: I think a good place to start would be to talk a little bit about your career and all the things that you’ve done before this. How did you end up with RelationalAI?

Molham Aref: Great. I have been doing machine learning and AI-type stuff in the enterprise under various labels since the early 90s, so it’s over 30 years now. I started out working on computer vision problems at AT&T as a young engineer coming out of Georgia Tech and worked there for a couple of years. Then I joined a company that was commercializing neural networks, a company called HNC Software, and worked on some of the early neural network systems, particularly in the area of credit card fraud detection. I focused on retail, supply chain, and demand prediction. I was very fortunate to have a wonderful experience there learning about the technology and all the things you have to do when you put together what today we would call an intelligent application. We had a very nice IPO in 1995. We bought a small company called Retek that was working specifically in the retail industry and learned a lot from that experience. We grew Retek quite substantially and spun it out in another IPO in 1999. You start thinking, this is easy. Anyone can do this.

So in the early 2000s, I started getting involved in startups myself. One was also trying to apply computer vision technology to a brick-and-mortar environment, a company called Brickstream. Unfortunately, Brickstream was a good idea too early. I learned a little bit about how too early is indistinguishable from wrong in the startup game.

Nataraj: Was it similar to what Amazon Go stores was like? What were you trying to do?

Molham Aref: Not nearly that sophisticated, but yes, the idea is that you can put stereo cameras in the ceiling of a retail environment—a retail bank, a retail store—and start collecting information about what consumers are doing, where they’re dwelling, what products they look at, within certain error tolerances, and then connecting the whole customer journey because you would get handed off from camera to camera as you walk through the store. Anonymously, we didn’t know who people were; we were just looking at the top of their heads. You build a picture of what the brick-and-mortar experience is like. At the time, this was before deep learning, and everything was handcrafted computer vision algorithms. It worked, but it was expensive. A lot of the challenge was what do you do with that data? So we weren’t just solving a problem around computer vision; we had to justify the data. It was a good experience, but not a good commercial outcome.

Then I helped start a company in the wireless network space where we built network simulation optimization systems for AT&T, Cingular, and American Mobile, a bunch of wireless operators. We helped them migrate from the 2G systems they were on to 2.5G and 3G systems and helped them manage infrastructure and spectrum and a whole bunch of other stuff that you couldn’t deploy without the benefit of very sophisticated intelligence.

Nataraj: How did you go from machine learning and vision to wireless networks? How did that idea come up?

Molham Aref: At the core, a lot of what we do in machine learning and AI is about modeling. We have deployed a handful of modeling techniques: simulation, machine learning, GenAI more recently, rule-based reasoning, mathematical optimization, things like integer programming, and graph analytics. The AI toolbox has half a dozen tools that you deploy in a variety of contexts, and it’s perfectly normal for the domain to vary. Whether you’re modeling a wireless network or a retail supply chain, there are entities that are involved. In a retail supply chain, you might have raw material in the form of fabric. In the wireless industry, you have raw material in the form of spectrum. In the retail industry, you might change that fabric through manufacturing to make it into a t-shirt. In the wireless industry, you take that raw spectrum and, using Ericsson and Nokia equipment, you convert it into a wireless minute or a wireless kilobyte. Then you manage supply and demand accordingly. So in both contexts, you have to predict how many customers are going to want this wireless minute or this t-shirt. You try not to make too much of your product because if you make too much, it’s perishable. It loses value.

I’m not a wireless expert, but you learn enough about the core concepts in that domain. That company did reasonably well and was acquired by Ericsson. After that, I went back into retail and helped build one of the first companies that built retail solutions on the cloud. We went all in on the cloud in 2008-2009, when that wasn’t an obvious choice, and leveraged the cloud to do a new class of machine learning. My whole career was spent working at companies focused on building one or two intelligent applications, and in every situation, it was a mess. You had to combine different technology stacks: the operational stack, with the BI stack, with the planning stack, the prescriptive analytics stack, with the predictive analytics stack. You end up spending a lot of time and energy just gluing it all together. That’s fundamentally the reason building intelligent applications is hard. I thought it would be cool to build a generic technology on a popular platform to make it so that you don’t need so many components and so much duct tape.

Nataraj: It would help for the listeners if you can talk a little bit more about what predictive analytics, prescriptive analytics, or rule-based reasoning mean.

Molham Aref: Broadly speaking, you have descriptive analytics. It answers the question of what happened in my business. Business intelligence is a form of that. You’re looking backward and saying, what were the sales of flat-screen televisions in Boston last year? You can aggregate the data by region, by different time granularity, by different types of products. If you just have descriptive analytics, it’s up to the human to look at that and then project forward as to what you’ll sell in January in Boston or Philadelphia. There’s a ton of data to look at, and you can improve that with a variety of modeling techniques, everything from time series forecasting to today’s graph neural networks to help you understand what drives demand. If you can now predict what the demand is going to be in January or February next year under various promotional and pricing scenarios, you can now leverage prescriptive analytic technology. Descriptive, predictive. Prescriptive is, I know the relationship between, say, price and demand. What should I set the price to to maximize revenue or profit or market share? The technology you use for each of these tasks is different. GenAI, of course, is a very powerful new technique that we have in our toolbox. But at its core, it’s predictive because it’s trying to predict the next word in a sentence.

Nataraj: You figured out it’s very hard to deploy these solutions and you started RelationalAI in 2017, before GenAI. What were the primary use cases and types of customers you were targeting at that point, and how did it evolve?

Molham Aref: My area of expertise and our team’s area of expertise is in the enterprise, deploying all these different techniques, including rule-based reasoning, which is symbolic reasoning. The idea was to take all of these techniques because for any hard problem in the enterprise, you can deploy all of them to solve it. We help build applications today that have elements of GenAI, graph neural networks, integer programming, graph analytics, and rule-based reasoning. I strongly believe that the combination is what wins. My view is AI and machine learning, in particular, drive us towards data-centricity because the datasets involved are big. The old architectures that use the database in a dumb way and then pull data out to put in a Java program stop working when you have to pull a terabyte of data out. We wanted to move all the semantics, all the business logic, all of the model of your business as close to the data as possible. We built a system that we take to market as an extension to data clouds like Snowflake. We call it a co-processor. We plug in inside Snowflake and build a relational knowledge graph that facilitates queries that do graph analytics, rule-based reasoning, predictive analytics, and prescriptive analytics. It’s all in one place: your data, your SQL, your Python, and all of these capabilities. We see a 10 to 20x reduction in complexity and code size.

Nataraj: What forced you to build on top of, or as you call it, a co-processor on Snowflake? There are other platforms like Databricks. What pushed the edge in Snowflake’s direction?

Molham Aref: This is a very important decision. In the 90s, I was at a company that picked Oracle and Unix as a platform. We were competing against companies that picked Informix or some other relational database. If you don’t get the platform decision right, you can jeopardize your go-to-market motion. From my perspective, we talked to a lot of enterprises, and what we see in practice in the Fortune 500 and the Global 2000 is for SQL and data management, Snowflake is by far the leader. We see them first. We see BigQuery a distant second. Databricks, until recently, didn’t have a SQL offering. They’re everywhere with Spark, but we still don’t see them that much for SQL. For us, it was a really obvious choice to build on Snowflake because they’ve got the traction. Now, there’s a lot of competition, and we’ll see how it all evolves, but my bet is still on Snowflake.

Nataraj: Can you tell us a little bit about your journey of finding your first five customers and what it took to get them?

Molham Aref: It gets progressively easier as you work in the B2B space more. When I was earlier in my career, I didn’t source the customer at all. At HNC, we were selling neural networks. You go to a bank and say, ‘Buy my neural networks.’ The bank goes, ‘What’s a neural network and why would I buy it?’ At some point, they realized that wasn’t effective. It was better to go to a bank and tell them we’re solving a problem they have in their language. If you say, ‘You’re losing $500 million a year in credit card fraud, and if you use our system, you’ll only be losing $200 million,’ any banker’s going to understand that. Then it just becomes a matter of proving you can create those savings. I learned the importance of learning the language of the industry I’m selling into. The folks that are most effective in sales are not the slick talkers; it’s the people who can bring content to a conversation so the prospect doesn’t feel they’re wasting their time. Fortunately, after being at it for a while, you get a multiplicative effect. We had some early customers at Relational AI that used to be customers of mine 15 or 20 years ago at a different startup. Because of the good work we did for them then, there was a level of trust.

Nataraj: Which field did you pick initially and what was the value that you were offering them?

Molham Aref: Starting from graduating college, I liked computer vision and stumbled into an internship at AT&T. Then when I went to HNC, it was because I was interested in neural networks, not particularly in industries. The group I was attached to was selling into retail, manufacturing, and CPG. So you start learning about forecasting problems, supply chain, and merchandising. You learn the language of the industry that way. You have to do the hard work of seeing it from the eyes of your customer.

Nataraj: Right now, what type of customers are you mainly catering to? Is there a sweet spot?

Molham Aref: RelationalAI is more of a horizontal infrastructure play. We are a co-processor to Snowflake. Instead of going to the line of business executive, we’re targeting the Snowflake customer. There’s usually a CTO, CIO, or someone senior who understands infrastructure and data management. To that person, we seem like magic. They spent the last two or three years moving all their data into Snowflake. The last thing they want to do is take that data and pull it back out from Snowflake to put it in a point solution for graphs, rules, predictive, or prescriptive. We come along and say, keep it all there. We plug into that environment. We run inside the security perimeter, same governance. You don’t have to worry about data synchronization because our knowledge graph is just a set of views on the underlying Snowflake tables. We’re relational, which is the same paradigm as Snowflake. So you get something that feels like magic.

Nataraj: When you’re building on top of Snowflake, how do you think about competing or getting cannibalized by Snowflake itself?

Molham Aref: It’s not a new phenomenon. It used to be hardware was the platform. Then operating systems came along. Then Oracle came along. There’s always this tension between the platform and the thing running on it. If the thing running on the platform starts to generate a lot of value, the platform provider can try to make it a feature. You see this all the time. You have to be really good and have a substantial enough solution where it’s either very difficult or very expensive to copy. Look at vector databases. Very hot for about six months, but now it’s a feature in everything. With us, our technology has deep moats. We have a new class of join algorithms, new classes of query optimization, new relational ways of expressing certain things. It creates deep enough moats where I think everyone will have an easier time working with us than trying to compete with us, at least the platform providers.

Nataraj: Can you talk a little bit more about this concept of the modern data stack and where RelationalAI fits into it?

Molham Aref: The modern data stack is a term that came into existence about 10 years ago. It was about the unbundling of data management. There are two ways to make money in our industry: bundling and unbundling. The modern data stack was basically a term used to describe an unbundling of data management so that you could pick different things. You can pick your cloud vendor, your data management platform, your semantic technology, your BI technology. They weren’t coupled together. From that, you had certain things emerge, like Snowflake, DBT, and Looker. It continues now with Open Table Formats and Open Catalogs. I think the next big fight is going to be around semantics and business logic.

Nataraj: What do you mean by business logic and semantics?

Molham Aref: It’s like DBT makes it possible to express semantics in SQL in a way that you can then pick whatever SQL technology you want to run it on. With business logic, you’re kind of tied into certain stacks. A lot of the business logic people write is in procedural programming languages that are not open. If you can capture the semantics of your business logic in a generic, declarative way, then you can map that to whatever is popular that day. A lot of energy is spent managing accidental concerns, not fundamental concerns. If you had your semantics defined in a way where they were not platform dependent, then whatever replaces cloud computing, you would just target that. You separate the business logic from the underlying tech.

Nataraj: As someone who saw machine learning and AI evolve, how do you see the current GenAI hype cycle? What are you excited about?

Molham Aref: GenAI is super exciting. For the first time, we have models that can be trained in general and then have general applicability. Up until GenAI, you built models for specific problems. Now you have models that just learn about the world. I think we are a little bit past the peak of the hype cycle. In the enterprise, what people are finding out is having a model trained about the world doesn’t mean that it knows about your business. What I see happening now is a bit more sobriety and the realization that to have GenAI impact, I need to be able to describe my business to the GenAI model. It doesn’t come with an understanding of my business a priori. We’re starting to see a lot of appreciation for combining that kind of technology with more traditional AI technology like ontologies and symbolic definitions of an enterprise.

Nataraj: Are you leveraging GenAI for your own company? If so, in what ways?

Molham Aref: We don’t build models; we’re not an Anthropic or an OpenAI. Our core competency is how you combine a language model with a knowledge graph to answer questions that can’t be answered otherwise. We’ve been doing work with some of our customers to show how much more powerful GenAI can be if it’s combined with a knowledge graph. Internally, all our developers have the option of using coding copilots. We are exploring some new companies that will make all our internal information searchable via a natural language interface. But we’re still a relatively small company.

Nataraj: You emphasized how sales is perceived in B2B. Can you talk more about your framework for approaching B2B sales?

Molham Aref: I think it’s a mistake for the founders of the company not to take direct responsibility for sales. You have to go out there and do the really hard work of customer engagement and embarrassing yourself to see what really works, what really resonates, and where the pain is. Trying to hire a salesperson to do that for you early on is a huge mistake. Once you’ve figured out what works, now you have the challenge of simplifying it and establishing proof points so it becomes easier for someone who is not as close to the problem or technology to come in and sell it. But even then, you want that person to be able to have a content-rich conversation with a buyer. People are worried about their jobs, their careers, their companies. They want to spend time with people who can really help them.

Nataraj: Where do you see RelationalAI going next?

Molham Aref: We just launched our public preview last June. It’s been amazing, all the inbound customer interest from the Snowflake ecosystem. We have a GA announcement coming out next week. There’s just so much alignment between us, our customers, and our partner Snowflake, that I think we will spend a lot of energy in the next two or three years building a very sizable business there. Beyond that, we will see. I do think intelligent applications represent a great opportunity because they’re still hard to build. If the world starts to appreciate the value of this data-centric, knowledge-graph-based way of building applications, I think we will enjoy serving the market as it figures this out.

Nataraj: What do you consume that forces you to think better?

Molham Aref: I really enjoy reading about history and listening to various historians characterize history at many different scales. There’s a lot to learn from history. It does repeat itself. There’s so much to learn in terms of human beings, our behavior, how we organize, how we get excited about pursuing certain ideas, and how ideas emerge and die. I think there are analogs of that in the enterprise because our job is to motivate a group of people around a mission to go do something great.

Nataraj: Who are the mentors who helped you in your career?

Molham Aref: Many people have been kind and generous. I’ll call out two people. One is Cam Lanier. He was just an amazing guy who passed away earlier this year. He was a great role model of someone who became very successful in business because if he shook hands with you on something, you could take that to the bank. He understood how integrity and trust really drive profit. I’m forever indebted to him for his mentorship. Another person is Bob Muglia. I met Bob after moving to Silicon Valley. He and I connected very much on what we do at RelationalAI. He’s studied the history of the relational movement and how it became dominant. Bob is just an amazing product person and an amazing human being.

Nataraj: What do you know about being a founder or CEO that you wish you knew when you were starting?

Molham Aref: It’s hard. It’s very difficult. This will probably be the last time I do this. I’ve been very fortunate to be part of some very successful ventures. I couldn’t not do this because I’m on this mission to make this kind of software development possible, and I work with some amazing people. But this stuff ages you. It’s really difficult, and you have to be ready for a lot of difficult times. Also, working well with people. A lot of the challenges you have are with people dynamics, creating an environment where you can have a diversity of expertise and skills and have people work together and appreciate each other. That’s super challenging.

Nataraj: Well, that’s a good note to end the podcast on. Thanks, Molham, for coming on the show and sharing your amazing insights.

Molham Aref: Thanks. Thanks for having me.

This deep dive with Molham Aref highlights the shift towards data-centric architectures and the immense opportunity in simplifying complex enterprise workflows. His insights provide a clear roadmap for leveraging modern data platforms to build truly intelligent applications.

→ If you enjoyed this conversation with Molham Aref, listen to the full episode here on Spotify, Apple, or YouTube.
→ Subscribe to ourNewsletter and never miss an update.

November 26, 2025
Read.ai’s Growth to $50M: Founder David Shim on Building an AI Co-Pilot

David Shim is no stranger to the startup world. A repeat founder, former Foursquare CEO, and active investor, he has a deep understanding of what it takes to build and scale a successful tech company. His latest venture, Read.ai, is on a mission to become the ultimate AI co-pilot for every professional, everywhere. Starting as an AI meeting summarizer, Read.ai has rapidly evolved, leveraging unique sentiment and engagement analysis to deliver smarter, more insightful meeting outcomes. The company’s product-led growth has been explosive, attracting over 25,000 new users daily without a dollar spent on media and recently securing a $50 million Series B funding round.

In this conversation, David shares the origin story of Read.ai, detailing how a moment of distraction in a Zoom call sparked the idea. He explains their unique technological approach, which combines video, audio, and metadata to create a richer understanding of meetings than traditional transcription. David also dives into his philosophy on building for a horizontal market, the future of AI agents, and his journey as a founder and investor.

→ Enjoy this conversation with David Shim, on Spotify or Apple.

→ Subscribe to ournewsletter and never miss an update.

Nataraj: You’ve worked and created companies before and after COVID, and you mentioned a lot of your team is remote. Do you have a take on whether remote or hybrid work is better? What is your current sense of what is working best for you when running a company?

David Shim: I’d say hybrid is the future; it’s what works. Where I would say hybrid doesn’t work as well is if you’re really early in your career. It is harder to build those relationships and get that level of mentorship on a fully remote basis. It’s not to say that it’s impossible, but it is a lot more difficult. When you’re in an office, you have that serendipity of meeting people and building connections. When you’re fully remote, especially early in your career, you don’t know who to ask beyond your manager and your immediate team.

That said, once you’re more senior, I think it becomes easier to be fully remote. You know what to do, who to talk with, and you’re not afraid to break down walls. I think Reed’s is the same way. We’ve got a third of our team fully remote, and then people come into the office Tuesday, Wednesdays, and Thursdays. We let people come in on Mondays and Fridays, but they don’t have to. What’s really happening is people actually like that level of interaction, so they’re coming in without being required.

Nataraj: Let’s get right into Read.ai. Great name, great domain name. Talk to me a little bit about how the company started. What was the original idea?

David Shim: The original idea started when I was in a meeting. After I’d left Foursquare as CEO, I had a lot of time on my hands. It was still peak COVID, so no one was meeting in person. I was doing a lot of calls, either giving advice or considering investments. What I started to realize was within two or three minutes of a call, you know if you should be there or not. I’d think, ‘I should not have been invited to this meeting. Why am I here?’ But you can’t just leave. So, like most people, I’d surf the web or answer emails.

One time, I noticed a color on my screen that matched someone else’s screen. I looked closer and saw a reflection in their glasses. It was the same image I could see on ESPN.com. That triggered an idea: can you use not just audio, but video to understand sentiment and engagement? Can I determine if someone is engaged on a call? It wasn’t about being ‘big brother,’ but about identifying wasted time. In a large company, you can invite 12 people to a meeting, and they’ll all accept, potentially wasting 12 hours if they didn’t need to be there. So, the idea started to form around using this data to optimize productivity.

Nataraj: So were you analyzing video and text at that point, or just text?

David Shim: Video and text. Transcription companies already existed, as well as platforms like Zoom and Microsoft that had transcription built-in. I didn’t want to build something that everybody else already had and just make it a little bit better. I wanted something that was a step-function change. So we said, let’s take the existing transcripts but apply sentiment and engagement to them. Think about it: David said this, but Nataraj responded this way. That narration piece is missing. Our AI can go in and say, ‘This is how the person reacted to the words.’ Now, an LLM not only has the quotes that were said but how individual people reacted. It could say, ‘The CEO was really skeptical based on his facial expressions and became disinterested 15 minutes into the call.’ You can’t pick that up from quotes, but you can from visual cues.

Nataraj: What’s the main value that the customers got from Read.ai at that point?

David Shim: In 2022, we launched the product with real-time analytics, showing scores for sentiment and engagement. People found it interesting and valuable. But what was missing was stickiness. People would say, ‘You’re telling me this call is going really bad, but what do I do?’ You’re not giving me advice. That’s when the larger language models came out at scale in late 2022. We tested them and wondered if our unique narration layer, when applied to the text of the conversation, would create a materially different summary. The answer was yes. Comparing a summary with our narration layer to one without, it was totally different. You want to put what everyone is paying attention to at the top of the summary, and getting that reaction layer really changed the quality of a meeting summary.

We started 2023 as the number 20 meeting note-taker in the world. Now we’re number two, and we’re within shooting distance of number one. To go from number 20 to number two in less than 18 months highlights that there’s a difference in our approach, methodology, and the quality of the product.

Nataraj: And how did you acquire your customers in these 18 months? Was it inbound, outbound? Did you target a certain segment?

David Shim: Many VCs say to pick a specific niche and go vertical. My take was that this is such a big market. If this is a seminal moment where everyone’s going to require an AI assistant, it means everyone from an engineer at Google to a teacher to an auto mechanic will need it. So we went horizontal versus vertical. That has helped a lot from a product-led growth motion. We’re adding over 25,000 to 30,000 net new users every single day without spending a dollar on media. It’s pure word of mouth. If you see the product, people will use it, talk about it, and share it.

Nataraj: Is that because if you’re on a meeting with someone, the other people see it being used and then they buy it? The product inherently has that viral aspect, right?

David Shim: 100%. Meetings are natively multiplayer. The problem now is, ‘How do I get access to those reports?’ We’ve made it really simple for the owner to share it just by typing in an email address, pushing it to Jira, Confluence, or Notion. We’re not trying to be a platform where everyone has to consume the data. This is where ‘co-pilot everywhere’ comes into play. We want to push it wherever you work. You can see the data on a Confluence page or a Salesforce opportunity that has better notes than the seller ever created. At the bottom, it says, ‘Generated by Read,’ and you wonder, ‘What is this thing?’ That bottoms-up motion has driven a lot of our growth.

Nataraj: I can almost see this becoming an ever-present co-pilot in a work setting that will change productivity for knowledge workers. Is that the vision where you’re going?

David Shim: That’s exactly what we’re thinking from a ‘co-pilot everywhere’ perspective. When you think about the current state, it’s about pushing data to different platforms. But you also need to pull data down. For example, Read doesn’t treat your first meeting on a topic and your tenth meeting as silos; it aggregates them to give you a status report on how a topic is progressing. Three months ago, we introduced readouts that include emails and Slack messages. Now it’s truly a co-pilot everywhere, not just for your meeting. The adoption becomes incredible because you don’t have to log into Gmail, Salesforce, and Zoom separately. It’s just right there.

Nataraj: You’re still thinking breadth-first, or are you now targeting what a Fortune 500 company wants versus an SMB?

David Shim: We’re still horizontal, but we’re picking specific verticals based on customer demand, like sales, engineering, product management, and recruiting. That’s why we did integrations with Notion, Jira, Confluence, and Slack. Another way to look at ‘co-pilot everywhere’ is agents. Everyone’s talking about AI agents, but in reality, you want your Jira to talk with your Notion, to talk with your Microsoft, to talk with your Google. That’s the push and pull of data between integrations. I think that is going to be the next big space.

Nataraj: What about the foundation models you’re using? Are you held in control by their pricing?

David Shim: We are not held in control. Last month, 90% of our processing was our own proprietary models. We use large language models for the last mile—taking the interesting topics and reactions we found and putting them into a readable sentence or paragraph. But we’re building our own models that cluster groups of data together, identify the subject matter, and then we go to the LLMs and say, ‘Summarize this for us.’ 90% of our processing cost is our own internal models. We have five issued patents now with more pending. We’re not just a wrapper layer with good prompts; that’s not a defensible moat.

Nataraj: What do you think about the whole trend of agents? Are we seeing any real use cases?

David Shim: I think it’s early. It’s the same way with voice. Voice is interesting, but if you look at Alexa or Siri, they had massive early scale and then kind of dropped off. It’s an important play, but it’s a feature, not the whole product. With agents, it’s the same thing. It’s not that simple to say, ‘Go search for flights and find me the best one.’ You need to know what to ask for. What dates? Are you using miles? An agent in theory could do that, but you still need to upload the training data. I think the agents working in the background are more likely to succeed, where someone has trained data on how to handle specific use cases. But for the consumer, they’re not going to know what an agent is, just like most consumers don’t know what an API is.

Nataraj: What is the vision for Read.ai for the next couple of years?

David Shim: In the next one to two years, it’s diving further into ‘co-pilot everywhere.’ We’re adding more native integrations with both push and pull capabilities. Where we want to get to is optimization. I’ve got your emails, messages, and meetings. If you’re a seller, as you have each call, I can go into Salesforce and update the probability of a deal from 25% to 50%, then 75%. We can push a draft to the seller saying, ‘We think this opportunity should go to 75%,’ and include the quote from the meeting that justifies it. Now, what was the most hated function for a seller—updating Salesforce—becomes an automated process where they just swipe right or left. That’s the level of optimization people will ask for.

Nataraj: I want to slightly shift gears and talk about your investing. What’s your general thesis?

David Shim: On the venture side, my thesis is if you believe in me enough to invest in my company, I should have the same belief to invest in your VC fund. If you’re a portfolio company, they’ll often give you access for a lower amount. I think every founder should take advantage of that. When I do angel investing, it’s one of two things. One, anyone I know that I’ve worked with before who asks me to invest, I’m more likely to say yes. It’s about giving back that same level of trust. The second is for more interesting opportunities that come up on my radar where I feel they have something novel that can deliver outsized returns.

Nataraj: What do you know about being a founder that you wish you knew when you were starting?

David Shim: At my first company, Placed, I was a solo founder. That is very expensive on your time, stress level, and relationships. You have nobody else to go to. I would say, don’t force it. If you can find co-founders that you trust and work with really well, do it. With Read.ai, my co-founders Elliot and Rob have been incredible. It distributes the work, stress, and knowledge. When you have three really smart people coming back together with different ideas, you can ideate better. So for any founders out there, if the opportunity exists, go with a co-founder versus solo.

From an investor standpoint, outside of your own startup, don’t over-index on anything. Whatever is hot will stay hot for a little bit, but it will almost always drop off. Be careful about over-indexing. A lot of times, just put it in an index fund. The S&P 500 is up 25%—that’s better than most VC IRR on a yearly basis, and it’s liquid.

This conversation offers a masterclass in building a modern AI company, highlighting the importance of a unique technological moat, a powerful product-led growth engine, and a clear vision for the future. David’s journey provides valuable lessons for any founder navigating the AI landscape.

→ If you enjoyed this conversation with David Shim, listen to the full episode here on Spotify or Apple.

→ Subscribe to ourNewsletter and never miss an update.

November 26, 2025
Glean AI Founder Arvind Jain on the Future of Enterprise AI Agents

Arvind Jain, CEO of Glean AI and co-founder of the multi-billion dollar company Rubrik, is a veteran of Silicon Valley’s most demanding engineering environments. After a decade as a distinguished engineer at Google, he experienced firsthand the productivity ceiling that fast-growing companies hit when internal knowledge becomes fragmented and inaccessible. This pain point led him to create Glean AI, initially conceived as a “Google for your workplace.” In this conversation with Nataraj, Arvind discusses Glean’s evolution from an advanced enterprise search tool into a sophisticated conversational AI assistant and agent platform. He dives into the technical challenges of building reliable AI for business, how companies are deploying AI agents across sales, legal, and engineering, and his vision for a future where proactive AI companions are embedded into our daily workflows. He also shares valuable lessons on company building and fostering an AI-first culture.

👉 Subscribe to the podcast: startupproject.substack.com

Nataraj: My wife’s company actually uses Glean, so I was playing around to prepare for this conversation. But for most people, if their company is not using it, they might not be aware of what Glean is and how it works. Can you give a pitch of what Glean does today and how it is helping enterprises?

Arvind Jain: Most simply, think of Glean as ChatGPT, but inside your company. It’s a conversational AI assistant. Employees can go to Glean and ask any questions that they have, and Glean will answer those questions for them using all of their internal company context, data, and information, as well as all of the world’s knowledge.

The only difference between ChatGPT and Glean is that while ChatGPT is great and knows everything about the world’s knowledge, it doesn’t know anything internally about your company—who the different people are, what the different projects are, who’s working on what. That context is not available in ChatGPT, and that’s the additional power that Glean has. That’s the core of what we do. We started out as a search company. Before these AI models got so good, we didn’t have the ability to take people’s questions and just produce the right answers back for them using all of that internal and external knowledge. In the past, I would call ourselves more like Google for your workplace, where you would ask questions, and we’ll surface the right information. But as the AI got better, we got the ability to actually go and read that knowledge and instead of pointing you to 10 different links for relevant content, we could just give you the answer right away. That’s the evolution of how we went from being a Google for your workplace to being a ChatGPT for your workplace. We’re also an AI agent platform. The same underlying platform that powers our ChatGPT-like experience is also available to our customers to build all kinds of AI agents across their different functions and departments and ensure that they’re delivering AI in a safe and secure way to their employees.

Nataraj: You started in 2019 as an AI search company. Now, it feels very natural to build a ChatGPT-like product for enterprise because the value is instantaneous. But why did you pick the problem of solving enterprise AI search back then? It was not the hot thing or an obvious problem. What was your initial thesis?

Arvind Jain: For me, it was obvious because I was suffering from that pain. Before Glean, I was one of the founders of Rubrik. We had great success and grew very fast; in four years, we had more than 1,500 people. As we grew, we ran into a productivity problem. There was one year where we had doubled our engineering team and tripled our sales force, but our metrics—how much code we were writing, how fast we were releasing software—were flatlining. We just couldn’t produce more, no matter how many people we had.

One key reason was that the company grew so fast, and there was so much knowledge and information fragmented across many different systems. Our employees were complaining that they couldn’t find the information needed to do their jobs. They also didn’t know who to ask for help because there was no concept of who was working on what. When we saw this as the number one problem, I decided to solve it. My first instinct as a search engineer was to just go and buy a search product that could connect to all of our hundred different systems. That revealed to us that there was nothing to buy. There was no product on the market that would connect to all our SaaS applications and give people one place where they could simply ask their questions and get the right information. That was the origin. I felt nobody had tried to solve the search problem inside businesses, even though Google solved it in the consumer world. That got me excited. At that time, we were not thinking about building a ChatGPT-like experience; nobody knew how fast AI would evolve.

Nataraj: I think almost pre-ChatGPT no one called AI as AI; it was called ML or some other technical term. I remember watching Google’s Pixel phone launches in 2020-2021, and they were doing a lot of work creating AI-first products very early on. But for some reason, the tragedy is Google is considered as not doing enough with AI. That was a narrative versus experience difference.

Arvind Jain: In 2021, we launched our company to the public and we called ourselves the Work AI Assistant. We didn’t call ourselves a search product because we could do more than search. We could answer questions and be proactive. But it was a big problem from a marketing perspective because nobody understood what an assistant was. Nobody had really seen ChatGPT. It was a big failure, and we rebranded ourselves as a search company. Then, of course, with ChatGPT launching, people realized how capable AI is and that it can really be a companion, which is when we came back to our original vision.

Nataraj: One CEO I spoke with mentioned that when you pick a really hard problem to work on, a couple of things become easier. It’s easier to convince investors because the returns will be very high if you’re successful, and you can attract people who want to solve hard problems. What’s your take on picking a problem when starting a company?

Arvind Jain: I agree with that assessment. It’s not that you’re just trying to pick something super hard to solve as the main criterion. The main criterion still has to be that you add value and build a useful product. I’m always attracted to working on problems that are very universal, where we can bring a product to everybody. I like it both because of the impact you’re going to make and because building a startup is a difficult journey. You have to have something that makes you go through that, and for me, that something is impact—solving a problem that builds a product useful to a very large number of people.

Second, when you think about solving problems, you have to think about your strengths. If you are a technologist, it’s a gift if the problem you’re trying to solve is a difficult one because you’ll be able to build that technology with the best team, and you won’t get commoditized quickly. With Search, I knew how hard and difficult the problem is. That was definitely an exciting part of why I started Glean—I knew that if we solved the problem, it would be a super useful product and a technology that others wouldn’t be able to replicate quickly.

Nataraj: One thing I often see with tools like ChatGPT or Glean AI in the enterprise context is that when you’re working on certain types of data, it’s not enough to be 90% accurate. If I’m reporting revenue numbers to my leadership, I want it to be 99.9% accurate. Can you talk a little bit about the techniques you are using to reduce hallucination?

Arvind Jain: AI is progressing quite quickly. There’s a lot of work that the platforms we use, like OpenAI, Anthropic, and Google, are doing. The models today are significantly different from the models we had last year in terms of their ability to reason, think, and review their own work, giving you more confident, higher accuracy answers. There’s a general improvement at the model layer, which is reducing hallucinations significantly.

Then, coming into the enterprise, none of these models know anything about your company. When you solve for specific business tasks, the typical workflow is that you have a model that is thinking and retrieving information from your different enterprise systems. It uses that as the source of truth to perform its work. It becomes very important for your AI product to ensure that for any given task, you are picking the right, up-to-date, high-quality information written by subject matter experts. Otherwise, you end up with garbage in, garbage out. That is what most people are struggling with right now. They build AI applications, dabble in retrieving information, and then complain to their customers that their data is bad. That’s not the right answer because AI should be smart enough to understand what information is old versus new. As a human, you have judgment. You look for recent information. If you can’t find it, you talk to an expert. AI has to work the same way, and that is what Glean does. We connect with all the different systems, understand knowledge at a deep level, identify what is high quality and fresh, and ensure that models are being provided the right input so they can produce the right output. Our entire company is focused on that.

Nataraj: You mentioned an AI agent platform. What are the typical use cases for which enterprises are creating agents?

Arvind Jain: I’ll pick some key ones across a few departments. For sales teams, much of their time is spent on prospecting and lead generation. You can build a really good AI agent that does that faster and with higher quality than a human in many cases. People have built an agent on Glean where a salesperson says, “I would like to prospect these five accounts today,” and Glean will do a good amount of research, identify the right contacts, and generate personalized outreach messages. Our salespeople then review the work of AI with a thumbs up or thumbs down, and the messages get sent out. They can now prospect at a rate five times greater than before. Similarly, after a customer call, an agent can generate the meeting follow-up with action items and supporting materials, a task that used to take hours.

For customer service, the job is to answer customer questions and help with support tickets. AI is pretty good at that. People have built agents to auto-resolve tickets. For engineering teams, AI can be a really good code reviewer. The Glean AI code review agent is quite popular; it’s the first one to review any code an engineer uploads and can handle basics like following style guides. The use cases are exploding. Last year it was all about engineering and customer support, but now it’s all departments. Legal teams are using a redlining agent that automatically creates the first version of redlines on third-party papers like MSAs or NDAs. It’s a huge time and cost saver. The democratization is happening now.

Nataraj: It feels like a better way to describe agents is as ‘workflow agents,’ similar to Zapier but with an intelligence layer. This can only work if you’re integrated well with different apps, and today every company uses hundreds of SaaS tools. Can you talk about that challenge?

Arvind Jain: You’re spot on. Agents have to work on your enterprise data, use model intelligence to mimic human work, and take actions in your enterprise systems. There’s a strong dependence on your ability to both read information and take actions. The good news for Glean is that we’ve been working on that for the last six and a half years. We have hundreds of these integrations and thousands of actions we can support, which becomes the raw material for building these agents.

It’s interesting how hard it is to get that to work because enterprise systems are very bespoke. One major challenge is security and governance. You can’t have an agent platform where agents just read any data from any system. You have to follow the governance architecture and rules inside the company, like permissions and access control. You have to not only build these integrations but also work upwards from that to handle agent security and ensure you deliver the right data to these agents, not stale or out-of-date information.

Nataraj: We’ve seen a few form factors: the chat bar, then RAG on the engineering side, and now everyone is talking about agents. What is the next form factor or use case you see coming up?

Arvind Jain: One big shift from the initial ChatGPT-like experience, which is very conversational and reactive, is that agents are becoming more proactive. You can build an agent that runs every day or when a certain trigger condition is met. The next big thing I see is AI becoming even more proactive and embedded in your day-to-day life. You won’t think of AI as a tool you go to; it will just come to you when it detects you need help.

Our vision for the future of work is that every person will have an incredible personal companion. A companion that knows everything about you and your work life: your role, your company, your OKRs, your career ambitions, your weekly tasks, your daily schedule. It’s walking with you, listening to every word you say and hear. With all that deep knowledge, it’s ready to help proactively. For example, imagine I’m commuting to work. My companion detects I’m unprepared for my meetings. It knows the commute is 38 minutes, so it can offer to brief me as I drive, summarizing the documents I need to read so I feel prepared for my day. That’s where we are headed. AI is going to become a lot more proactive.

Nataraj: Does that mean Glean is going into cross-platform and cross-application to make us more productive? I can imagine a floating bubble on my mobile where I can just hit a button and narrate a task.

Arvind Jain: Absolutely. We already have these different interfaces. Glean works on your devices—we have an iOS app and an Android app—and it gets embedded in other applications. If you’re building the world’s best assistant or companion for everybody at work, you have to travel with them. From a form factor perspective, you’re going to see more interesting devices, whether it’s a smartwatch or a smart pen. Our goal would be to make sure we’re running on them.

Nataraj: I want to shift gears and talk about the business. You mentioned a marketing failure pre-ChatGPT, then a rebrand. Now that you’re a fast-growing company, with AI increasing productivity, does that mean you’re hiring less? If you had X salespeople at Rubrik, are you hiring fewer now for the same level of growth?

Arvind Jain: First, a company is a group of people building something together. I firmly believe the scale of your business is proportional to the number of people you have. I don’t personally believe I can have a five-person company and generate a billion dollars. The productivity per employee is going to grow at a relatively linear pace. It’s just that to survive as a company, you have to do 10 times more work than you did before with the same number of people, because everyone is benefiting from AI.

You have to be able to build products and experiences we couldn’t dream of before. You shouldn’t be thinking, “Can I have fewer people?” You have to think, “How do I achieve more with the number of people I can absorb?” You don’t have a choice. If you deliver the same kind of products as pre-AI, you won’t survive. We are growing very fast and investing in our people. We fundamentally believe the larger we are, the more we’ll be able to do. But at the same time, I’m a minimalist. I always try to ensure we are enabling every employee with the right tools and that they are fully capitalizing on AI to deliver way more than expected in the pre-AI world.

Nataraj: What does it mean to be more AI-first? Do you do more AI education or align incentives?

Arvind Jain: We started by just talking about the importance of AI in town halls. I don’t think we saw the results because people were too busy. Then we tried setting goals like “get 20% more productive,” which was a complete failure. Our third iteration was to just do one thing with AI. We don’t care about the ROI; just show that you’re trying to learn and get one meaningful thing done. That’s the top-down approach. From a bottom-up perspective, we allow people to bring in the right AI tools and we celebrate wins. We created a program called “Glean on Glean.” Every new hire, for their first month, ignores their hired role and instead plays with AI tools to build one workflow or agent. It’s been very effective, especially for new grads who don’t know the traditional way of working and are more well-versed with AI.

Nataraj: What are one or two metrics you consistently watch that tell you whether you’re going in the right direction?

Arvind Jain: For us, number one is customer satisfaction. We look at user engagement—how often our users use the product on a daily basis. That’s the most important metric. Number two, on the product side, we look at the type of things people are trying to do with it and if that set is expanding. For example, are more people becoming creators on Glean and building different sets of agents? From the business side, we look at standard metrics like retention rate and tracking our pipeline for demand. But as a CEO, probably the most important thing to watch is how our organization is feeling internally. What are the signs from the team? Are we ensuring we have mission alignment? Are people committed and motivated? Are we creating the right environment for them to grow and succeed? Those are the top-of-mind things for me.

This conversation with Arvind Jain offers a clear look into how enterprise AI is moving beyond simple chat interfaces to create tangible value through sophisticated workflow agents. His insights provide a roadmap for how businesses can leverage AI to solve core productivity challenges.

→ If you enjoyed this conversation with Arvind Jain, listen to the full episode here on Spotify, Apple, or YouTube.
→ Subscribe to our Newsletter and never miss an update.

November 23, 2025
Decagon’s Ashwin Sreenivas: Building a $1.5B AI Support Giant

At just under 30, co-founders Jesse Zhang and Ashwin Sreenivas have built Decagon into one of the fastest-growing AI companies, achieving a $1.5B valuation in just over a year out of stealth. Backed by industry giants like Accel and Andreessen Horowitz, Decagon is redefining enterprise-grade customer support with its advanced AI agents, earning a spot on Forbes’ prestigious AI 50 list for 2025. In this episode of the Startup Project, host Nataraj sits down with Ashwin to explore the secrets behind their explosive growth. They discuss how Decagon moved beyond the rigid, decision-tree-based chatbots of the past by creating AI agents that follow complex instructions, how they found product-market fit by tackling intricate enterprise workflows, and the company’s long-term vision to build AI concierges that transform customer interaction.

👉 Subscribe to the podcast: startupproject.substack.com

Nataraj: So let’s get right into it. What is Decagon AI? What does the product do, and talk a little bit about the technology behind Decagon.

Ashwin Sreenivas: You can think of Decagon as an AI customer support agent. For our customers, Decagon talks directly to their customers and has great conversations with them over chat, phone calls, SMS, and email. Our goal is to build these AI concierges for these customers. This idea of AI for customer support isn’t necessarily new; you’ve had chatbots for 10 years now, probably. But the thing that’s really different this time is if you look at the chatbots from as late as three or four years ago, it wasn’t a great experience. The reason it wasn’t a great experience is because you had these decision trees that everybody had to build, and it was a pain to build them and a pain to maintain. From a customer perspective, if you have a question or a problem that is one degree off from the decision tree that was built out, it was completely useless. That’s when you have people saying, “agent, agent, agent.” The thing that’s changed, and a lot of the core of what we’ve built, is a way to train these AI agents like humans are trained. Humans have standard operating procedures that they follow, and our AI agents have agent operating procedures that they follow. We’re able to essentially build these AI agents that can have much more fluid, natural conversations like a human agent would.

Nataraj: Talking a little bit about the products, you mentioned chat, phone calls, emails. Do you have products for everything? If a company is coming to adopt Decagon, are they first starting with chat and then expanding to everything else? How does the customer journey look?

Ashwin Sreenivas: This is actually very driven by our customers. For a lot of the more tech-native brands, think like a Notion or a Figma, you would never think about picking up the phone and calling them. You’d want to chat or email. Whereas some of our other customers like Hertz, you don’t really email Hertz. If you need a car, you’re going to call them up on the phone. So a lot of our deployment model is guided by our customers and how their customers want to reach out to them. Typically, most customers start with the method by which most of their customers reach out, and then they expand to all the other ones. It’s very common to start with chat and then expand to email and voice, or start with voice and expand to chat and email.

Nataraj: I want to double-click on the point you mentioned about the decision tree model. I think around 2015, during the Alexa peak, everyone was building chatbots. I remember the app ecosystem where you had to build apps on Alexa or Microsoft’s Cortana. Conversational bots were the hype for two or three years, but they quickly stagnated when we realized all we were doing was replicating the “press one for this, press two for this” system on a chat interface. You define a decision tree, and anything outside of that is basically an if-else command line that ends with a catch-all driving you to a human. There are obviously a lot of players in customer support with existing tools. Do they have a specific edge on creating something like what Decagon is doing because of their existing data?

Ashwin Sreenivas: No, I actually think, interestingly enough, because these customer service bots went through a few generations of tech, the tech is different enough that you don’t get too much of an advantage starting with the old tech. In fact, you start with a lot of tech debt that you then have to undo. Let’s say 10 years ago, you had to start with explicit decision trees where you program every single line. Then about five years ago, you had the Alexas of the world. It was a little bit of an improvement, but essentially all it did was allow a user to express what they want. They could say, “I want to return my order,” and the models were good at detecting intent—classifying a natural language inquiry into one of 50 things it knew how to do. But beyond that, everything became decision trees. The thing is now with these new models, because you have so much flexibility and the ability for them to follow more complex instructions and multi-step workflows, you can actually rebuild this from the ground up. It’s not just classifying an intent and then following a decision tree; we want the whole thing to be much more interactive for a better user experience. We had to rebuild it to ask, how does a human being learn? You have standard operating procedures. You say, “Hey, if a customer asks to return their order, first check this database to see what tier of customer they are. If they’re a platinum customer, you have a more generous return policy. If they’re not, you have a stricter one. You need to check the fraud database.” You go through many of these steps and then work with the customer. The core of what we’ve done is build out AI agents that can follow instructions very well, like a human does.

Nataraj: This whole concept of AOPs (Agent Operating Procedures) that you guys introduced is very fascinating. You mentioned SOPs, which humans read, and then you have AOPs, which is sort of a protocol for the agent. Who is converting the SOP into an AOP? How easy is it to create this agent? Are you giving a generic agent that adapts to a customer’s SOP, or do I as a customer have to build the agent?

Ashwin Sreenivas: The core Decagon product is one agent that is very good at following instructions and AOPs. We built this for time to value. If you have to train an agent from scratch for every single customer, it’s going to take a lot of time for that customer to get onboarded. And two, it’s very difficult for that customer to iterate on their experiences. If you build one agent, like a human, that’s very good at following instructions, they can come to that customer and say, “Here are the instructions I need to follow,” and you can be up and running immediately. In terms of how these AOPs are created, most customers tend to have some set of SOPs already, and AOPs are actually extremely close to these. The only thing you need to change is you need to instruct it on how to use a company’s internal systems. It’s 99% English, and then there are a few keywords to tell it, “At this point, you need to call this API endpoint to load the user’s details,” or “At this point, you need to issue a refund using the Stripe endpoint.” That’s the primary difference from SOPs.

Nataraj: If you talk about the technology stack, are you using external models, or are you training your own models? What is the difference between a base model and what you’re delivering to a customer?

Ashwin Sreenivas: We spend a lot of time thinking about models. We do use some external models, but we also train a lot of models in-house. The reason is, if you’re using external models, most of what you can do is through prompt tuning, and we found that models are only so steerable with just prompt tuning. We’ve spent a lot of time in-house taking open-source models and fine-tuning them, using RL on top of them, and using all of these techniques to steer them. To get these models to follow instructions well, you have to decompose the task. A customer comes in with a question, and I have all of these AOPs I could select from. The first decision is, is any of these AOPs relevant? If a user is continuing the conversation, are they on the same topic or should I switch to another AOP? At every step, there are a hundred micro-decisions to make. A lot of what we do is break down these micro-decisions and have models that are very, very good at each one.

Nataraj: The industry narrative has been that only companies with very large capital can train models. Are you seeing that cost drop? When you mentioned you’re training open-source models, is that becoming more accessible?

Ashwin Sreenivas: We’re not pre-training our models from scratch. We take open-source models and then do things on top of those. The thing that has changed dramatically is that the quality of the open-source models has gotten so good that this is now viable to do pretty quickly.

Nataraj: Which models are better for your use case?

Ashwin Sreenivas: We use a whole mix of models for different things because we found that different base models perform differently for different tasks. The Google Gemma models are great at very specific things. The Llama models are great at very specific things. The Qwen models are great at very specific things. Even for one customer service message that comes in, it’s not one message going to one model. It’s one message going to a whole sequence of models, each of which is good at doing different things to finally generate the final response.

Nataraj: It’s often debated that as bigger models like GPT-5 or Gemini improve, they will gain the specialized capabilities that smaller, fine-tuned models have. What is the reality you’re seeing?

Ashwin Sreenivas: I would push back against that argument for two reasons. Number one, while the bigger models will all have the capabilities, the level of performance will change. If you have a well-defined task, you can have a model that’s 100 times smaller achieve a higher degree of performance if you just fine-tune it on that one task. I don’t want it to code in Python and write poems; I just want it to get really good at this one thing. When measured on that one task, it will probably outperform models a hundred times its size. Number two, which is equally important, is latency. A giant model might take five seconds to generate a response. A really small model cuts that time by a factor of 10. Over text, five seconds might not matter, but on a phone call, if it’s silent for five seconds, that’s a really bad experience. For that, you have to go toward the smaller models.

Nataraj: Can you talk about why you and your co-founder picked customer service as a segment when you decided to start a company?

Ashwin Sreenivas: When we started this company, it was around the time when GPT-3.5 Turbo and GPT-4 were out. We were looking at the capabilities and thought, wow, this is getting just about good enough that it can start doing things that people do. As we looked at the enterprise, we asked, where is there a lot of demand for repetitive, text-based tasks? Number one was customer support teams, and number two was operations teams. As we talked to operations leaders, the number one demand was in customer support. They told us, “Look, we’re growing so quickly, our customer support volume is scaling really quickly, which means we need to hire a lot more people, and we can’t afford to do that. We are desperate.” Initially, it looked like a very crowded space, but as we talked to customers, we found it was crowded for smaller companies with simple tasks, where 90% of their volume was, “I want to return my order.” But for more complex enterprises, there wasn’t anything built that could really follow their intricate support flows. That was the wedge we took—to build exclusively for companies with very complex workflows. The other thing that was interesting was our long-term thinking. If you build an agent that can instruction-follow very well, you enable businesses to eventually grow this from customer support into a customer concierge.

What I mean by that is, let’s say you want to fly from San Francisco to New York. You go to your favorite airline’s website, type in your search, and it gives you 30 different flights to pick from. That’s a lot of annoying steps. A much better experience would be to text your airline and say, “I want to go to New York next weekend.” An AI agent on the other side knows who you are, your preferences, and your budget. It looks through everything and says, “Hey, here are two options, which one do you like?” This AI agent also knows where you like to sit and says, “By the way, I have a free upgrade available for you. Is that okay?” You say yes, and it says, “Booked.” The big difference is this is a much more seamless experience. Most websites today shift the burden of work onto the user. Now, it shifts to a world where you express your intent to an AI agent that then does the work for you. That was a really interesting shift for us. Building these customer support agents is the first step to building these broader customer concierges.

Nataraj: How did you acquire your first five customers? What did that journey look like?

Ashwin Sreenivas: Early customer acquisition is always very manual. There’s no silver bullet. It’s just a lot of finding everyone in your networks, getting introductions, and doing cold emailing and cold LinkedIn messaging. It’s brute force work. But the other thing for us is we never did free design pilots; we charged for our software from day one. This doesn’t mean we charged them on day one of the contract. We’d typically say, “There’ll be a four-week pilot. At the end of four weeks, we’ll decide upfront if you like it, this is what it’s going to cost.” We never had an open-ended, long-term period where we did things for free because, in the early days, the number one thing you’re trying to validate is, am I building something that people will pay money for? If it’s truly valuable, you should be able to tell your potential customer, “Hey, if I accomplish A, B, and C, will you pay me this much in four weeks?” If it’s a painful enough problem, they should say yes. This helped us weed through bad business models and bad initial ideas quickly.

Nataraj: What business impact and success metrics do your customers look at when using Decagon?

Ashwin Sreenivas: Customers think about value in two ways primarily. One is what percentage of conversations we are able to handle ourselves successfully—meaning the user is satisfied and we have actually solved their problem. If we can solve a greater percentage of those, fewer support tickets ultimately make their way to human agents, who can then focus their time on more complicated problems. The second benefit, which was a little counterintuitive, was that a lot of these companies expanded the amount of support they offered. It’s not that companies want to minimize support; they want to give as much as they can economically. If it cost me $10 for one customer interaction and all of a sudden that becomes 80 cents, I’m not just going to save all that money. I’m going to reinvest some of that in providing more support. We’ve noticed that their end customers actually want that increased level of support. So now, instead of phone lines being open only from 9 a.m. to 5 p.m., it becomes 24 hours a day. Instead of offering support only to paid members, we offer support to everybody. There’s this latent demand for increased support, and by making it much cheaper, businesses can now offer more. At the end of the day, this leads to higher retention and better customer happiness.

Nataraj: You also have support for voice agents, which is particularly interesting. What has the traction been like? Do customers realize they’re talking to an AI?

Ashwin Sreenivas: In general, all our voice agents say, “Hi, I’m a virtual agent here to help you” or something like that. But the other interesting thing is most customers calling about a problem don’t want to talk to a human; they want their problem solved. They don’t care how, they just want it solved. For us, making it sound more human is not about giving the impression they’re talking to a human; it’s to make the interaction feel more seamless. You want responses to be fast. At the end of the day, the primary goal is, how can we solve the customer’s problem? Even if the customer is very aware they’re talking to an AI agent, but that agent solves their problem in 10 seconds, that’s a good experience. Versus talking to a human who takes 45 minutes, which is a bad experience. We have several customers now where the NPS for the voice agents is as good or higher than human agents because if the AI agent can solve their problem, it solves it immediately. And if it can’t, it hands it over to a human immediately. Either way, you end up having a reasonably good experience.

Nataraj: Has there been a drop in hiring in support departments? Are agents replacing humans or augmenting them?

Ashwin Sreenivas: It really depends on the business. If AI agents can handle a bigger chunk of customer inquiries, you can do one of three things. One, you can handle more incoming support volume. You put it on every page, you give support to every member, you do it 24 hours a day. Your top-line support volume will go up, but your customers have a better experience, and you can keep the number of human agents the same. Other people might say, “I’m going to keep the amount of customer support I do the same. There are fewer tickets going to human agents, so now I can have those agents do other higher-value things,” like go through the high-priority queue more quickly or move to a different part of the operations team.

Nataraj: Can you talk about the UX of the product? People have different definitions of agents. What kind of agent are we talking about here?

Ashwin Sreenivas: Interacting with Decagon is exactly like interacting with a human being. From the end user’s perspective, it’s as though they were talking to a human over a chat screen or on the phone. Behind the scenes, the way Decagon works is that each business has a set of AOPs that these AI agents have access to. The AOPs allow the agents to do different things—refund an order, upgrade a subscription, change billing dates. The Decagon agent is just saying, “Okay, this question has come in. Do I need to work through an AOP with the customer to solve this problem?” And it executes the AOPs behind the scenes.

Nataraj: Before your product, a support manager would look at their team’s activities. How does that management look now on your customer’s side?

Ashwin Sreenivas: There’s been an interesting shift. Rather than training new human agents, I’ve trained this AI agent once, and now my job becomes, how can I improve this agent very quickly? We ended up building a number of things in the product to support this. If the AI agent had one million conversations this month, no human can read through all of that. We had to build a lot of product to answer, what went well? What went poorly? What feedback should I take to the rest of the business? What should I now teach the agent so that instead of handling 80% of conversations, it can handle 85%? The primary workflow of the support manager has changed from supervising to being more of an investigator and agent improver, asking, “What didn’t go well and how can I improve that?”

Nataraj: Are the learnings from one mature customer flowing back into the overall agent that you’re building for all companies?

Ashwin Sreenivas: We don’t take learnings from one customer and apply them to another because most of our customers are enterprises, and we have very strong data and model training guarantees. But the learning we can take is what kinds of things people need these agents to do. For instance, we learned early on that sometimes an asynchronous task needs to happen. Decagon didn’t have support for that, so we realized that use case was important and extended the agent to be able to do tasks like that. It’s those kinds of learnings on how agents are constructed that we can take cross-customer. But for a lot of these customers, the way they do customer service is a big part of their secret sauce, so we have very strong guarantees on data isolation.

Nataraj: How are you acquiring customers right now?

Ashwin Sreenivas: We have customers through three big channels. Number one is referrals from existing customers. Support teams will often say, “Hey, we bought this thing, it’s helping our support team,” and they’ll tell their friends at other companies. Number two is general inbound that we get because people have heard of Decagon. And three, we also have a sales team now that reaches out to people and goes to conferences.

Nataraj: Both you and your co-founder had companies before. How did the operating dynamics of the company change from your last company to now? Did access to AI tools increase the pace?

Ashwin Sreenivas: A lot of things changed. For both of our first companies, we were both first-time founders figuring things out. I think the biggest thing that changed was how driven by customer needs we were. We didn’t overthink the exact right two-year strategy or how we were going to build moats over three years. We said, the only thing we’re going to worry about now is, how do we build something that someone will pay us real money for in four weeks? That was the only problem. That simplifies things, and we learned that all the other things you can figure out over time. For instance, with competitive moats, when we sold a deal in the early days, we would ask, “Why did you buy us?” They would tell us, “This competitor didn’t have this feature we needed.” And we were like, great, so we should do more of that because clearly this is valuable.

Nataraj: It’s almost like you just listen to the market rather than putting your own thesis on it.

Ashwin Sreenivas: Yeah. I think there was a very old Marc Andreessen essay about this: good markets will pull products out of teams. The market has a need, and the market will pull the product out of you.

Nataraj: What’s your favorite AI product that you use personally?

Ashwin Sreenivas: I use a number of things. For coding co-pilots, Cursor and Supermaven are great. For background coding agents, Devin is great. I like Granola for meeting notes. I used to hate taking meeting notes, and now I just have to jot down things every now and then. I think that captures most of what I do because either I’m writing code or talking to people, and that has become 99% of my life outside of spending time with my wife.

Nataraj: Awesome. I think that’s a good note to end the conversation. Thanks, Ashwin, for coming on the show.

Ashwin Sreenivas: Yeah, great being here. Thanks for having me.

This conversation with Ashwin Sreenivas provides a masterclass in building a category-defining AI company, highlighting the power of focusing on genuine customer pain points and the massive potential for AI to create more seamless, personalized business interactions. His insights reveal a clear roadmap for how AI is moving from simple automation to becoming a core driver of customer experience.

If you enjoyed this conversation with Ashwin Sreenivas, listen to the full episode here on Spotify, Apple or YouTube.
Subscribe to our newsletter: startupproject.substack.com

November 22, 2025
Web AI Founder David Stout on Offline AI & the Edge Computing Revolution

As the artificial intelligence landscape becomes increasingly dominated by massive, cloud-based models, some innovators are looking in the opposite direction. David Stout, founder of Web AI, is a leading voice in the movement to bring advanced AI directly onto everyday devices. In a recent conversation, Stout details his journey from a farm in Michigan to pioneering the infrastructure for offline AI, a technology that prioritizes privacy, efficiency, and user ownership. Valued at over $700 million, Web AI is challenging the status quo with its vision of a decentralized “web of models”—millions of specialized AI systems working together across phones, laptops, and other hardware. This approach not only keeps sensitive data secure but also unlocks real-time AI capabilities in environments where cloud connectivity is impossible or impractical, from aircraft maintenance to personal health monitoring. Stout’s perspective offers a compelling look at a more distributed, accessible, and secure future for artificial intelligence.

→ Enjoy this conversation with David Stout, on Spotify or Apple.
→ Subscribe to our newsletter and never miss an update.

Nataraj: To set the context, could you give us a brief overview of your journey in the field of AI? When did you start working in AI or machine learning, and what was your journey like before founding Web AI?

David Stout: My background, as you mentioned, I grew up on a farm. When I was studying it, AI was very much vapor; machine learning was the actual field of study. NLP was progressing, but it was very early, even in regards to convolutional neural nets. I think this is important because my research started in a very much yet-to-be-defined space that was incredibly esoteric. There was no LLM to help you research; there were no AI tools. This was very much first-principles design.

We were looking at ways to bring convolutional networks like Darknet and YOLO to low-energy devices. At the time, these object detection or computer vision models were some of the most sophisticated and heaviest in terms of compute. They showed the most promise, in my opinion, of being truly disruptive. Having visual intelligence in spaces was going to be incredibly powerful. My research started there, and I was able to bring some of the best computer vision, object detection, and masking models to devices like iPhones and their Bionic chips.

Nataraj: And this was through your research at Stanford or at Ghost AI?

David Stout: This was through Ghost AI at the time, right around when I dropped out of school and started pursuing this full-time. We were bringing Darknet models to an iPhone. This got the attention of a lot of outside investors and technologists because it was the first of its kind. There was no TensorFlow Lite or PyTorch Lite tools that were bringing AI frameworks to devices. We wrote the whole thing from scratch, talking directly to shaders and primitives using the MPS framework on these devices. What we found, as in any moonshot, is you discover other things along the way. We realized that to bring these models to devices, we were discovering incredible compression and architecture techniques. This ultimately led to WebFrame today, which is our own in-house AI library and framework. Those early days mattered because it shaped what we ended up building. We had this desire to run models at the edge because, in computer vision specifically, if you didn’t have real-time AI processing, it was a null use case. Computer vision in the cloud is not super interesting. That’s where we started to really understand the value of AI at the edge.

Nataraj: What applications that we see today in the wild are a result of these efforts?

David Stout: The research of getting models to devices is continuing to play out; it’s not done by any means. A lot of these examples are still referencing a cloud model. Not a lot is happening on the device still. But yes, you are seeing basic object detection. A good example would be Photos on an iPhone. That’s running on-device, and you’re able to search and query basic object states or titles or names and index things. There are also modes on the iPhone in the magnifier that let you detect objects you’re looking at, and the audio kit will turn on. If you have a vision impairment, the object detector in real time will talk to you and tell you what’s in front of you. I think those are examples of some of that early work in the industry, but there’s still been a tremendous amount of focus on cloud AI. We’re seeing a lot more now in the private sector with people we’re working with where it’s multimodal, which we think is the ultimate paradigm.

Nataraj: You were working on compressing models onto devices, and in 2019 you started Web AI. What was the thesis for Web AI, and how has it changed over time?

David Stout: I started the company on three pillars. I’m a simple thinker when it comes to business and strategy; I wanted to know the utility value. I thought this cloud arbitrage is not going to work. This idea of big data and cloud compute is going to flip the whole cost structure upside down and is not super promising for AI in regards to individual ownership. It felt like we were going to copy the internet era and reproduce all the mistakes we made there. The thesis for Web AI, when we founded it in January of 2020, was: if we could bring AI to devices and run it privately in a way that a user or enterprise owns it, would people pay for that? Would people want that? If you could serve AI on a device and bring world-class intelligence and put it in someone’s pocket, is that valuable? The simple answer is yes, it’s worth pursuing.

The second question is what kind of use cases could you unlock that the alternative would be unable to do? A simple way to look at this is you have companies with IP-centric data that they can’t share with a foundational model. You have companies with regulated data they can’t share. And you have use cases that require real-time, no-latency decision-making that can’t go up to the cloud. These problems require an AI solution that lives in the environment, that they can directly engage with, and that’s state-of-the-art. That’s really the problem we were solving.

Nataraj: General audiences often think in terms of large models, especially post-ChatGPT. But before that era, it was all specialized models. When you identified these factors, what was your initial approach to productizing this? How did you focus, because the field is so wide?

David Stout: It is very wide. Actually, our strategy was the inverse. We said we need to be as horizontal as possible. We need to own the tooling, the methodologies, the frameworks, the communications. We don’t need to own the model. We want to be the pickax and shovel of an industry rather than be the best medical model company, and that’s all we do. The reason for that, and I think it’s played out quite like we thought it would, is that so many VCs told me, ‘You guys have great technology, you should just focus on one industry.’ We disagreed for the fundamental reason that we’re seeing now: if you’re not horizontal as a tech stack, you’ll get steamrolled by these incredibly smart, powerful foundational model companies. If you’re building an app focused on coding, I still think you’re at great risk of just getting steamrolled. I just don’t see how those companies have long-term staying power when the model that they rely so heavily on is not theirs. We decided to focus on the tools that made the models great, the way to retool these models so they could run anywhere, the connective tissue that lets the model talk to another model, to a device, to a person. And we will enable our customers to interact with their data with these models and make them better. That’s our staying power. We support everything from vision models to multimodal models across the ecosystem, with the idea that the platform is designed to be horizontal and not a point solution.

Nataraj: What type of customers use your product today? Can you give a couple of use cases to crystallize where Web AI plays a role?

David Stout: We work in industries where there’s highly contextual data that is not on the internet. It’s not on Reddit, whether it’s working on an airplane engine or with individual personal health data. It’s data that does not exist on the web that needs to be navigated, trained on, and personalized for each of these users to drive real results. We work with the Oura Ring, if you’re familiar with them. Additionally, we are working with major airline companies and aircraft manufacturers to improve maintenance as well as assembly. And outside of that, we are working with the public sector on all sorts of use cases that require AI to work anywhere, not just in a data center stateside. The ubiquitous connective tissue in all of our customers is they have data that no one else has. They operate in a privacy-mission-critical environment where data cannot go somewhere else, it needs to be highly accurate, highly performative, and it needs to operate at the edge.

Nataraj: I haven’t seen a lightweight personal model that exists on my machine yet. Is that model not possible, or why haven’t we seen that kind of experiment from any company?

David Stout: I think we haven’t seen it because the models that are easy to ship to devices are bad. People have become accustomed to a certain performance and intelligence capability. Web AI is actually in October releasing what you’re describing. We’re releasing our first-ever B2C solution, which is that: download it, run it on your machine, run it on your phone. Why it’s taken us time is we had to make some architecture changes so we have a great model that’s performative, that’s not disappointing, and that runs and lives on your phone. That’s a hard problem to solve. It’s always been easier to just have the cloud do it. I think a lot of companies are hitting the easy button on this one and just using the cloud. It works from a functioning perspective, it absolutely works; it’s just astronomically expensive and inefficient. The AI companies that are popular today are really focused on trying to solve the super-intelligence problem rather than solving the actual unit economics, monetization, and privacy problems. These tools will be valuable for users because now they have something that’s private that they own. It lives on their device, it’s personalized, and it’s ultimately safe.

Nataraj: There’s so much spending going on in data centers, rationalized by the argument that this will lead to something that looks like AGI. What are your thoughts on the trajectory of these foundational model companies and AGI?

David Stout: If we’re fairly pragmatic about what we’ve seen, there’s this common consensus that it will keep scaling, models will get better, and we’re going to steamroll everyone with the best model. My problem with that is the empirical evidence we have right now doesn’t say that. Pre-training, in all senses, is pretty much proven to be flattening. GPT-5 is an MoE, a model router. It’s a lot of post-model work. Most of the gains we’re seeing are post-training. For the last several iterations of these models, the majority of the advancement is happening post-training, which would indicate that we are hitting a plateau on this idea of training continuing to scale. I think we’re tremendously overbuilding. We have an energy problem, a water problem. It’s so early. I’m not a big believer in the long tail of the transformer architecture. To build all these data centers when we don’t even know the architecture… it’s questionable. For me, what makes the most sense is this idea that civilization is the only example of super-intelligence. You have groups of people with different contexts, talents, and abilities that build incredible things. We don’t have any example of singular super-intelligence. What I would say is much more likely is we see super-intelligence come out of millions and billions of contextual models that are living across the world as a compute dust that’s everywhere. That statement is far less risky than the one we’re talking about in parallel, which is, ‘I’m going to figure out a way to train this one model, it’s going to solve everything, and it’s going to be AGI.’ The civilization approach is not only theoretically accurate, but nature and science have demonstrated it to be true.

Nataraj: I was watching your talk where you had this very interesting line: ‘Prompts don’t pay bills.’ Can you elaborate on that?

David Stout: These companies have created bad habits. Prompting is horrible for their business model. They need to be proactive; they want to get prompting out of their business. Every question costs them money. It’s not the same model as internet companies, where a user coming to your website is a dollar sign. With OpenAI, when you log in and ask a question, you’re cutting into their profits. That’s a challenging business to be in. The philosophy of ‘prompts don’t pay the bills’ is about how we create AI interactions that are precognitive, working on behalf of the user so the user doesn’t have to ask another question. This supports the distributed model architecture as well. When you create an AI application on a foundational model, you use a system prompt to tell the model how to behave. Fundamentally, you’re telling the model to be smaller. You’re saying, ‘Be a doctor, answer this way, don’t talk about race cars.’ What Web AI would say is you just want a doctor model. And the doctor model is going to be far better than a system prompt model pretending to be a doctor. That’s how you get to super-intelligence: you have millions of models that are category-leading. They aren’t prompted to behave a certain way; they just *are* a certain way. This is why the internet beat mainframes.

Nataraj: What do you think about what XAI is doing? I feel like they are result-maxing for the leaderboards, but I don’t see XAI being used much in real applications.

David Stout: I think everyone trains towards leaderboards. You’ve seen the party games where people wear a sign on their head and have to guess who they are. AI is doing the same thing with a benchmark. When you train around a benchmark, you eventually realize what the benchmark is. That’s all that’s happening. A really interesting example we saw personally: we trained on open-source F-18 data for the Navy and ran a retrieval task against it. We got about 85-90% accuracy on a really complex maintenance manual. We did the same exercise with GPT-5, and it was 15% less accurate than our Web AI system. What was interesting is on the open QA benchmark, OpenAI was only seven points lower than us. So on the leaderboard, it seemed like we were far closer in performance, but in practical application, the delta is always a little bit bigger. I think the leaderboard is a little irrelevant to what’s actually happening.

Nataraj: We are almost at the end of our conversation. Where can our audience find you and learn more about Web AI?

David Stout: I’m on Twitter, @DavidStout. We’ve got a lot of new announcements coming out. We just released two new, really significant papers. We’ll be sharing more in our fall release, with several new products that will be available for users the day of the announcement. You can get more information on our website and on social media. I’m really thankful for the opportunity to come on and talk and learn from you.

David Stout’s insights offer a compelling vision for a future where AI is not a monolithic entity in the cloud, but a distributed, personalized, and private tool running on our own devices. This conversation highlights the practical and philosophical shift towards an accessible and secure AI ecosystem.

→ If you enjoyed this conversation with David Stout, listen to the full episode here on Spotify or Apple.
→ Subscribe to our Newsletter and never miss an update.

October 13, 2025
Fixing Broken Meetings & Managing Calendars with AI | Matt Martin

In an era where back-to-back meetings and fragmented schedules are the norm, how can teams reclaim focus time and achieve deep work? Matt Martin, co-founder and CEO of Clockwise, is tackling this problem head-on with an AI-powered calendar assistant designed to create smarter schedules. In this conversation with Nataraj, Matt delves into the complexities of modern work, from the “maker versus manager” schedule conflict to the surge in meetings post-pandemic. He offers his perspective on the evolving SaaS landscape, the real-world impact of AI agents, and why many new tools feel half-baked. Matt also provides a look inside Clockwise, explaining how they leverage AI to not only optimize individual calendars but to orchestrate entire organizational workflows, ultimately giving teams back their most valuable asset: time. This discussion is essential for anyone interested in the future of work, productivity, and the practical application of AI.

→ Enjoy this conversation with Matt Martin, on Spotify or Apple.

→ Subscribe to our newsletter and never miss an update.

Nataraj: To get started, can you describe to the audience what Clockwise is and how your customers use it?

Matt Martin: At its core, Clockwise is a very advanced scheduling brain. We connect to your calendar, whether that’s Google Calendar or Outlook, and you can use it as an individual. We start to analyze your calendar when you connect it, understanding the cadence of your meetings, when you tend to work, your working hours, and when you like to take breaks. We ask you a few questions to get to know you a little bit better. Based on that information, we start giving you suggestions on how to optimize your schedule for more time for high-impact work. Where Clockwise really hits its groove is when you start to use it among a larger group of people. Clockwise can look at the interconnection between you, other attendees, their preferences, and how to optimize calendars holistically. We do this at scale for some of the best companies in the world, like Netflix, Uber, and Atlassian, where we help optimize schedules for almost the whole company or complete engineering departments to give more time for high-impact work, meet with the right people, and have a sane work life.

Nataraj: You are living in this world of calendars and meetings. This reminds me of an instance a couple of years back when Shopify CEO Tobi sent out a memo saying you can cancel any meeting you want and we want to reduce the number of meetings happening in our organization. What is your general take on the frequency or number of meetings happening in a company? What trends are you seeing in how companies are optimizing their meetings?

Matt Martin: In a lot of ways, Clockwise goes all the way back to a famous article by Paul Graham called “Maker’s Schedule, Manager’s Schedule.” The reflection in his article was that, often inside software engineering organizations, the two modes of operation conflict. The managers control the schedules because they’re setting the cadence of meetings—syncs, standups, one-on-ones, team meetings—and they get a lot of their productivity done in meetings. Whereas for makers, people like software engineers and designers, they need large chunks of time to go heads-down on a project and get in flow to be able to tackle things. The first thing I would observe is that different people have different demands on their schedule, so there’s not really a one-size-fits-all here. I love Tobi’s memo because I think it’s always a good idea to clean out the cruft on a regular cadence and reset the baseline because things build up over time. But I would also caution that meetings aren’t inherently bad; it’s just another way of collaborating with peers and making sure you can get your work done. The question is, what are you trying to accomplish and who is the audience for it?

There are some almost gravitational forces when it comes to meetings. One is that we’ve seen in our data that the larger the company gets, the higher percentage of time people tend to spend in meetings. As you have more people in your orbit, the cost of collaboration and coordination goes up. Another thing that happened is when COVID hit, the quantity of meetings spiked way up because as people went remote and hybrid, they were trying to figure out how to replace a lot of the content of an in-office environment with meetings. That subdued a little bit, but it never came all the way back down. There’s an overhang from companies going remote, and even today, you do see some split between in-office companies and remote companies in terms of volume of meetings.

Nataraj: Is there any interesting trend? One of the things that happened after COVID, for me at least, is an increase in non-scheduled meetings. You just have a question and you all get on a call spontaneously, sort of replicating the hallway chat remotely. Do you have any statistics on a spike in those and how they’re doing right now?

Matt Martin: That tends to be one of the sources of the split between remote and in-office because when you’re in the office, those conversations still happen, they just don’t get recorded formally on the calendar. If you’re remote, you do have to reach out. There are informal ways to do that, like a quick Slack huddle, or you could move some of that to asynchronous conversation, which is a good pattern. But one of the phenomena is just that there’s a shift in the medium. Instead of bumping into someone in the hallway or going over to their desk, you have to find a Zoom meeting or schedule something on Google Meet. The frequency goes up. The amount of time spent in synchronous conversation, however, doesn’t actually vary as much with remote or in-office because it’s just a different type of synchronous conversation. It depends a lot on the culture. At a place like Apple, where it’s not uncommon for software engineers to have their own dedicated private offices, that sort of synchronous conversation in the office is much lower than a place that’s a wide-open office environment.

Nataraj: Clockwise started before ChatGPT and all the LLM mania started. It feels to me that there’s now this rethinking in organizations about what type of tools we are adopting. A typical thousand-person organization might have 100 to 200 SaaS products. We are seeing a shift in how many products you’re adopting, and there’s also an accelerated pace of launching new features. Do you see this happening in your perception of how sales are going for your product or other products when you’re talking to other founders or customers? Is this a change in narrative, or is it more of a narrative than it’s real?

Matt Martin: It’s interesting that you bring up Zoom in the context of AI tooling and acceleration of feature adoption because I think there’s a more significant undercurrent that’s not related to AI, which is the correction a couple of years ago from a zero-interest-rate environment to an environment where money isn’t free. That had a significant impact on SaaS buying, renewal, and adoption cycles, especially among more mature organizations. We saw a huge wave of consolidation, removal, and re-evaluation of tools that we hadn’t seen in the lifecycle of our business before. I think Zoom’s proliferation of product development is downstream of that consolidation effort, not AI. They saw that if you’re just video conferencing software, it’s easier to rip you out. Everybody pays for Microsoft 365 or Google for basic email and calendar, and both come with video conferencing. So Zoom is trying to replace that office suite. It remains to be seen if they can be successful, but I think that’s the more significant trend.

When it comes to AI tools and adoption, that has been a bit of a resurgence and a correction in that downturn in buying. There’s definitely been top-down appetite to find ways to add to the productivity and capacity of the organization with those tools. I will say, however, the trying and retention is way different. I’m quite proud of Clockwise’s retention; people use it and they like it. But as I’ve talked to IT leaders and CISOs, there’s a lot of experimentation, but there’s a lot of churn. A lot of these AI tools look interesting at the outset, but it’s hard to measure what they’re contributing to the bottom line. It’s an interesting mindset where you have this massive constriction in what people are willing to spend for software, but then a real increase in experimentation. Some of that conservatism in terms of what they’re actually buying is still there.

Nataraj: I ask because there’s also this hype around what an AI agent can do. Every new AI agent platform offers things like optimizing your calendar or increasing productivity. The problem I see is the form factor isn’t fitting the promise. When you get into things like revenue management, where a CIO wants to see the number, it’s not yet easily correlated, especially in these agentic, chat-based form factors. Could you talk a little bit about that disconnect between what AI agents are promising and why that disconnect is there?

Matt Martin: A lot of this is the basics of software selling that have been around for a while. Ultimately, the buyer needs to see the case for the return on investment. The reason there’s so much hype around AI is that people have seen the impact it can have in various facets of their job, so they’re clamoring to find other areas for that application. But to your point, if there’s revenue acceleration that the CRO isn’t actually seeing, they’re not going to buy the software, whether it’s an agent or a piece of SaaS. In many of these areas, the efficiency gains are notoriously difficult to measure. Clockwise undeniably helps people be more productive, but our ROI measurement problem has always been there. We’re productivity software. We can tell you about all the dedicated time we put back in schedules, which to some extent is a measurable hard ROI, but some buyers look at that and ask, “Okay, you made their schedule more flexible, but did they actually get more done?”

There are interesting new pricing models being experimented with. You see places like Sierra doing outcome-based pricing; for each ticket they take off a customer service person’s desk, that’s what you’re paying for. That’s much closer to hard ROI because you’re offsetting real employee time and salary in a concrete way. I think it’s difficult to find those measurables often, though. It’s difficult to find that hard translation of outcome and to have accountability all the way back. People are experimenting, and it’ll be interesting to see where it lands, but a lot of these problems echo through software sales since the 70s.

Nataraj: How are you leveraging AI in terms of creating new features and products? Can you give examples of how you’re using AI within Clockwise as a product?

Matt Martin: I’ll answer in two ways. First is operationally, how we are developing product. The second is how Clockwise as technology actually uses AI. On the first point, we have a truly AI-native product development cycle where people are utilizing tools at every stage to accelerate results. One of the clearest points of leverage for me is the collapsing of product research and prototyping. I have designers who are literally spinning up their own interactive prototypes, whether in Figma, v0, or Lovable, and putting them in front of somebody. Previously, that was quite costly. Now you can do it quickly without worrying about bugs. That accelerates development cycles. With all the tooling we have, you can spin up a lot of paths and experiment with the best one because you can get there faster. It still requires a lot of human review, or you’ll create a really hairy code base, but you can really accelerate your experimentation cycles.

On the Clockwise front, what are we actually doing with AI? There are multiple levels. One is we have a product in the field right now that allows AI-based scheduling. You can chat with Clockwise and say, “Hey, I want to schedule a time with Nikita, Aaron, and Joe next week.” We have our own fine-tuned model that we fine-tune to pay attention to time and time-based requests. It can parse the user’s intent and give it to our back-end systems to conduct the scheduling. We’re also about to launch our own MCP server that connects up our scheduling engine to frontier models or whatever MCP client you might be using. It’s been fascinating to see, especially with MCP, the combinatorial power of having different tools that can be called into from a pretty intelligent base model.

Nataraj: You mentioned becoming babysitters for half-baked tools in a LinkedIn post. What trend are you seeing? Why are a lot of these tools looking half-baked?

Matt Martin: I’m so energized by what’s happening in the industry right now because I love experimentation. When you have an explosion of new technology, it’s exciting. But with that explosion comes chaos. People are trying out new things and trying to connect them. When you look at LLMs, the ability to call into other tools is an obvious need. Anthropic developed MCP, and it’s an interesting and elegant first attempt, but it’s cumbersome. It is not for my mom. The more tools you add, the slower the LLM gets, the more complicated it gets. It’s clearly not a pattern that will extend into infinity, but it is a jumpstart on experimentation. So I think we’re in this early phase where getting a workflow completed in an AI-based way is often more cumbersome than just using a pre-existing piece of software. Some skeptics look at that and say this is all BS, just a more complicated way to do things we already know how to do. But the ability of the base model to intelligently reason and navigate these workflows is transformational. We just haven’t gotten there with the interface, with how we put those workflows together, or with the accessibility and usability for the average user.

Nataraj: My take has always been that it’s an evolution. First, we saw the base models, and then a bunch of engineers built V0 versions of everything. Now you really need product thinkers who understand the market and use cases to build the next generation of products. We are still early in terms of the apps leveraging AI. There’s an opportunity to rethink fundamental apps. Can you rewrite Outlook with AI being first? Notion rethought how a note-taking tool should be for the internet. Can you rethink even Notion with AI in place instead of added on top? There’s a lot more experimentation to come.

Matt Martin: I agree with that. We’re definitely in the phase where there’s a lot of bolt-on. There’s a lot of looking at current products and asking, now that I have this additional technology, what can I do on top of this product to augment it? The note-taking example is interesting. Notion has added on its own AI product. It’s one of the more interesting ones I’ve seen, but the frequency with which I use Notion’s AI features versus Notion as just a note-taking tool is maybe 100 to 1. In the future, note-taking probably looks more like something that is an omniscient collection of information that you can query and talk to about surfacing the right information at the right time. Most technology is additive. When we got smartphones, we didn’t get rid of laptops. There’s going to be an evolution where a completely new category and feel of software emerges from AI. Right now, outside of the frontier models like ChatGPT and Claude, I haven’t seen that many things that genuinely feel very new instead of augmentative.

Nataraj: I think we’re almost at the end of our time. What are the best ways for our audience to discover you and the work you are doing?

Matt Martin: The first place to go is clockwise.ai or getclockwise.com. You can start with Clockwise today; it takes about 30 seconds to get up and running. It’s amazing. You’ll get time back in your day, and it’s free to start. If you want to get into contact with me personally, I’m always happy to connect. LinkedIn is actually where I post the most. You can find me, Matt Martin, at Clockwise. On basically any social media, I’m /voxmatt, V-O-X-M-A-T-T. You can find me on Mastodon, which I tend to post to a little bit. I’m a little bit on Threads, a little bit on Blue Sky, a little bit on X. The fracturing social ecosystem has not done well for me in terms of one channel, but LinkedIn’s probably the most consistent.

Nataraj: This was a very fun conversation. Excited to see what Clockwise does next. Thanks for coming on the show.

Matt Martin: Thank you very much. This was a lot of fun.

Matt Martin’s insights reveal a clear vision for a future where AI doesn’t just assist but actively manages our schedules to enhance productivity and well-being. This conversation is a must-listen for anyone looking to reclaim their time and understand the practical applications of AI in the modern workplace.

→ If you enjoyed this conversation with Matt Martin, listen to the full episode here on Spotify or Apple.

→ Subscribe to our Newsletter and never miss an update.

September 29, 2025
Apollo GraphQL CEO on APIs as Graphs, Not Endpoints | Matt DeBergalis

Introduction

In the world of modern software development, managing the flow of data between services and applications is one of the biggest challenges. Matt DeBergalis, co-founder and CEO of Apollo GraphQL, has been at the forefront of solving this problem. His journey began with the Meteor framework, which revealed a critical need for a more principled way to handle data fetching. This led to the adoption of GraphQL, a query language that treats APIs not as a collection of disparate endpoints, but as a unified, connected graph.

In this discussion, Matt joins Nataraj to explore the evolution of Apollo GraphQL from an open-source project into an enterprise-grade platform. He breaks down the unique value of GraphQL for developers, the strategic decisions behind building a commercial product around it, and the complex trade-offs in today’s full-stack architecture. He also offers a compelling look at how AI is amplifying the need for robust API strategies, making technologies like GraphQL more relevant than ever.

→ Enjoy this conversation with Matt DeBergalis, on Spotify, Apple, or YouTube.
→ Subscribe to our newsletter and never miss an update.

Conversation Transcript

Nataraj: You’re now CEO of Apollo GraphQL. Can you give us a two-minute history of your journey until now?

Matt DeBergalis: We started the company with Meteor, which was a JavaScript development framework from the era of the first true browser-based apps. When you build software that way, you need a principled story for how you move data from the cloud into the application.

Matt DeBergalis: GraphQL and Apollo are at the heart of that story. While building Meteor, we found that the piece of the stack that brokers the flow of data from underlying databases, APIs, and all the systems that feed your software up to the app is where there’s a ton of complexity. It also accounts for a huge fraction of the handwritten code that makes building good software take so long. GraphQL is a wonderful, declarative language, so you can build infrastructure around it. We see this happening all over the stack—Kubernetes and React are examples. Apollo is that for your APIs. It’s about replacing all the single-purpose, bespoke code you might write with a piece of infrastructure and a principled programming model.

Matt DeBergalis: The name GraphQL hints at what makes it wonderful: we treat your systems, data, and services not as individual endpoints you call imperatively, but as a connected graph of objects. That completely changes the development experience. It makes it possible to express complex combinations of data in a simple, deterministic way. There’s a query planner in there, so you can do all kinds of transformations and other things necessary to build software in a repeatable, understandable way. We’ve found that this dramatically helps companies, especially larger ones with lots of APIs, accelerate how fast they can build good software.

Nataraj: GraphQL was initially open source while you were working on Meteor. At what point did you realize there was enough opportunity to make this a new company?

Matt DeBergalis: A couple of things were happening. First, the original version of Meteor was based on MongoDB, so we received many requests to support other databases, data sources, and REST APIs. The Meteor development experience was almost like black magic; you would write a MongoDB query directly in your client code. Meteor would run that same query on the Mongo server in the cloud and synchronize the results across the wire. This infrastructure efficiently kept them in sync in real time. The consequence was that Meteor apps were real-time. You’d write a query, connect it to a component on your screen, and as the database changed, the screen would automatically update. That was amazing, especially in those days, but it had to be in Mongo. So we needed a more general query language.

Matt DeBergalis: Just as we needed that, Facebook announced GraphQL. This brings me to the other part of the answer, and why I think GraphQL has flourished where similar ideas haven’t. In my view, GraphQL is the first API technology that really asks about the needs of the clients—the consumers of data—instead of the providers. REST APIs, or older technologies like ONC RPC or SOAP, are all defined by the server. The consumer gets what they get—the payload, the format, the protocol. This puts a huge burden on the application developer because it’s rare that what comes back from the API is exactly what you need. You might need to filter it, transform it, or join it with another API.

Matt DeBergalis: GraphQL has an incredibly good developer experience for the consumer. You write a strongly-typed query, which means great tooling support in your editor. Now there’s great tooling support in agentic editors because it’s semantic and self-documenting. A lot of what makes a technology win is the feeling a developer gets when they try it—how easy it is to use and how quickly you can get to something good. GraphQL had that delightful characteristic. It came from the same team at Facebook that created React, so it had a lot of energy as web development moved into the modern era. Those two things made it an easy choice for us.

Nataraj: It’s interesting that so many developer technologies came out of Meta that they didn’t monetize. Unlike Amazon or Microsoft, they seem to define themselves strictly as a social media company. Why hasn’t Meta become a fourth cloud provider? They have the technology, developer experience, and money to do it.

Matt DeBergalis: Here’s one quick point on that. A lot of the energy around React, for example, really came out of recruiting. Many companies do this. Open source was a tool for driving an engineering brand, especially in an era when it became very difficult to hire. There was a war for application development talent. So, one reason to open source something like React, even without a direct business case—it’s hard to monetize React, Vercel is probably the best example and it’s a tenuous connection—is that if it helps you recruit, you can justify a lot.

Nataraj: That’s a very important point. It likely explains Microsoft’s strategy shift to becoming one of the biggest open-source contributors. So, we have these open-source products that find developer love, and then a company forms around them, like Databricks with Apache Spark. What was the journey like for GraphQL, taking a great open-source product and turning it into a business with a product worth paying for?

Matt DeBergalis: One surprising thing is that open source on its own often doesn’t get a ton of adoption, especially when it’s designed for a larger company’s needs. Take Databricks and Spark. One way to look at it is that they built the company because people weren’t adopting Spark. Why not? Because it was hard. It’s a complicated piece of machinery. The company that needs that problem solved needs more than just Spark. The best vehicle for solving those kinds of problems is a business because it allows you to create a whole product, a solution. The enterprise sales process is really about helping the customer navigate the decision. The monetary cost is one thing, but the much bigger cost is the complex, multi-stakeholder architectural decision.

Matt DeBergalis: With GraphQL, we asked a simple question: How do you get something like this adopted? I can give you a compelling technical reason why having a graph and writing a query is better than writing a bunch of procedural code for every new application experience. But in practice, how does that get adopted? If you pull that thread, interesting things emerge. Who owns APIs in an enterprise? Who makes architectural decisions? How do you balance the executive who owns the roadmap with engineering needs? The VP of engineering job is maybe the hardest job today. You’re under enormous pressure to ship quickly. If you ship slower than your competitor, it could be the end of your company’s viability.

Matt DeBergalis: At the same time, you can’t mortgage the future. If you race to ship a product but create a big security vulnerability, you’ll get fired. If you build a product and then discover Amazon shipped Alexa and you need a voice app, or OpenAI shipped GPT and you need an agent, you’re in trouble if you’ve painted yourself into an architectural corner. You’re caught between a rock and a hard place. The consequence is you’re going to want help—more than just a raw piece of technology. You’ll want a plan, end-to-end integration, and all the ‘ilities’: observability, security, auditability. That, for infrastructure at least, is the heart of how you go from an exciting open-source project to something that makes business sense and can be adopted at scale.

Nataraj: When a business adopts an open-source technology, they’re not just adopting a product; they’re adopting a certain level of risk. That’s why you have legal agreements about things like data privacy issues.

Matt DeBergalis: That’s right. And the biggest risk by far is picking the wrong technology. Think about the cost of getting that wrong. If you adopt a database and five years later it has no users, the open-source project is on life support, and there’s no vendor to help you, you’re in real trouble. You’re facing a migration.

Nataraj: The problem with Meta open-sourcing Llama is that if I’m building on top of a model, I need someone to host it and guarantee 99.99% reliability. There are all these dynamics between open and closed source.

Matt DeBergalis: You see this across the stack. Maybe 10 or 20 years ago, a developer would start by asking what’s open versus closed to avoid vendor lock-in, especially after experiences with companies like Oracle. Now, it’s a little different. There’s so much to buy across the stack that you don’t have time for a deep analysis of everything. The biggest risk is getting it wrong. With AI moving so fast, you see a preference for what’s prevalent. You’re probably going to be in good shape if you go with the market leader. That means you can hire people who know the technology, and there’s a good chance it will mature quickly enough to meet your future needs. It’s a virtuous cycle. That’s the pitch I make for GraphQL: you have an API strategy to decide on, and you should start from the premise that picking the one developers like, with a vibrant user community, is a safe choice.

Nataraj: I have a view on the evolution of full-stack development and wanted your thoughts. Are we making it more or less complicated? I feel like we’re making things more complicated to stand up a scalable web application.

Matt DeBergalis: I grew up writing software on a Commodore 64. It’s definitely gotten more complicated, and it’s all across the stack. Microservices make sense for scaling engineering efforts, but they drive the need for things to manage those processes, like Kubernetes. You need a way to orchestrate API calls across them, which is our GraphQL story. Does Apollo add complexity? There’s an argument that it does. On the other hand, each of these layers, when done right, adds value and lets you go faster. A good architecture should have the property where the more you build, the more valuable the whole thing becomes. Some technologies feel like the opposite; you keep putting energy in, but it slows you down.

Nataraj: It feels like there’s a lot of distance before you see that geometric growth. The early investment is high, and you’re always thinking two or three years down the line. You can’t start something fast because you’re planning for the future, like choosing the right database.

Matt DeBergalis: It’s gnarly because many technology decisions boil down to: do I want a quick result today, creating debt for tomorrow, or do I want to set myself up for a bright future? It’s a hard call. Our job is to try to square that circle. Kubernetes was really complicated for a long time, but it has gotten easier to use. Now, we see all the benefits. That’s been a big priority for us at Apollo. The knock on GraphQL is there’s a lot of upfront setup work to build a schema—the catalog of all your APIs. Once you’ve done that, it’s wonderful, but getting there is a pain. Much of our roadmap over the last year has been about making it really easy and fast to get to that point of value. In 2025, most people will choose to solve today’s problem and worry about tomorrow later.

Nataraj: For a small company with just one or two APIs, at what point does adopting GraphQL make sense? What type of customers do you have today?

Matt DeBergalis: It makes sense from the client’s point of view. When you’re using GraphQL and React, you just write a query in your component, and that’s it. From the API side, with one or two REST APIs, it’s not a big deal. You can easily change them. But for a company with 10,000 REST APIs, it becomes very difficult to change them because you have no way of knowing what fields are being used. The thing that’s really interesting now is agents. Everybody wants to have some kind of agentic experience on top of their APIs, and GraphQL is a fantastic fit. GraphQL is an orchestration language. It’s about transforming, changing protocols, filtering. We’re excited about agents because they put those needs front and center.

Nataraj: Can you talk about the size and scale of Apollo today as a business?

Matt DeBergalis: GraphQL is used in about half of the enterprise world. We see this because we provide most of the standard open-source libraries. Commercially, we’ve focused on larger companies because graphs have a network effect; they are most valuable when they’re large. For example, we have a lot of retailers. Think about an online store. On a product page, you want to show the customer, ‘Arrives on Friday.’ To do that, you need to make a ton of API calls under the hood—inventory APIs, shipping partner APIs, loyalty APIs. It’s the kind of thing that seems trivial from a user experience perspective but explains why it often takes months to ship. Larger, established companies in retail or media, like The New York Times, find GraphQL incredibly valuable. Also, companies that have grown through M&A and have heterogeneous systems need to bring products together into a single user experience. That’s where GraphQL is historically adopted most. But again, agents are changing that, creating excitement around GraphQL at companies of all sizes.

Nataraj: How do you acquire customers and market to new developers to ensure you remain the go-to product?

Matt DeBergalis: For most open-source companies, including us, open source is an important part of the funnel. Development teams make the technical decisions. It starts with a developer who reaches for something they’ve heard of that makes sense and that they can try quickly. Open source is a great vehicle for that. It starts with a React developer reaching for Apollo Client or an MCP developer wanting to define an agent declaratively. You can build from there, but that open-source entry point and the content around it are by far the most important things.

Nataraj: You recently transitioned from CTO to CEO. How has your focus changed?

Matt DeBergalis: It changes a lot and changes nothing, depending on how you look at it. I’ve always felt our customers are where everything comes from. The first version of Apollo Client was really bad, but it had one killer feature: we took pull requests. People would adopt it, and that turned into wonderful partnerships, like with The New York Times. It was a formative technical partnership that became a customer partnership. For me, the heart of it has always been partnership. I don’t see that differently from the CEO seat. Startups that do well all say users and customers come first. I always spent a lot of time with our sales team and on marketing. That hasn’t changed. And I still do product demos every Friday. I don’t ever want to lose touch with that. If we don’t have a great product that developers love, the rest doesn’t matter.

Nataraj: What does the sales motion for Apollo look like? Who are you typically talking to?

Matt DeBergalis: It varies. You can look at Apollo from two different perspectives. One is the team trying to ship a piece of software. For them, Apollo helps them ship faster and with less risk. That’s who you want to sell to because that’s where the value is. They own roadmaps. The other lens is seeing Apollo as a platform. You have platform engineering teams that own developer productivity or operational excellence outcomes. Ideally, we initially bring Apollo to a team with a specific use case. If you’re selling to someone for whom the value isn’t immediate, it’s much harder. You start there, but you think about eventual expansion. When you think about expansion, a technology like this will naturally find a place in a platform organization’s portfolio, so you want to meet those people early on.

Nataraj: Let’s talk about AI. What are your thoughts on the current hype cycle, and what is the opportunity for Apollo in AI?

Matt DeBergalis: I’ve never seen anything like it. It’s so disruptive. The immediate thing for us is that AI will drive a lot more API consumption. Every company is scrambling to build some kind of agentic user experience. The line between an agent and an app will get blurry. For example, a bank has one mobile app for a wide range of customers, from retirees to college students. With agents, maybe that one app can serve both by learning what I’m into and adapting the user interface. That’s a really different kind of app, and every one of those efforts will drive a whole bunch of net new API calls. That’s good for Apollo and GraphQL.

Matt DeBergalis: Also, the nature of those API calls is different. You can’t trust a non-deterministic model, so the API layer needs more access control and policy enforcement. GraphQL is a nice fit here because it’s semantic. Token management becomes really important. The best way to keep a model on track is to not feed it tokens it doesn’t need, which also saves money. That sounds like GraphQL—only the fields I want. So, we’re seeing a lot of demand for GraphQL because of agents. The way we build software is also changing. Every part of our company is changing. If we’re going to be an essential part of the AI-first stack, we have to be an AI-first company.

Nataraj: Has AI changed the business metrics for your company? Has it helped save costs or optimized capital expenses?

Matt DeBergalis: AI definitely accelerates some things. Sometimes you use that to get more done, and sometimes to do it more cost-effectively. We’ve seen both. It’s also not magic; we’re still figuring out what it can and can’t do. Personally, AI helps me do a lot of things faster and better. I use it a lot for research, and I feel more informed than I was a year ago. I don’t have a research team at my disposal. I don’t think of it as changing how we hire as much as having more relevant information at my fingertips, which helps us make better, faster decisions.

Nataraj: It almost feels like AI is overestimated in the short term and underestimated in the long term.

Matt DeBergalis: So much is like that. I think that has to be true.

Conclusion

Matt DeBergalis provides a masterclass in identifying a core developer need and building a powerful platform to solve it. His insights into GraphQL’s client-first approach and its growing importance in an AI-driven world offer a clear vision for the future of API architecture.

→ If you enjoyed this conversation with Matt DeBergalis, listen to the full episode here on Spotify, Apple, or YouTube.
→ Subscribe to our Newsletter and never miss an update.

August 25, 2025
Box CTO on Enterprise AI: Unstructured Data & AI-First Strategy

How are large enterprises navigating the seismic shift to artificial intelligence? For many, the journey begins with managing the 90% of their data that is unstructured—documents, images, videos, and contracts. In this conversation, Nataraj sits down with Ben Kus, Chief Technology Officer at Box, to explore the real-world challenges and opportunities of becoming an AI-first company. Ben shares critical insights from Box’s own transformation, detailing how they leverage generative AI to unlock value from an exabyte of customer data. They discuss the evolution from specialized machine learning models to powerful general-purpose AI, the practicalities of managing AI costs, and the essential steps to ensure data security and customer trust. This discussion moves beyond the hype to provide a clear-eyed view of enterprise AI adoption, from initial use cases like RAG and data extraction to the future of complex, agentic systems that can perform deep research and automate sophisticated workflows.

→ Enjoy this conversation with Ben Kus, on Spotify, Apple, or YouTube.

→ Subscribe to ournewsletter and never miss an update.

Nataraj: I was really excited because I work in unstructured data as well and I realize how important it is. But let’s set a little bit of context for the audience. In the storage industry, it’s a common phrase to use unstructured data. But it would be good to set the context of what is unstructured data and why Box is in the center of all things AI.

Ben Kus: It’s interesting. Oftentimes, if you say the word data to anyone, especially computer scientists or people who have come from programming backgrounds, you naturally think of structured data. We want to become more data-oriented; we need to use data. And it’s partially because there’s been a massive data revolution over the last 10 or 20 years. It used to be that my data was in a MySQL database somewhere. Then it became more tools available where you would use terms like data lake and data warehouse, more advanced analytics tools. You see companies like Databricks and Snowflake that become these very powerful platforms of structured data. That’s just naturally what you think of.

Now, the world of unstructured data, which I would define as data that’s not in a database and doesn’t have a schema to it—things like emails, messages, and webpages. In our world at Box, it’s the world of what we call content or files, the stuff that goes into documents, PowerPoints, markdown files, videos, or images. All of this is unstructured data. Interestingly, almost every company you talk to, in a business-to-business, enterprise-oriented thought process, 90% or more of their data is actually unstructured. At Box, we have 120,000 enterprise customers, we have over an exabyte of data, and this is what we’ve always lived by. You need to collaborate on it, you need to sync it to get it to different places.

But then generative AI comes around, and generative AI is born on unstructured data. So it naturally, immediately, every company I’ve ever talked to, if you ask why they’re interested in generative AI, one of their top three things they’ll say will be, ‘Well, I’ve got all this internal stuff in my company that is unstructured data, and I don’t think I’m taking advantage of it enough.’ It takes a million different forms, and it’s partly why it’s been hard to really automate or make specialized applications to deal with these types of data. But there’s this huge untapped potential in unstructured data. So for Box, with all of these new models coming out from all these great providers, it’s a gift to companies and to people who think the way that we do, which is how can you get more out of your unstructured data? Now AI can basically understand unstructured data. For the first time, you have this automated ability to have computers be able to understand, watch, read, and look at these things and then be able to not only generate new content for you but also to understand and help you with the content that you already have, which in many companies is massive—petabytes, hundreds of billions of these pieces of content that in some cases are the most critical stuff they have.

Nataraj: Unstructured data includes Box, Amazon S3 files, Azure has Blob, and any given enterprise has multiple places where they’re storing data. In terms of your strategy for building products, how much are you thinking about extending the Box ecosystem into all these surface areas versus building tools or products within the ecosystem? Talk a little bit about your strategic approach.

Ben Kus: If you go back to the analogy of where people store their structured data, it’s in many places for many different reasons. Similarly, there’s the very generically large term of unstructured data; you would store it and use it in many different ways. But for Box, one of our things we’re typically known for is to make it very easy to use, extend, secure, and be compliant for all of your data. For that, we typically would need to manage it. We have a million ways to sync data between repositories. We recently announced a big partnership with Snowflake where the structured data, the metadata about a file in Box, automatically syncs into Snowflake tables. That kind of thing is definitely part of what we think about.

But in general for Box, it’s key that we offer so much AI, in many cases for free on top of the data you have, even though it’s quite expensive, because we want people to bring their data and get all the benefits of security, collaboration, and AI. But we don’t believe we’re going to be the only people in this AI and agentic ecosystem, which is why we partner with basically everyone. We believe there will be these major enterprise platforms that every company will be looking at. Our job is to give the best option for them for unstructured data and then integrate with everybody else so that you can have our AI agents working with other companies in addition to custom AI agents that you build yourself. Because we’re unstructured data and a lot of people need to use it, we integrate with other platforms, non-AI in addition to AI integrations that let other companies call into our AI capabilities to ask about data, do deep research, do data extraction, and so on.

Nataraj: Was there a moment within the company where you guys realized that this is a big shift? Box has been around for almost 20 years, starting in 2005. Was there an internal moment where you said, ‘Okay, this is really big for us?’

Ben Kus: Sure. If you look back five or six years for the term ML and unstructured data, you’ll find we had a lot of big announcements around how Box uses ML to structure your data. So taking unstructured data and structuring it is a big thing we’ve done for many years. We’ve always been trying to be on the bleeding edge of what’s available. But there was this challenge. Imagine a company with forms people are filling out, or documents, contracts, leases, research proposals, images—anything a company does day-to-day. If you were to have AI or ML help you, it would be training a model. You’d get a data science team together or buy a company. We would see that getting an ML model to handle contracts and structure them was too complicated. You’d need a model not just for contracts or leases, but for commercial leases in the UK in the last three years. You’d have a model for that, and it didn’t really work that well. You’d have to train and customize it a lot.

That was the nature of how it used to be. When Generative AI came out, we were watching the early days of GPT-2 style models, and it was okay. But somewhere around the time ChatGPT came out, with GPT-3.5 style models, you suddenly saw this amazing moment where a general-purpose model could actually start to outperform the specialized models. It could do things you never even would have bothered to try, like, ‘What is the risk assessment of this contract?’ or ‘Can you describe whether you think this image is production-ready for a catalog?’ You couldn’t even imagine the feature set you would give a traditional ML model. But Generative AI could kind of do it. As it got better, GPT-4 was this big, ‘Oh wow,’ moment where some of the challenges of the older models were being fixed. GPT-3.5 was the moment where we said, ‘Let’s just go back and retrofit everything about Box to be able to apply AI models on top of it,’ so you could do things like chatting through documents and extracting data. It was amazing how fast you could get things working and get them working better than you ever had before, even after spending a ton of engineering resources on trying to get something working. An hour and a half of using one of the new models actually gave you better performance. That was a big aha moment. And then of course you realize you’ve got 90% of the problem, and the last 10% is going to take all your time going forward. But since then, all of our efforts have been around preparing Box to be an AI-first platform. We often talk internally, ‘What if we were building Box tomorrow?’ It clearly would be an AI-first experience. So why don’t we do that? That’s just part of our mentality.

Nataraj: What are some of the earliest use cases that you launched at Box, and how has the enterprise customer adoption been? In enterprises, we often see the cycle of adoption is a little bit slower.

Ben Kus: Some of the first features we launched were around the idea that if you’re looking at a document, you need to have an AI next to you to help you chat with it. I’ve got a long document, a long contract, this proposal—help me understand it. It’s almost like an advanced find. That was a simple feature, but it was this new paradigm. And then we added the concept of RAG, not just for a single document but across documents. You can implement chunking, vector databases, and the ability to find the answer to your question, not just a document like in a search. I’ve got 100,000 documents here in my portal of product documentation. As a salesperson, I need to find the answer to this question. I ask it, and the AI will ‘read’ through all of it using RAG and provide the answer.

For enterprises, they were scared, and some of them still are, about AI because it’s so different. Data security is critical. No matter the benefit of AI, if you’re going to leak data, no one’s going to use it. In many cases, for bigger organizations, the first AI they’re actually using on their production data is Box, partially because it’s very hard for them to trust AI companies. You need to trust the model, the person calling the model, and the person who has your data. Since Box is that whole stack for them, they were able to say, ‘I trust that your AI principles and approach will be secure.’ Then they’re able to start with some of the simple capabilities. One of the more exciting ones is data extraction, where you have contracts, project proposals, press releases. There’s an implicit structure to them. You want to see the fields, like who signed it, what time, what are the clauses. Then you can search and filter on that data. Enterprises look at that and say these are very practical benefits. They get through their AI governance committees, security screenings, and ensure nobody trains on their data. That’s the scariest thing to them. We have to go in and meet with the teams, explain every step, show them the architecture diagrams, and the audit reviews so they know their data is safe. That’s typically their number one concern.

Nataraj: I want to talk a little bit about the cost of leveraging AI. It has dramatically gone down. Are you seeing improvement in your margins by creating AI products? How is it directly impacting your profitability?

Ben Kus: This is a particularly hard problem. We’re a public company. We publish our gross margin, our expenses. It’s not practical for us to do something that would double our expenses. Nobody has $100 million laying around to apply to whatever cool ideas. At the same time, it’s very clear that if you’re too worried or stingy about your AI bills, you will lose to somebody who is just trying harder. There’s been a really nice byproduct of all the innovation in chips, models, and efficiency—they’re much cheaper than they used to be. Sam Altman said a few years ago that models would get dramatically cheaper, but you’re also going to find you’ll use them more and more, which will slightly offset that. That’s exactly what we found. We are doing way more tokens than we did previously, by orders of magnitude. However, we’re now utilizing the cheaper models, and they’re just offsetting.

When you get to agentic capabilities, like deep research on your data, that’s way different than RAG. RAG might use 20,000 tokens. But for deep research, you might go through many documents, 10,000 tokens at a time, maybe 50,000, 100,000, and then reprocess that. You might spend hundreds of thousands of tokens or more. That’s a massive exponential growth in your AI spend. But you get a great result. Deep research on your own data is revolutionary. The way we approach it is to give AI for free as much as possible because that’s what an AI-first platform would do. Sometimes, for very high-scale use on our platform, you can pay. But whenever possible, we’re going to eat the costs ourselves and handle that risk because that’s what you want out of your best products. Nobody wants to sit there and worry when they’re clicking on things that it’s going to cost them. So we try to protect ourselves with some resource-based pricing but also just say AI is part of the product. That’s our philosophy.

Nataraj: What do you think about pricing based on usage versus pricing based on outcomes? I’m assuming you’re following the regular per-seat, per-subscription model.

Ben Kus: Yep. We’ve been through every single possible flavor of this. I hope business schools are doing case studies on how everybody had to rethink technology pricing. At the end of the day, pricing a product isn’t just about the supply side cost; it’s about what people are willing to pay and how they’re willing to pay for it. When we originally launched our AI, we had seen some people who launched AI were charging too much and people weren’t ready for that. Then there was this massive trend of $20 a month for enterprise-style tools, and the adoption was terrible because nobody quite knew what to do with it. So we decided to offer it as free as part of our product, but we put a limit on it. If you did too much, it would stop.

But then enterprises would actually not turn it on because they were worried they would hit those limits and then everybody would be mad at them. The limit became an adoption barrier. So we got a lot of feedback from our customers and turned that off. There was no limit. Now, there’s the idea of abuse we could address. You can’t just buy a seat to Box and use the API to power another system. But for normal usage, we handle that risk. It’s incredibly expensive if you look at public cloud rates for transferring and storing data. We’re used to infrastructure expenses. So we decided we’re going to eat the cost of it as a way to deliver better services to our customers. That is our continuous philosophy.

Nataraj: Storage is a horizontal use case, but AI is also being used to build vertical-specific products, like Cursor for developers or Harvey for legal assistants. Have you evaluated creating specific products on top of Box for different verticals?

Ben Kus: This is a very fundamental question for any company: am I going to focus on a specific vertical and a problem, or am I going to focus generically across the board? At Box, one of our product principles is to focus on the horizontal IT use cases. Much of our value proposition is across the whole environment. Everybody in the company wants the security features, the compliance features, the sharing features. This is why we talk about it as content or files—everybody needs files. Some companies specialize and talk about contracts and clauses, or digital assets and marketing materials. This is a big question for any startup: go deep or go broad. If you go deep, you can make more targeted products. But your total market size is diminished. For us at Box, no one industry makes up more than 10% of our overall business. We have a giant market, but the more you specialize, the more you’re probably not going to solve a problem for somebody else.

The interesting part about AI is that it pulls you in two different directions at once. Some people will start to use AI to very specifically solve problems, like in life sciences or financial services. But at the same time, in some cases, a generic AI can actually solve what a historical specialized company used to do. In which case, people might go back to a generic solution so they don’t have a million point solutions. You always have to analyze how deep to go in an industry versus how much you can provide horizontally. AI reshuffles it.

Nataraj: You guys are one of the first companies to adopt being an AI-first company. What does that mean and how does it change how you operate?

Ben Kus: When we use the word ‘AI-first,’ we think about building a feature knowing the full abilities of AI. Search is an interesting example. The historic way you would build search is completely different from how you would build it in a world with an AI or agentic experience. Not just from a technology perspective with better vector embeddings, but also from the technique. People act differently when they go to a search box than when they are talking to an agent. Many people use ChatGPT or Gemini for internet searches, and what you type into Google versus your chat experience is different.

That’s an interesting moment for Box. If you think AI-first, you don’t just put an AI thing inside a search box. You rethink the search experience from the beginning. We announced our agentic search, or deep search, where you ask an AI, and it will not just go through a complicated search system, but it will look at the results and figure out whether those results match what you’re looking for. It goes well beyond RAG and into using intelligent agents to loop and figure out if they have the best answer or if they need to try again. Thinking that way, not just ‘I have a model, I want to use it,’ but ‘What can AI do for you?’, especially if you think agentically, becomes a different product process, a different engineering process, a different strategy process. You start to invest heavily in your AI platform layers and common AI interactions in your products, like an agentic experience or AX. If you’re going to be an AI-first company, you need to examine the fact that maybe AI will change the way you’ve done something traditional.

Nataraj: We went through RAGs, we went through copilots, and now we are seeing agents. How are you thinking about agents within Box? What is your definition of an agent?

Ben Kus: My definition of an AI agent, technically speaking—and Harrison from LangChain has a fun definition—is that an agent is something that decides when it’s done. Normally, you run code and it completes. But an AI agent needs to figure out when it’s done. That’s a good technical definition. I have a slightly more detailed engineering answer: an agent has an objective, instructions, an AI model (a brain), and tools it can decide to utilize with context to operate. I’m a fan of agents that can call on other agents, like a multi-agent system.

When I’m thinking about agents, I’m thinking about multiple agents cooperating. To me, the power of agents going forward is this idea that you can think about them as little state diagrams of intelligence that can loop and do more sophisticated things. This is a very different thought process for most engineers. You asked for an example. One is deep research. To do deep research in Box, you have to search, look at the results, get the files, make an outline, create the prose, and then critique it. That’s like 15 steps for these agents. We call that a deep research agent, but it has a multi-agent workflow to process that. I don’t know if you could have done deep research very well previously because there are too many paths to handle. It’s the kind of thing that works really well for an intelligent system like an agent to orchestrate.

Nataraj: Do you see any form factor for agents? In an enterprise product sense, how does that form factor play out?

Ben Kus: There’s the AI models concept, which is more of a developer concept. Then there’s the idea of an AI assistant, where you have something there to help you in context, but it’s typically one-shot. The term ‘agentic experience’ (AX) is very interesting in this form factor discussion. OpenAI, Anthropic, and Gemini do a great job of building valuable capabilities into their agentic experiences. You go to ChatGPT or your favorite tool, ask a question, and it just figures out, ‘I’m going to search the internet, I’m going to do deep research for you.’ This idea that you go in and ask a system to do something, and it can recognize your context, is critical. Context engineering is a critical aspect of agentic stuff going forward. This might be the new form factor.

At Box, when you’re on our main screen, what you want to do is very different than if you have a file open or if you’re looking at all your contracts or invoices. The hard engineering and product problem is to make agents that figure out what you might want at that point. We think about building an agent that handles a certain flow but first figures out what the user wants, and then does a search or queries the system or brings data together. That context engineering is critical. I believe context engineering is one of the more interesting areas developing, and it will be something that everybody will want to hire for soon.

Nataraj: Let’s touch upon productivity. How much productivity improvement are you seeing within your company? And there’s a group of people panicking that AI is going to destroy jobs, starting with developers.

Ben Kus: For productivity for our customers, we see people start to use AI a little bit skittishly, and then they use it more and more over time. Especially in enterprises, adoption starts slow, but then they start to add it in big chunks, and you see an acceleration of usage over time.

Internally, we have seen benefits from using assisted tools for our developers, like GitHub Copilot and Cursor. As the models and integrations have gotten better, they are helping us overall. We don’t think of it as, ‘We can save money and have fewer developers.’ Instead, we’re like, ‘If 25% of our code is written by AI, that’s 25% more we can do to deliver value to customers.’ We’re not constrained by a fixed amount of output we want from our developers; we want more. If tools help people become more productive, that’s wonderful.

Economically speaking, I’m not a believer in the lump of labor fallacy—that there’s only a fixed amount of things people want to do. I think it’s the opposite. If things get better and cheaper, you want more of it. We want more videos, content, marketing, and internal content because new avenues are now possible. Now, there’s an important aspect: if change happens too quickly, it can be very disruptive. I’m very sensitive to the plight of people in the middle of a disruption. But I see this as a tool to help companies do more. You need good people using AI to help them, as opposed to cutting whole areas.

Nataraj: Some CTOs have the opinion that they no longer need a lot of junior developers. I always thought this is actually much better for junior developers because if it was taking them three or four years to become senior, it will now take them one year. What’s your take?

Ben Kus: What you said is true. When you add a junior developer, you often expect a relatively small level of output compared to more senior people. But now, a person who’s really good at using the latest tools is actually quite productive, and that’s a big value. At Box, we have the most developers we’ve ever had, and we’re not only hiring senior people; we’re hiring across the whole spectrum. We just expect people to be able to use tools. Anecdotally, I see that people coming out of school now have always known AI-assisted coding, and they’re good at it compared to somebody who’s been around for a long time and might be resisting it. Also, in areas like context engineering, which is a slightly different form of coding, some of our most successful context engineers are relatively junior in terms of how long they’ve been out of school but really excel at that kind of thing.

Nataraj: An audience member asks: can you share a little bit about document parsing and how you’re extracting from those documents and what models or technologies you’re using behind the scenes?

Ben Kus: In this world of handling unstructured data, there’s a set of things you always need to do. You have all these different file types. The first thing is to get it to a usable format. Markdown is a great format. Sometimes you have scanned documents or different formats. There’s a big conversion as a first step. Many people talk about PDFs because of all the weird things that go into them. A PDF is not a good format for AI to figure out; it needs to be converted. So step one is to convert it to text with some limited style support like markdown. Then you typically go through and chunk it. You want to make a vector out of the most important section of data. You want it to contain a whole thought. You wouldn’t do it per sentence, but if you did it for giant pages, you’d end up with too many confused topics. So you want a vector to indicate what that area is about. Paragraphs work well at a high level, but then you need more advanced chunking strategies. Then you stick that into a vector database or put the text into your traditional search database.

Nataraj: Are you building your own proprietary tools for this, or are you using things like LangChain with Pinecone or other vector DBs?

Ben Kus: My philosophy and the philosophy of Box is that we love all the tools that everybody makes. If people are building the best tool out there—the best vector database, the best document chunker, the best agentic framework—we want to use it. I gave a speech recently at the LangChain conference about the benefits of something like LangGraph. When we started, we had built our own because this stuff wasn’t available at the time. But we are more than happy to go back and retrofit to some of the other systems. I’m very impressed at how good vector databases have gotten in the last few years. Why would we bother to rebuild the things that people are doing such a great job building, especially in the open-source community, or tools that we can buy? We’re big fans; we will replace stuff that we just built because something better is available. With AI, you kind of have to reevaluate every six months.

Nataraj: What about the models you’re using? In an enterprise, you want to adopt the latest and greatest, but you also want to be secure.

Ben Kus: We made a decision a long time ago not to build models, and I’m super happy we did that. Also, we are going to support all of the best models that are trustworthy. For us right now, we support OpenAI-based models, Anthropic’s Claude models, Llama-based models, and Gemini. We consider those to be some of the best models out there. Not only do we support them, but we support them on a trusted environment. This is critical for many enterprises. For example, AWS Bedrock is a very trustworthy environment to run the Claude or Llama models. IBM will support Llama models for you. These are trustworthy names from an enterprise perspective.

We utilize these trusted providers and trusted models, and then we pick which model works best for a given task. Gemini is great for data extraction. GPT-style models are great for chatting. They’re all pretty close these days, the leading models. But we let our customers switch as they want. If somebody says, ‘I really think this data extraction is best for Claude,’ we let them do it. We support all of the models, and one of our goals is to support them as they come out. This is very expensive and painful internally because how you properly prompt and context-engineer for Claude is different from Gemini, which is different from OpenAI models. But for enterprises, they often have preferences, and our job as an open platform is to handle those.

Nataraj: One final question. If you were building something now, are there any ideas that you would go and attack?

Ben Kus: It’s a very good question. There are a lot of startups out there doing really interesting things. One interesting idea is to look at areas where an old-school traditional software approach could be disrupted, but maybe it’s so old that people don’t really think it’s cool or interesting anymore. Finding something that is very valuable but not as in the news might be a good approach. Anything we’re talking about all the time will probably have so much competition that you might be behind.

But I will highlight one thing. If you see something like Cursor—nobody talked about Cursor a couple of years ago. They were up against Microsoft Copilot, one of the biggest companies in the world. An interesting thing is that with Cursor, you start to realize that even though people are using AI to solve a problem, there might be a better way. If you can make a really good product, even despite the VC advice that you’ll never make it in a ‘kill zone,’ you might have a chance. Often, that’s very good advice, but if you really believe you can do it better, it’s a dangerous path, but there are demonstrations of people who built a really good product. I believe those still have a chance in these crazy AI times to become large companies because they just solved the problem really well.

Nataraj: Because Cursor literally cloned VS Code. They thought the UI could be better on just that product and that’s the main differentiation.

Ben Kus: There are a lot of dynamics that go into any existing product. Sometimes a fresh look at it, even a problem that seems solved, can be helpful.

Nataraj: This was a great conversation, Ben. Thanks for coming on the show.

Ben Kus: Excellent. Well, thanks for having me on. It was a fun chat.

This conversation with Ben Kus highlights the practical, strategic thinking required for enterprises to successfully adopt AI. By focusing on security, embracing a multi-model approach, and rethinking core product experiences, companies can unlock the immense potential of their unstructured data.

→ If you enjoyed this conversation with Ben Kus, listen to the full episode here on Spotify, Apple, or YouTube.

→ Subscribe to ourNewsletter and never miss an update.

August 10, 2025