How Adaption Labs Is Rethinking AI With Continual Learning | Sudip Roy, Co-founder & CTO of Adaption Labs

6 Things You'll Learn from This Episode

Why the AI industry is shifting from scaling compute and ever-larger foundation models toward smaller, verticalized models that can deliver comparable results at far lower cost.
What is really driving AI inference costs higher — reasoning tokens, agentic background jobs, and the autoregressive nature of LLMs — even as the cost per token has fallen roughly 300x.
How "gradient-free continual learning" lets AI systems adapt almost instantly from human feedback instead of waiting weeks or months for a retraining run.
Why the most valuable AI innovation is moving out of the model itself and into the full stack — the interfaces, harnesses, and feedback loops built around it.
What Adaption Labs is building with Adaptive Data and AutoScientist to let any enterprise create custom vertical models without an in-house team of ML researchers.
Why the "last 5%" reliability gap is where AI value really lives — and how continuously learning systems can close it over time.

About the Episode

Sudip Roy is the Co-founder and CTO of Adaption Labs, where he is building AI systems that continuously learn and adapt rather than relying on ever-larger foundation models. A former director at Cohere and longtime engineer at Google Brain and DeepMind — where he co-authored TensorFlow Extended (TFX) and helped build the Pathways system behind the Gemini models — Sudip joins Nataraj to unpack why the AI industry is pivoting from raw compute scaling to inference-layer efficiency, what is really driving inference costs, and how gradient-free continual learning could finally close AI's stubborn "last 5%" reliability gap.

Timestamps

0:00 — Introduction: AI's two trends — scaling compute vs. inference efficiency
1:34 — Sudip's background: Google Brain, TFX, and the Pathways system
4:28 — Joining Cohere and seeing the potential of large language models
5:33 — Specialized models vs. foundation models: what changed
10:26 — Why AI inference costs keep rising despite 300x cheaper tokens
14:14 — The autoregressive nature of LLMs and why it breaks traditional systems
18:18 — Founding Adaption Labs: the thesis and gradient-free continual learning
20:39 — Taking a full-stack view of AI
26:51 — Continual learning in practice: the customer support agent example
30:20 — Adaptive Data: solving the enterprise data problem
31:51 — AutoScientist: co-optimizing data and models
37:26 — Adaptive interfaces: task-specific ways to consume AI
45:27 — Closing AI's "last 5%" reliability gap
49:47 — The demand-supply gap and systemic efficiency
52:48 — The shift from training to inference and decentralized compute

Key Insights

Q: Why are AI inference costs still rising if the cost per token has dropped?

On a per-token basis, inference costs have fallen by roughly 300x. But total spending on AI has exploded because the number of tokens consumed per task has skyrocketed: reasoning models "think" through far more tokens, and agentic workloads spawn parallel background jobs that can run for hours. Sudip estimates demand has grown somewhere between a thousandfold and a millionfold, creating a demand-supply gap that makes inference feel expensive even as unit costs fall.
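The arithmetic behind this answer can be sketched in a few lines. This is only a back-of-the-envelope illustration: the 300x unit-cost drop and the thousand-to-a-million demand range come from the episode, while the function name `spend_multiplier` and the assumption that total spend is simply cost per token times token volume are ours.

```python
def spend_multiplier(unit_cost_drop: float, demand_growth: float) -> float:
    """Relative change in total spend, assuming spend = cost per token * tokens used.

    unit_cost_drop: factor by which the cost per token fell (300 in the episode).
    demand_growth: factor by which token consumption grew.
    """
    return demand_growth / unit_cost_drop


UNIT_COST_DROP = 300.0  # figure quoted in the episode

# The two ends of the demand range Sudip mentions.
for growth in (1e3, 1e6):
    multiplier = spend_multiplier(UNIT_COST_DROP, growth)
    print(f"{growth:,.0f}x demand -> {multiplier:,.1f}x total spend")
```

Under this simplified model, even the low end of the demand estimate more than cancels the 300x price drop, which is consistent with the point that bills rise while unit costs fall.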

Q: What is gradient-free continual learning?

It is Adaption Labs' approach to letting AI systems improve continuously without updating the model's weights through gradient descent. Instead of waiting weeks or months for a new training run, the system learns from its interactions with the environment — human feedback, other agents, or sensors — so behavior changes feel almost instantaneous. The goal is intelligence that evolves naturally and gracefully as the world around it changes.
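To make the idea of changing behavior without a gradient update concrete, here is a toy sketch. It is not Adaption Labs' method, which these notes do not describe; it shows one generic way to get the effect, by storing human corrections in a memory and feeding matching ones back to a model whose weights never change. Every name here (`frozen_model`, `FeedbackMemory`, the word-overlap matching rule) is hypothetical.

```python
from dataclasses import dataclass, field


def frozen_model(prompt: str, guidance: list[str]) -> str:
    """Stand-in for a fixed-weight model: output depends only on its inputs."""
    if guidance:
        return guidance[-1]  # follow the most recent relevant correction
    return "I'm not sure."


@dataclass
class FeedbackMemory:
    """Hypothetical store of corrections, keyed by the words of the prompt."""
    notes: list[tuple[set[str], str]] = field(default_factory=list)

    def record(self, prompt: str, correction: str) -> None:
        # Learning step: append to memory. No weights are touched.
        self.notes.append((set(prompt.lower().split()), correction))

    def lookup(self, prompt: str, min_overlap: int = 2) -> list[str]:
        # Return corrections whose prompt shares enough words with this one.
        words = set(prompt.lower().split())
        return [c for keys, c in self.notes if len(keys & words) >= min_overlap]


memory = FeedbackMemory()
question = "how do I reset my password"

before = frozen_model(question, memory.lookup(question))
memory.record(question, "Use the 'Forgot password' link on the sign-in page.")
after = frozen_model(question, memory.lookup(question))

print("before feedback:", before)
print("after feedback: ", after)
```

The answer changes on the very next call after feedback is recorded, with no retraining run in between. A real system would need far better retrieval and conflict handling than word overlap, so treat this purely as an illustration of the concept.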

Q: Why does Adaption Labs take a "full-stack" view of AI instead of focusing only on the model?

Sudip argues that most recent AI innovation has moved out of the model itself and into the systems built around it — the interfaces, harnesses, and feedback loops. The interface is the primary mechanism for collecting feedback, the harness folds that feedback back into the model, and the model executes. By co-optimizing across all three layers, Adaption believes it can uncover solutions that are simply not possible by improving any single layer in isolation.

Q: What is Adaptive Data and who is it for?

Adaptive Data is Adaption Labs' first product. It transforms low-quality or missing data into high-quality supervised fine-tuning (SFT) datasets that enterprises can use to customize models. It is aimed at companies with sensitive private data they cannot use as-is, or that want a custom vertical model but are starting with little or no usable data — for example, teams serving low-resource languages or needing long-context training data.

About Sudip Roy

Sudip Roy is Co-founder and CTO of Adaption Labs, where he is building AI systems that continuously learn and adapt across the full stack rather than depending on ever-larger foundation models. He was previously a director at Cohere, where he led inference and shipped the serving and fine-tuning infrastructure. Before that, he spent years at Google Brain and DeepMind, where he co-authored TensorFlow Extended (TFX) — the platform powering production machine learning across Google — and helped build the Pathways system used to train and serve the Gemini family of models. He holds a PhD in data management systems from Cornell University.

Co-founder & CTO at Adaption Labs

About the Host

Nataraj Sindam is the creator of The Startup Project, a podcast featuring founders, investors, and operators building the future.

#StartupProject #AdaptionLabs #SudipRoy #ContinualLearning #AIInference #FoundationModels #MachineLearning #AIInfrastructure #LLM #Entrepreneurship #Podcast #Tech