
How Adaption Labs Is Rethinking AI With Continual Learning | Sudip Roy, Co-founder & CTO of Adaption Labs
with Sudip Roy — Co-founder & CTO
Sudip Roy, Co-founder and CTO of Adaption Labs, joins Nataraj Sindam to explain why the AI industry is shifting from scaling compute and ever-larger foundation models toward inference-layer efficiency and smaller, verticalized models. Drawing on his time at Google Brain, DeepMind, and Cohere, Sudip breaks down what is really driving AI inference costs, why the most valuable innovation is moving out of the model and into the full stack, and how gradient-free continual learning could finally close AI's stubborn "last 5%" reliability gap.
6 Things You'll Learn from This Episode
- Why the AI industry is shifting from scaling compute and ever-larger foundation models toward smaller, verticalized models that can deliver comparable results at far lower cost.
- What is really driving AI inference costs higher — reasoning tokens, agentic background jobs, and the autoregressive nature of LLMs — even as the cost per token has fallen roughly 300x.
- How "gradient-free continual learning" lets AI systems adapt almost instantly from human feedback instead of waiting weeks or months for a retraining run.
- Why the most valuable AI innovation is moving out of the model itself and into the full stack — the interfaces, harnesses, and feedback loops built around it.
- What Adaption Labs is building with Adaptive Data and AutoScientist to let any enterprise create custom vertical models without an in-house team of ML researchers.
- Why the "last 5%" reliability gap is where AI value really lives — and how continuously learning systems can close it over time.
About the Episode
Sudip Roy is the Co-founder and CTO of Adaption Labs, where he is building AI systems that continuously learn and adapt rather than relying on ever-larger foundation models. A former director at Cohere and longtime engineer at Google Brain and DeepMind — where he co-authored TensorFlow Extended (TFX) and helped build the Pathways system behind the Gemini models — Sudip joins Nataraj to unpack why the AI industry is pivoting from raw compute scaling to inference-layer efficiency, what is really driving inference costs, and how gradient-free continual learning could finally close AI's stubborn "last 5%" reliability gap.
Timestamps
- 0:00 — Introduction: AI's two trends — scaling compute vs. inference efficiency
- 1:34 — Sudip's background: Google Brain, TFX, and the Pathways system
- 4:28 — Joining Cohere and seeing the potential of large language models
- 5:33 — Specialized models vs. foundation models: what changed
- 10:26 — Why AI inference costs keep rising despite 300x cheaper tokens
- 14:14 — The autoregressive nature of LLMs and why it breaks traditional systems
- 18:18 — Founding Adaption Labs: the thesis and gradient-free continual learning
- 20:39 — Taking a full-stack view of AI
- 26:51 — Continual learning in practice: the customer support agent example
- 30:20 — Adaptive Data: solving the enterprise data problem
- 31:51 — AutoScientist: co-optimizing data and models
- 37:26 — Adaptive interfaces: task-specific ways to consume AI
- 45:27 — Closing AI's "last 5%" reliability gap
- 49:47 — The demand-supply gap and systemic efficiency
- 52:48 — The shift from training to inference and decentralized compute
Key Insights
Q: Why are AI inference costs still rising if the cost per token has dropped?
On a per-token basis, inference costs have fallen by roughly 300x. But total spending on AI has exploded because the number of tokens consumed per task has skyrocketed — reasoning models "think" through far more tokens, and agentic workloads spawn parallel background jobs that can run for hours. Sudip estimates demand has grown by a thousand to a million times, creating a demand-supply gap that makes inference feel expensive even as unit costs fall.
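To make that arithmetic concrete, here is a minimal sketch of how a lower unit price and higher total spend can coexist. It assumes "demand" means the number of tokens consumed and that spend is simply tokens multiplied by price per token; both are simplifying assumptions rather than anything stated in the episode, and the function name `spend_multiplier` is made up for illustration.

```python
def spend_multiplier(token_growth: float, unit_cost_drop: float) -> float:
    """Relative change in total spend, assuming spend = tokens * price per token.

    token_growth:   how many times more tokens are consumed than before
    unit_cost_drop: how many times cheaper each token has become
    """
    return token_growth / unit_cost_drop


# Figures quoted in the episode: roughly 300x cheaper per token,
# with demand up somewhere between 1,000x and 1,000,000x.
UNIT_COST_DROP = 300.0

for growth in (1_000, 1_000_000):
    multiplier = spend_multiplier(growth, UNIT_COST_DROP)
    print(f"{growth:>9,}x tokens -> {multiplier:,.1f}x total spend")
```

Under these assumptions, even the low end of the demand estimate leaves total spend about 3x higher, and the high end leaves it more than 3,000x higher, despite the 300x drop in unit cost.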
Q: What is gradient-free continual learning?
It is Adaption Labs' approach to letting AI systems improve continuously without updating the model's weights through gradient descent. Instead of waiting weeks or months for a new training run, the system learns from its interactions with the environment — human feedback, other agents, or sensors — so behavior changes feel almost instantaneous. The goal is intelligence that evolves naturally and gracefully as the world around it changes.
Q: Why does Adaption Labs take a "full-stack" view of AI instead of focusing only on the model?
Sudip argues that most recent AI innovation has moved out of the model itself and into the systems built around it — the interfaces, harnesses, and feedback loops. The interface is the primary mechanism for collecting feedback, the harness folds that feedback back into the model, and the model executes. By co-optimizing across all three layers, Adaption believes it can uncover solutions that are simply not possible by improving any single layer in isolation.
Q: What is Adaptive Data and who is it for?
Adaptive Data is Adaption Labs' first product. It transforms low-quality or missing data into high-quality supervised fine-tuning (SFT) datasets that enterprises can use to customize models. It is aimed at companies with sensitive private data they cannot use as-is, or that want a custom vertical model but are starting with little or no usable data — for example, teams serving low-resource languages or needing long-context training data.
About Sudip Roy
Sudip Roy is Co-founder and CTO of Adaption Labs, where he is building AI systems that continuously learn and adapt across the full stack rather than depending on ever-larger foundation models. He was previously a director at Cohere, where he led inference and shipped the serving and fine-tuning infrastructure. Before that, he spent years at Google Brain and DeepMind, where he co-authored TensorFlow Extended (TFX) — the platform powering production machine learning across Google — and helped build the Pathways system used to train and serve the Gemini family of models. He holds a PhD in data management systems from Cornell University.
About the Host
Nataraj Sindam is the creator of The Startup Project, a podcast featuring founders, investors, and operators building the future.