The Deep Feed

01 — Simon Willison

The $1,500 Ceiling

Why Uber is capping its engineers' AI spending

By Simon Willison · 6 min read

Editor's note: A pragmatic look at how the era of unlimited token-burning is hitting the reality of corporate balance sheets.

Uber recently burned through its entire 2026 AI budget in just four months. This wasn't a failure of strategy, but a failure of foresight. Budgets set in 2025 could not account for the sudden, explosive popularity of agentic coding tools like Claude Code or Cursor. These tools do not just suggest lines of code; they consume tokens at a rate that can exhaust a departmental budget if left unchecked. Uber's response is a hard limit: $1,500 per month, per employee, per tool. It is a blunt instrument, but a necessary one.

The Math of Engineering Productivity

To understand why $1,500 matters, we have to look at the cost of a human engineer. In the US, the median compensation for an Uber software engineer sits around $330,000. By capping AI tool spending at $1,500 per month per tool, or $36,000 per year assuming two tools, Uber is effectively saying that it is willing to spend roughly 11% of an engineer's compensation on their digital co-pilot. This is a calculated bet. If the AI doesn't provide a return on that 11% investment, the tool is a net loss for the company.
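The arithmetic behind the 11% figure is easy to check. The sketch below uses only the numbers quoted in the article: the $1,500 monthly cap per tool, the two-tool assumption, and the $330,000 median compensation. The function names are illustrative, not anything Uber has published.

```python
# Figures quoted in the article; the two-tool count is the article's own assumption.
MONTHLY_CAP_PER_TOOL = 1_500   # USD per month, per employee, per tool
TOOLS_PER_ENGINEER = 2         # assumed number of capped tools in use
MEDIAN_COMPENSATION = 330_000  # USD per year


def annual_cap(monthly_cap: int, tools: int) -> int:
    """Maximum yearly AI spend for one engineer across all capped tools."""
    return monthly_cap * 12 * tools


def cap_share(monthly_cap: int, tools: int, compensation: int) -> float:
    """Yearly cap expressed as a fraction of annual compensation."""
    return annual_cap(monthly_cap, tools) / compensation


yearly = annual_cap(MONTHLY_CAP_PER_TOOL, TOOLS_PER_ENGINEER)
share = cap_share(MONTHLY_CAP_PER_TOOL, TOOLS_PER_ENGINEER, MEDIAN_COMPENSATION)
print(f"{yearly:,} USD per year, {share:.1%} of compensation")
# prints "36,000 USD per year, 10.9% of compensation"
```

With a single tool the ceiling halves to $18,000 a year, or about 5.5% of compensation, so the headline percentage depends heavily on how many tools an engineer actually uses.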

An AI spending cap of 11% of an engineer's compensation is a rational bet on productivity, not a sign of scarcity.

This move signals a shift away from the 'token-maxxing' culture. For the past year, the industry has been obsessed with who can run the largest models and consume the most context. We saw leaderboards encouraging engineers to compete for the highest usage. Uber is ending that competition. They are treating AI tokens like electricity or cloud compute: a utility that must be metered and managed, rather than an infinite resource for experimentation.

The logic of the cap

Prevents budget exhaustion early in the year
Establishes a clear ROI threshold for AI tools
Moves AI from an experimental luxury to a managed utility
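To make the idea of a metered utility concrete, here is a minimal sketch of a per-employee, per-tool, per-month spend meter. It is purely illustrative: the article does not describe how Uber enforces the cap, and the `SpendMeter` class, its method names, and the identifiers in the usage lines are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


class SpendMeter:
    """Hypothetical meter: tracks spend per (employee, tool, month) against a cap."""

    def __init__(self, monthly_cap: Decimal = Decimal("1500")) -> None:
        self.monthly_cap = monthly_cap
        # Decimal() is zero, so unseen keys start with no recorded spend.
        self._spent = defaultdict(Decimal)

    def remaining(self, employee: str, tool: str, month: str) -> Decimal:
        """Budget left for this employee, tool, and month."""
        return self.monthly_cap - self._spent[(employee, tool, month)]

    def record(self, employee: str, tool: str, month: str, cost: Decimal) -> bool:
        """Record a charge if it fits under the cap; refuse it otherwise."""
        key = (employee, tool, month)
        if self._spent[key] + cost > self.monthly_cap:
            return False
        self._spent[key] += cost
        return True


meter = SpendMeter()
first = meter.record("eng-1", "tool-a", "2026-05", Decimal("1400"))   # accepted
second = meter.record("eng-1", "tool-a", "2026-05", Decimal("200"))   # refused: over cap
other = meter.record("eng-1", "tool-b", "2026-05", Decimal("200"))    # accepted: separate tool
```

Keying the meter by tool as well as by employee mirrors the "per employee, per tool" wording of the cap: hitting the limit on one tool leaves the budget for another untouched.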

The real question is whether $1,500 is enough. For a developer working on complex, agentic tasks that require deep reasoning and massive context windows, this limit might feel like a tether. If the most productive work requires the most tokens, then capping tokens effectively caps what an engineer can achieve. Uber is betting that the most valuable work happens within these bounds.

Key Takeaway

AI tools are transitioning from experimental toys to managed corporate utilities with strict ROI requirements.

02 — Stratechery

The End of Windows

Satya Nadella and the shift toward unmetered intelligence

By Stratechery · 12 min read

Editor's note: An analysis of whether Microsoft is still a software company or something entirely different.

At the recent Microsoft Build conference, Satya Nadella appeared less like a software executive and more like an infrastructure architect. The focus has shifted away from the operating system and toward the AI stack. For decades, Windows was the organizing principle of the Microsoft empire. But in the age of generative models and autonomous agents, the OS is becoming a secondary concern. Nadella seems to understand that the real value lies in the platform that facilitates intelligence, regardless of where that intelligence resides.

The Silicon Bet

The unveiling of the Nvidia RTX Spark chip, developed alongside Microsoft, highlights this transition. This is an attempt to bring 'unmetered intelligence' to the edge—directly onto the personal computer. The goal is to allow autonomous agents to run locally, handling long-running tasks with massive context windows without constantly pinging the cloud. It is a play for the desktop, but it is not a play for Windows as we know it. It is a play for the hardware that can host the next generation of agents.

The operating system is no longer the center of the universe; the intelligence layer is.

However, there is a tension here. While Nvidia and Microsoft push for local AI PCs, the reality of modern AI (the need for massive memory bandwidth and specialized reasoning) often makes cloud inference superior. A chip that devotes too much die area to GPU cores at the expense of the CPU might be great for a 2023-era chatbot, but it may struggle with the agentic workflows of 2026. The hardware must keep up with the software's appetite for reasoning, not just pattern matching.

The challenges of the AI PC

Balancing GPU power with CPU performance for agents
Managing the massive memory requirements of long context
Overcoming the software limitations of Windows on ARM

Nadella's Microsoft is moving toward a model where they provide the platform and the brand permission for companies to build on top of it. They are no longer interested in being the company that makes every tool; they want to be the company that makes the tools possible. This is a move from product to ecosystem, a transition that requires letting go of the old guard of software dominance.

Key Takeaway

Microsoft is pivoting from an operating system company to an intelligence platform company, de-emphasizing Windows to capture the AI stack.

03 — Dwarkesh Podcast

The Scarcity of Being Human

What remains valuable in an age of AGI?

By Dwarkesh Patel · 15 min read

Editor's note: A deep dive into the economic fallout of automation and the survival of human value.

As artificial intelligence approaches the ability to perform almost any cognitive or physical task, economists are grappling with a fundamental question: what will be scarce? In a world where intelligence and labor are effectively infinite and near-zero cost, the traditional drivers of value collapse. If a machine can design a building, write a legal brief, and manage a supply chain, the value of those skills evaporates. We are looking at a potential decoupling of productivity from human labor.

The Relational Sector

Alex Imas of Google DeepMind suggests that the next frontier of value is the 'relational sector'. This refers to goods and services where the human element is not a bug, but the primary feature. We don't just want a coffee; we want the interaction with the barista. We don't just want a performance; we want to know a human being is expressing something. In this scenario, scarcity moves from the 'what' to the 'who'. The value accrues to the things that are inherently human-to-human.

When intelligence becomes a commodity, human presence becomes a luxury.

This creates a strange economic bifurcation. On one side, you have a massive, hyper-efficient 'machine economy' that produces goods and services at scale. On the other, you have a 'human economy'—a smaller, more expensive sector where humans perform services for other humans. The tension lies in how wealth flows between these two. If the machine economy generates all the surplus, how does that wealth reach the humans living in the relational sector?

Candidates for future scarcity

Human-in-the-loop services
Physical presence and interpersonal connection
Unique, non-replicable human experiences

The risk is that the human economy becomes a shrinking island in a sea of automated abundance. If the machine economy is a closed loop—where machines build machines and optimize for their own efficiency—the human element could be sidelined entirely. The challenge for policymakers is not just taxing AI, but ensuring that the wealth generated by the machine-only economy is redistributed to sustain the human-only one.

Key Takeaway

In an automated world, value will migrate from cognitive tasks to the relational and human-centric sectors.

04 — Lenny's Newsletter

The Identity Clone

The 15-minute path to a digital avatar

By Claire Vo · 8 min read

Editor's note: A practical look at the collapsing cost of video production and the rise of the digital twin.

The barrier to entry for high-end video production has just been demolished. Using Google's Gemini Omni and Flow, it is now possible to create a convincing AI avatar of yourself in under fifteen minutes. This isn't just about making a funny clip; it is about the ability to scale your presence. A single person can now act as their own director, cinematographer, and actor, generating professional-grade content without a studio, a crew, or a budget.

The Workflow of the Self

The process is deceptively simple: a phone scan of your face, a prompt to generate a storyboard, and then the iterative generation of video scenes. The real breakthrough is character consistency: the ability to ensure that the digital version of you looks the same in a boardroom as it does on a hiking trail. Once the scenes are generated, they are stitched together in a browser-based editor. What used to take a week of post-production now takes about as long as drinking a coffee.

Video AI tools unlock creative possibilities for people with zero production skills.

However, the technology is not yet perfect. The 'uncanny valley' remains a significant hurdle. There are moments where the physics of a movement feel slightly off, or where the emotional expression doesn't quite match the intent. These glitches serve as a reminder that while we can mimic the appearance of a human, we are still learning to replicate the subtle, non-linear nuances of human movement and emotion.

The new production stack

AI as creative producer (storyboarding)
AI as actor (avatar generation)
AI as editor (scene stitching)

The implication for marketing and personal branding is massive. We are entering an era of hyper-personalized content. If you can clone yourself for pennies, you can be everywhere at once. You can deliver a thousand different messages to a thousand different people, all while maintaining the 'face' of the brand. The question is no longer about whether you can produce content, but whether you can maintain authenticity when your presence is automated.

Key Takeaway

The collapse of video production costs enables individual scaling but threatens the value of human presence.

05 — Simon Willison

The Licensing Lie

Microsoft's 'clean data' claim vs. the reality of the web crawl

By Simon Willison · 7 min read

Editor's note: A critical look at the tension between corporate claims of ethical AI and the reality of training data.

Microsoft recently announced two new models, MAI-Thinking-1 and MAI-Code-1-Flash, with a specific marketing angle: they were trained on 'enterprise-grade, clean and commercially licensed data'. In an industry currently reeling from copyright lawsuits, this was a bold claim. It suggested that Microsoft had found a way to bypass the messy, legally dubious practice of scraping the open web to build their intelligence. It promised a new era of 'clean' AI.

The Technical Reality

The reality, as revealed in the technical papers, is far less tidy. Despite the branding, the models are still built on a massive crawl of the public web. Microsoft uses a proprietary crawl of 1.2 trillion pages, which is then filtered down to 794 billion pages by removing adult content, piracy sites, and—crucially—content identified as being AI-generated. While they have added layers of filtering, the foundation remains the same: a massive, uncompensated ingestion of the internet's collective output.

The distinction between 'clean' and 'unlicensed' is becoming a matter of marketing rather than legality.

This discrepancy highlights a growing trend in the AI industry: the use of 'clean' as a euphemism for 'filtered'. By removing the most obviously problematic content, companies can claim a higher standard of data ethics without actually changing the fundamental nature of their data acquisition. They are not moving away from the web crawl; they are simply trying to make the crawl look more professional.

The components of the Microsoft corpus

Proprietary web crawl (794B pages)
Common Crawl (24.2B pages)
Filtered for AI-generated content and piracy
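A quick calculation puts these page counts in proportion. The sketch below uses only the figures quoted in the article. It assumes the filtered proprietary crawl and Common Crawl are simply added together with no deduplication between them, which the article does not confirm.

```python
# Page counts in billions, as quoted in the article.
RAW_CRAWL = 1_200        # proprietary crawl before filtering (1.2 trillion)
FILTERED_CRAWL = 794     # proprietary crawl after filtering
COMMON_CRAWL = 24.2      # Common Crawl contribution

# How much of the proprietary crawl survives the filters.
retained = FILTERED_CRAWL / RAW_CRAWL
removed_pages = RAW_CRAWL - FILTERED_CRAWL

# Assumption: the corpus is the plain sum of the two sources (no overlap removed).
combined = FILTERED_CRAWL + COMMON_CRAWL
proprietary_share = FILTERED_CRAWL / combined

print(f"retained after filtering: {retained:.1%}")                    # 66.2%
print(f"pages removed: {removed_pages}B")                              # 406B
print(f"proprietary crawl share of corpus: {proprietary_share:.1%}")  # 97.0%
```

On these numbers the filters discard about a third of the crawl, and the proprietary web crawl still accounts for nearly all of the page count.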

This matters because the legal and ethical foundation of these models is still being contested. If the 'clean' data is still just the public web with a better filter, then the underlying copyright issues remain unresolved. Microsoft is attempting to build a fortress of legitimacy, but it rests on the same contested ground as every other major LLM.

Key Takeaway

Corporate claims of 'clean' training data often mask the reality of continued, large-scale web scraping.

06 — Stratechery

The Reasoning Era

Why the AI PC might be a step backward

By Stratechery · 9 min read

Editor's note: An analysis of why the current hardware push might be misaligned with the direction of AI software.

The tech industry is currently obsessed with the 'AI PC'. The idea is to bring intelligence to the edge, putting powerful chips in our laptops so we can run models locally. Nvidia and Microsoft are leading this charge with the RTX Spark chip. On paper, it looks like a breakthrough. It promises more privacy, lower latency, and the ability to run autonomous agents without a constant internet connection. But if you look closely at the architecture, the logic starts to crumble.

The Misalignment of Hardware and Software

The evolution of AI has moved through distinct phases. We started in the 'chatbot era', where basic local inference at low latency was the goal. Then we moved into the 'reasoning era', which placed massive demands on memory bandwidth and KV cache. Now, we are entering the 'agentic era'. In this phase, the most important capability is not just generating tokens, but the ability to perform complex, multi-step reasoning tasks. This requires strong CPU performance to manage the logic and the orchestration of various tools.

The ideal local agent needs a strong CPU, not just a massive GPU.

The RTX Spark chip makes a fundamental error in its allocation of die space. It pours resources into GPU cores that, while impressive, are still inferior to the massive scale of cloud-based GPUs. In doing so, it sacrifices the CPU performance necessary for the agentic workflows of 2026. It is a chip designed for the chatbot of 2023, not the autonomous agent of today. It is trying to solve a problem that is already being solved more efficiently in the cloud.

The three eras of AI requirements

Chatbot Era: Basic inference and low latency
Reasoning Era: High memory bandwidth and KV cache
Agentic Era: High CPU performance and orchestration
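To give a sense of scale for the KV cache demand in the reasoning era, here is a rough sizing sketch. It uses the standard estimate for a transformer's key-value cache for a single sequence, ignoring quantization, paging, and other optimizations. The model shape in the example is hypothetical and chosen only for illustration; it is not the specification of any model or chip mentioned in the article.

```python
def kv_cache_bytes(
    layers: int,
    kv_heads: int,
    head_dim: int,
    context_tokens: int,
    bytes_per_value: int = 2,  # 2 bytes per value corresponds to 16-bit precision
) -> int:
    """Approximate KV cache size in bytes for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical model shape: 32 layers, 8 KV heads of dimension 128, 128K-token context.
size = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, context_tokens=131_072)
print(f"{size / 2**30:.1f} GiB")  # prints "16.0 GiB"
```

The cache grows linearly with context length, so doubling the context doubles the memory needed for the cache alone, before counting the model weights. That is the pressure on local memory that long-running agents create.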

The push for AI PCs may be more about selling hardware than about the actual needs of the software. There is a massive incentive for Nvidia and Microsoft to create a new category of device, even if that device is fundamentally misaligned with the direction of the frontier models. We may find ourselves buying expensive silicon that is optimized for a version of AI that is already becoming obsolete.

Key Takeaway

Current AI PC hardware is being optimized for yesterday's models rather than the CPU-heavy requirements of tomorrow's agents.