The Deep Feed

01 — Dwarkesh Podcast

The Data Black Hole

Why human intuition is a luxury AI cannot afford

By Dwarkesh Patel · 12 min read

Editor's note: A sobering look at the sheer scale of data required to mimic basic human competence.

We often speak of artificial intelligence as if it were a spark of logic, a sudden emergence of reason from clever math. This is a mistake. Intelligence, at its most practical level, is a matter of sample efficiency: how much information a system needs to consume before it can act reliably. Humans are remarkably efficient. A teenager can learn to drive a car after twenty hours of practice. A person can master a new tool in an afternoon. We possess a lifetime of accumulated physical intuition that allows us to bridge the gap between seeing and doing with minimal repetition.

The Trillion-Token Disparity

AI models operate on a different scale entirely. While a human might encounter roughly 200 million tokens of language in a lifetime, frontier models are trained on anywhere from 10 trillion to more than 100 trillion. This is not a minor difference; it is a gap of roughly 50,000 to 500,000 times. We are not building machines that think like us; we are building machines that observe more than any human could in a thousand lifetimes. This massive intake is the only way they can compensate for their lack of innate, biological efficiency. They do not 'understand' the world through experience; they statistically approximate it through sheer volume.

At the center of the glittering galaxy of AI capabilities lies an unimaginably massive black hole of data.

This data hunger explains why robotics and autonomous driving have lagged behind language models. To drive a car, a model needs to see the edge cases (the sudden pedestrian, the blinding glare, the black ice) millions of times. A human handles these after a few near-misses, leaning on instincts that evolution spent millions of years shaping. An AI needs the equivalent of centuries of driving compressed into massive datasets. We are essentially trying to build a brain by sewing together a billion different grafts of human expertise, creating a Frankenstein’s monster of statistical patterns.

The Data Requirements Gap

Human language exposure: ~200 million tokens per lifetime
Frontier AI training: 10 to 100+ trillion tokens
Robotics: Millions of hours of demonstration required for simple tasks
Self-driving: Orders of magnitude more data than human driving experience
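
Taking the language figures in the list above at face value, the size of the gap is a single division. The sketch below is only that arithmetic; the variable names are illustrative, and the token counts are the rough estimates quoted above, not measurements.

```python
# Rough arithmetic on the token figures quoted above (estimates, not measurements).
HUMAN_LIFETIME_TOKENS = 200e6      # ~200 million tokens per lifetime
FRONTIER_TOKENS_LOW = 10e12        # 10 trillion tokens
FRONTIER_TOKENS_HIGH = 100e12      # 100 trillion tokens


def lifetimes_of_language(training_tokens: float) -> float:
    """How many human lifetimes of language exposure a training run represents."""
    return training_tokens / HUMAN_LIFETIME_TOKENS


low = lifetimes_of_language(FRONTIER_TOKENS_LOW)
high = lifetimes_of_language(FRONTIER_TOKENS_HIGH)
print(f"{low:,.0f} to {high:,.0f} human lifetimes of language")
```

On these numbers, a 10-trillion-token training run is about 50,000 lifetimes of reading and listening, and a 100-trillion-token run is about 500,000.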

The current progress in AI is largely driven by widening the distribution of data that models are trained on rather than by improving how they learn from it. We are getting better at finding more data, not better at making the data go further. As long as the primary driver of intelligence is the sheer volume of input, the gap between human biological efficiency and machine statistical brute force will remain the defining characteristic of the field.

Key Takeaway

AI is not becoming more human; it is becoming more massive.

02 — Not Boring

The Midjourney Scanner

Moving from pixels to biology

By Packy McCormick · 10 min read

Editor's note: Midjourney is stepping out of the art studio and into the medical clinic.

Midjourney is known for generating stunning imagery from simple text prompts. Most expected their first hardware venture to be a digital canvas or a pair of smart glasses—something to help artists manifest their visions. They were wrong. The company is pivoting toward something far more fundamental: the human body. By launching Midjourney Medical, they are attempting to turn the high-resolution complexity of medical imaging into something as casual as a trip to the spa.

Ultrasonic CT: The 60-Second Scan

The proposed technology, which they call Ultrasonic CT, uses a pool of water and a ring of underwater sensors to map the body. Instead of the claustrophobic, expensive, and slow process of a traditional MRI, this system uses ultrasonic waves to send signals through the body from every angle. The goal is a full-body scan that takes no more than sixty seconds. It is a massive computational challenge: turning terabytes of acoustic data into a precise 3D map of the body, down to the millimetre.
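
To give a feel for the scale of the computation, here is a back-of-envelope sketch of the raw measurements a ring of sensors would produce. Every number in it is an assumption chosen for illustration: the sensor count, the ring diameter, and the use of roughly 1,480 m/s for the speed of sound in water. None of it comes from Midjourney.

```python
# Back-of-envelope sketch of a ring of ultrasonic sensors in water.
# All constants are illustrative assumptions, not published specifications.
SPEED_OF_SOUND_WATER_M_S = 1480.0   # approximate value for water near room temperature
RING_DIAMETER_M = 1.0               # hypothetical ring diameter
SENSOR_COUNT = 1024                 # hypothetical number of sensors in the ring


def transmit_receive_pairs(sensors: int) -> int:
    """Each sensor can transmit to every other sensor, giving ordered pairs."""
    return sensors * (sensors - 1)


def crossing_time_us(distance_m: float) -> float:
    """Time for a pulse to travel distance_m through water, in microseconds."""
    return distance_m / SPEED_OF_SOUND_WATER_M_S * 1e6


pairs = transmit_receive_pairs(SENSOR_COUNT)
print(f"{pairs:,} transmit-receive paths per ring position")
print(f"{crossing_time_us(RING_DIAMETER_M):.0f} microseconds to cross the ring")
```

With these made-up numbers, a single ring position already yields about a million paths, each a sub-millisecond timing measurement. Repeating that along the length of the body is where the data volume comes from.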

The goal is to make medical scanning as powerful as an MRI, but as casual as a trip to the spa.

This move represents a radical expansion of what a generative AI company can be. Midjourney is essentially betting that their expertise in turning complex, noisy data into coherent images can be applied to the most important data of all: our internal biology. They are moving from the realm of aesthetic creation to the realm of biological truth. It is a high-stakes gamble on the idea that compute and sophisticated imaging can democratise preventative healthcare.

The Midjourney Medical Vision

Speed: Full-body scans in under 60 seconds
Accessibility: Moving away from heavy, expensive MRI machines
Method: Using ultrasonic waves and massive compute to form 3D maps
Integration: Using AI to fill in gaps in low-resolution signals

There will be skeptics, of course. The leap from generating a picture of a cat to generating a medically accurate map of a human liver is enormous. However, the direction is clear. The frontier of AI is no longer just about chatbots or art; it is about the intersection of advanced computation and the physical reality of our existence.

Key Takeaway

Generative AI is moving from the screen to the skin.

03 — Stratechery

The Anthropic Paradox

Safety as a business strategy

By Stratechery · 8 min read

Editor's note: How a company's commitment to ethics can become its most effective shield.

Anthropic has carved out a unique position in the AI arms race. While competitors focus on raw power and speed, Anthropic has built its brand around the concept of safety. This is not just a technical goal; it has become a strategic superpower. By positioning themselves as the 'responsible' alternative, they gain a level of political and social capital that their more aggressive peers lack. This allows them to navigate regulatory scrutiny in a way that looks fundamentally different from the rest of the industry.

The Self-Serving Safety Narrative

There is a subtle tension in Anthropic's approach. Every action the company takes to ensure safety also serves its own business interests. When they implement constraints or refuse certain types of model access, it can be framed as a moral necessity, even when it also functions as a way to manage risk or comply with government pressure. This creates a powerful feedback loop: their commitment to safety gives them the license to act aggressively in their own interest, and those actions are then viewed through the lens of their safety mission.

Anthropic's safety superpower is that every action it takes looks, from the outside, to be self-serving, even as the company becomes convinced its motivations are pure.

This was recently tested by the Trump administration's export controls on the Fable model. The resulting chaos—where the model had to be made unavailable to US citizens—revealed the fragility of even the most 'safe' players. Anthropic is caught between the demands of national security and its own internal philosophy. They are attempting to build a company that is both a commercial powerhouse and a moral arbiter, a dual role that is increasingly difficult to maintain as the technology matures.

The Anthropic Strategy

Safety-first branding to mitigate regulatory risk
Using ethical constraints to manage business liability
Navigating the tension between commercial growth and moral positioning
Managing the impact of sudden government export controls

Ultimately, Anthropic is testing whether 'safety' can be a sustainable competitive advantage. In an industry defined by a race to the bottom on cost and a race to the top on capability, being the 'adult in the room' is a high-risk, high-reward strategy. If it works, they become the industry standard. If it fails, they may find themselves sidelined by the very speed they tried to moderate.

Key Takeaway

In the AI race, ethics is becoming a form of political leverage.

04 — Simon Willison

The GLM-5.2 Breakthrough

The rise of the open-weights heavyweight

By Simon Willison · 7 min read

Editor's note: A Chinese lab just released a model that challenges the dominance of US-based closed systems.

The dominance of closed-source AI models like GPT-5.5 and Claude is being challenged by a new wave of open-weights releases. The latest contender is GLM-5.2, released by the Chinese lab Z.ai. This is not a lightweight model for hobbyists; it is a 753-billion-parameter monster that has immediately claimed the top spot on several independent benchmarks. It represents a significant shift in the power balance of the AI industry, proving that the gap between proprietary systems and open-weights models is closing rapidly.

Performance vs. Efficiency

While GLM-5.2 is a leader in intelligence, it comes with a heavy cost: it is incredibly token-hungry. It uses significantly more output tokens per task than its competitors. This makes it highly capable but potentially expensive to run at scale. It is a model designed for raw performance, capable of handling massive 1-million-token context windows, but it requires significant computational resources to extract that intelligence. It is a brute-force approach to open-weights excellence.
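
Why token hunger matters is easy to see with a little arithmetic: the cost of a task is output tokens multiplied by price per token, so a model with a lower per-token price can still cost more per task. The prices and token counts below are made-up placeholders, not figures for GLM-5.2 or any other model.

```python
# Cost per task = output tokens used * price per token.
# All numbers are hypothetical placeholders chosen for illustration.
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    usd_per_million_output_tokens: float
    output_tokens_per_task: int

    def cost_per_task(self) -> float:
        """Dollar cost of one task, counting output tokens only."""
        return self.output_tokens_per_task * self.usd_per_million_output_tokens / 1_000_000


terse = ModelProfile("terse-model", usd_per_million_output_tokens=10.0, output_tokens_per_task=2_000)
verbose = ModelProfile("verbose-model", usd_per_million_output_tokens=4.0, output_tokens_per_task=8_000)

for model in (terse, verbose):
    print(f"{model.name}: ${model.cost_per_task():.3f} per task")
```

In this invented comparison the verbose model is less than half the price per token, yet it costs more per task because it emits four times as many tokens.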

GLM-5.2 is the new leading open weights model, challenging the supremacy of closed-source giants.

One of the most impressive aspects of GLM-5.2 is its ability to handle complex coding and web development tasks. Despite being a text-only model without native vision capabilities, it ranks exceptionally high on web development leaderboards. It can generate self-contained, animated SVG illustrations that work perfectly—a feat that many multimodal models struggle with. This suggests that pure linguistic and logical reasoning can often substitute for visual training in specific, highly structured domains.

GLM-5.2 Specifications

Parameters: 753B (Mixture of Experts)
Context Window: 1 million tokens
License: MIT (Open Weights)
Key Strength: High-level coding and SVG generation

The release of GLM-5.2 is a signal to the industry. The era of the 'closed moat' is under threat. As open-weights models continue to catch up to the frontier, the value of AI companies will shift from who owns the best model to who provides the best ecosystem, the best reliability, and the most seamless integration.

Key Takeaway

The moat around closed AI models is evaporating.

05 — Simon Willison

The Sandbox Revolution

Building secure, interactive data apps

By Simon Willison · 6 min read

Editor's note: A new way to run untrusted code safely within data environments.

The challenge of modern web development is often a choice between two extremes: total freedom or total restriction. If you want to give users the ability to run custom code, you open yourself up to massive security risks. If you restrict them too much, the tool becomes useless. Simon Willison's new 'Datasette Apps' attempts to find a middle ground through a sophisticated use of sandboxing and security headers.

The Security of the Iframe

Datasette Apps are self-contained HTML and JavaScript applications that run inside a tightly constrained iframe. By combining the `sandbox` attribute with a strict Content Security Policy (CSP), Willison has created an environment where untrusted code can run without being able to access cookies, steal secrets from local storage, or make unauthorised requests to outside servers. It is a way to allow 'vibe-coded' or AI-generated tools to interact with sensitive data without the risk of exfiltration.
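
The general pattern can be sketched in a few lines. This is not Datasette's code: the function name, the exact policy string, and the choice to deliver the policy through a `<meta>` tag inside `srcdoc` are all assumptions made for illustration. The two standard pieces are the iframe `sandbox` attribute without `allow-same-origin`, which gives the framed document an opaque origin, and a CSP whose `default-src 'none'` blocks outbound requests.

```python
import html

# Assumed policy for illustration: block everything by default,
# then allow only inline scripts and styles inside the frame.
CSP = "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'"


def build_sandboxed_iframe(untrusted_html: str) -> str:
    """Wrap untrusted HTML in an iframe that can run scripts but not reach out."""
    # The policy comes first so it is in force before any untrusted markup is parsed.
    document = (
        f'<meta http-equiv="Content-Security-Policy" content="{CSP}">'
        + untrusted_html
    )
    # sandbox="allow-scripts" without allow-same-origin gives the frame an opaque
    # origin, so it cannot read the host page's cookies or localStorage.
    # html.escape keeps the document from breaking out of the srcdoc attribute.
    escaped = html.escape(document, quote=True)
    return f'<iframe sandbox="allow-scripts" srcdoc="{escaped}"></iframe>'


untrusted = "<p>hello</p><script>document.querySelector('p').textContent += ' world'</script>"
print(build_sandboxed_iframe(untrusted))
```

A policy sent as an HTTP response header on the framed document is the more common production choice; the `<meta>` form is used here only to keep the sketch self-contained.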

The magic combination is a sandboxed iframe paired with an immutable Content Security Policy.

This approach is particularly relevant in the age of LLM-generated code. As tools like Claude Artifacts become more popular, the need to run unverified, AI-written code safely becomes paramount. Datasette Apps provide a blueprint for how we can integrate these interactive, ephemeral tools into persistent data systems. It turns a simple database into a platform for custom, interactive applications.

Key Features of Datasette Apps

Sandboxed execution via iframes
Immutable CSP headers to prevent data exfiltration
Controlled communication via MessageChannel()
Read-only and write-query capabilities

By solving the security problem, Willison has unlocked a new way to interact with data. We are moving toward a future where data is not just something we query through a terminal, but something we interact with through a custom-built, temporary interface generated on the fly.

Key Takeaway

Security doesn't have to be a barrier to interactivity.

06 — Stratechery

The E-Commerce Shift

Shopping in the age of agents

By Stratechery · 9 min read

Editor's note: How AI agents are rewriting the rules of distribution and retail.

E-commerce has always been a battle of distribution. For decades, the winners were those who controlled the storefront—Amazon, Shopify, or the search engines that directed traffic to them. But the rise of AI agents introduces a new, unpredictable variable. When a customer no longer browses a list of products but instead asks an agent to 'find the best organic coffee for a French press,' the traditional levers of marketing and SEO begin to lose their grip.

From Browsing to Delegating

The shift is from a referral model to a delegation model. In the old world, a brand's goal was to be seen by a human. In the new world, a brand's goal is to be understood by an agent. This requires a different kind of data and a different kind of presence. If an agent is making the decision based on specifications, reviews, and price, then the 'vibe' of a brand matters less than its verifiable attributes. The battleground is shifting from visual persuasion to data accuracy.

The future of e-commerce is not about being seen by humans, but about being selected by agents.

This creates a massive challenge for established players. Shopify, for instance, must ensure its merchants are not just visible to humans, but also 'agent-friendly.' This might mean providing more structured data, better API access, and more transparent pricing models. The companies that survive this transition will be those that can bridge the gap between the emotional experience of human shopping and the logical requirements of machine-led procurement.
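
One existing way to expose that kind of structured data is schema.org's `Product` vocabulary serialised as JSON-LD. The sketch below builds such a record in Python; the product, price, and rating are invented, and whether any particular shopping agent reads this markup is an assumption.

```python
import json

# Hypothetical product described with schema.org's Product vocabulary.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Organic Coffee, Coarse Grind",
    "description": "Organic medium roast, ground coarse for French press.",
    "brand": {"@type": "Brand", "name": "Example Roasters"},
    "offers": {
        "@type": "Offer",
        "price": "14.50",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "212",
    },
}

# A page would embed this inside <script type="application/ld+json">...</script>.
json_ld = json.dumps(product, indent=2)
print(json_ld)
```

Every field here is something an agent can compare directly: price, currency, availability, rating. That is the sense in which verifiable attributes start to matter more than brand atmosphere.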

The New E-Commerce Realities

Shift from human-centric SEO to agent-centric data structures
Increased importance of verifiable product attributes
The potential decline of traditional brand 'vibe' in favor of utility
The need for platforms like Shopify to adapt to automated buyers

We are entering an era of 'headless' retail. The storefront is disappearing, replaced by a layer of intelligent intermediaries. For agency owners and brand builders, the task is no longer just about creating beautiful content; it is about ensuring that your brand's value proposition is mathematically clear to the algorithms that will soon be doing the shopping.

Key Takeaway

When agents shop, data becomes the new marketing.