The Simulation Gap: Why AI Struggles with the Real World
The bottleneck to progress isn't intelligence, but the ability to practice in a sandbox.
The current race in artificial intelligence is built on a massive bet. Labs are pouring billions into training models to complete millions of verifiable tasks across diverse environments. The logic is straightforward: if an AI can learn to solve open-ended problems in a thousand different simulated worlds, it will develop the general reasoning skills required for AGI. This approach assumes that current limitations, like data inefficiency, can be crushed by sheer scale, much like how massive compute solved many of the early hurdles in natural language processing. We are moving from models that predict the next word to models that act to achieve a goal.
The Verifiability Trap
However, there is a wide gap between solving a math problem and using a computer. Coding and mathematics are highly verifiable: you can run a piece of code, see if it works, and immediately know whether the agent succeeded. This allows for 'grindable' training, in which the agent attempts the same kind of task over and over, cheaply and at scale. You can run a thousand parallel agents in a software container, each trying to fix a bug, and the feedback loop is instant and deterministic. This is the engine of progress for coding models. But the real world is not a clean container. It is messy, unpredictable, and, most importantly, not designed to be simulated at scale for training purposes.
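As a rough illustration of what 'grindable' means here, the sketch below runs many seeded attempts in parallel against a deterministic checker. Everything in it is a toy assumption made for this example: the `CANDIDATES` patches, the `verify` test cases, and the `rollout` and `grind` helpers are hypothetical, and a real agent would generate patches rather than pick from a fixed list.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical patches an agent might propose for a buggy square() function.
CANDIDATES = [
    lambda x: x + x,       # wrong: doubles instead of squaring
    lambda x: x * 2,       # wrong: same mistake, written differently
    lambda x: abs(x) * x,  # wrong: gives a negative result for negative inputs
    lambda x: x * x,       # correct
]

def verify(candidate):
    """Deterministic check: the patch passes only if every test case matches."""
    tests = [(2, 4), (3, 9), (-5, 25)]
    return all(candidate(x) == expected for x, expected in tests)

def rollout(seed):
    """One agent attempt: choose a patch from a seeded RNG, then verify it."""
    rng = random.Random(seed)
    idx = rng.randrange(len(CANDIDATES))
    return idx, verify(CANDIDATES[idx])

def grind(n_agents=1000, workers=8):
    """Run many attempts in parallel and count how many passed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(rollout, range(n_agents)))
    successes = sum(ok for _, ok in results)
    return results, successes

if __name__ == "__main__":
    results, successes = grind()
    print(f"{successes} of {len(results)} attempts passed the checker")
```

Because each attempt is seeded and the checker has no hidden state, rerunning `grind()` reproduces exactly the same results, which is the property the live web does not offer.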
It is not enough for a domain to be verifiable. It also has to be very grindable.
Consider the task of using a web browser. If you want to train an AI to buy a product on Amazon or book a holiday on Expedia, you cannot simply run ten thousand parallel agents against the live checkout flow. The site will detect the bots, trigger CAPTCHAs, or ban the accounts. Unlike a coding environment, the internet is a hostile, non-deterministic space, and that creates a bottleneck. To make progress, we need high-fidelity, replayable simulators: digital twins of the world where an AI can fail ten million times without getting banned or breaking a real-world law. Currently, building these clones is too labour-intensive to scale.
A grindable training environment needs four things:
- High-speed feedback loops
- Deterministic and replayable environments
- Scalable parallel rollouts
- Verifiable success metrics
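To make these properties concrete, here is a minimal sketch of a seeded, replayable environment. `MockCheckoutEnv` is a hypothetical toy invented for this example: its three-step flow, its 20% simulated failure rate, and its `reset`/`step` interface are all assumptions, and it does not model any real site.

```python
import random

class MockCheckoutEnv:
    """Hypothetical toy stand-in for a booking flow; not a model of a real site."""

    STEPS = ["search", "select", "pay"]

    def __init__(self, seed):
        self.seed = seed
        self.reset()

    def reset(self):
        # Re-seeding on every reset is what makes an episode replayable.
        self.rng = random.Random(self.seed)
        self.stage = 0
        self.done = False
        return self.stage

    def step(self, action):
        if self.done:
            raise RuntimeError("episode finished; call reset()")
        # Seeded noise: a simulated transient failure on roughly 20% of steps.
        flaky = self.rng.random() < 0.2
        if action == self.STEPS[self.stage] and not flaky:
            self.stage += 1
        self.done = self.stage == len(self.STEPS)
        # Verifiable success metric: reward only when the flow completes.
        reward = 1.0 if self.done else 0.0
        return self.stage, reward, self.done

def run_episode(env, actions):
    """Replay a fixed action list and return the (stage, reward, done) trace."""
    env.reset()
    trace = []
    for action in actions:
        if env.done:
            break
        trace.append(env.step(action))
    return trace

if __name__ == "__main__":
    # Repeat each action a few times so a transient failure can be retried.
    plan = ["search"] * 3 + ["select"] * 3 + ["pay"] * 3
    traces = [run_episode(MockCheckoutEnv(seed), plan) for seed in range(100)]
    completed = sum(1 for t in traces if t and t[-1][2])
    print(f"{completed} of {len(traces)} episodes completed")
```

The same seed and the same actions always yield the same trace, so a failure can be replayed and inspected, and the independent seeds can be farmed out to as many workers as are available. Getting the same guarantees from a faithful clone of a real website is the hard part.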
This suggests that the next leap in AI won't just come from bigger models, but from better environments. If we can't build a simulator for winning a court case, running a business, or winning an election, then the AI will likely remain stuck in the realm of digital tasks. The ability to learn 'on the job' depends on whether we can provide a safe, repeatable playground for the intelligence to test its hypotheses. Until we solve the simulation problem, the most useful AI might remain confined to the very containers we've built for it.
AI progress is limited not by how much the model knows, but by how effectively it can practice in a controlled environment.