
Beyond Tokens: How AI Learned to Feel the Weight of the World

Monday, 8 June 2026

Matthew Kenneth McDaid

A rain-slicked British street at dawn overlaid with a faint 3D wireframe twin

Today's chatbots read the world as words. A new kind of AI — the world model — is learning to feel how it moves, weathers and breaks. Here's why that changes your business.

Key takeaways

 

  • Today's chatbots read the world as words. They can describe rain; they can't feel that it makes a pavement slippery.
  • A new kind of AI — the world model — builds an internal, three-dimensional sense of how things move, fall, weather and break.
  • Between late 2025 and early 2026 this stopped being theory: in November 2025 World Labs shipped Marble, which turns a sentence into a navigable 3D world, and on 31 March 2026 Britain published its first national digital-twin standard, BS EN 18162:2026.
  • The lesson for your business: machines are learning to reason about the real, physical world in structured states. If your business only exists as loose words, the next generation of AI can't see it. Structure it now.

 

 

The Mundane: a wet Tuesday outside Birmingham New Street

Step out of Birmingham New Street at half past seven on a wet Tuesday. The pavement is a sheet of grey light. Brake dust and diesel film have turned the kerb a greasy black. Across the road, a Victorian terrace wears a century of soot like a coat, and on its north-facing gable a soft green bloom of algae is creeping up the damp brick. A bus hisses to a stop. Everything here obeys laws — gravity, friction, the slow chemistry of weather on stone — and not one of them is written in words.

 

Now take out your phone and ask a chatbot about that scene. It will describe it beautifully. And it will not, in any real sense, know it. It has read the menu. It has never tasted the meal.

 

The Machine: why today's AI is reading, not living

The chatbots that amazed everyone work by chopping the world into tokens, little fragments of text, and predicting the next one. They are spectacular pattern-matchers. Ask one to trace the arc of a thrown ball and it will produce a perfect paragraph, because it has read ten thousand physics textbooks. But it has no internal simulator. It does not carry gravity, wind or the sudden stop of impact inside it. It knows how the words about a falling ball tend to follow each other. It does not know the fall.
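To make "predicting the next one" concrete, here is a deliberately tiny sketch. It is not how a real chatbot is built: production systems use neural networks over sub-word tokens, whereas this toy simply counts which word follows which in three made-up sentences. The corpus and the function names (train_bigrams, predict_next) are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus: three made-up sentences about a ball (illustration only).
corpus = (
    "the ball falls to the ground . "
    "the ball falls to the floor . "
    "the ball rolls to the ground ."
)

def train_bigrams(text):
    """Count which token follows which. These counts are the whole 'model'."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, token):
    """Return the continuation seen most often, or None if the token is unseen."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]

model = train_bigrams(corpus)
print(predict_next(model, "ball"))     # "falls": seen twice, against "rolls" once
print(predict_next(model, "gravity"))  # None: the word never appeared in the text
```

The toy "knows" that falls tends to follow ball. It holds nothing about height, speed or impact, which is the gap the rest of this section describes.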

 

This is the ceiling of the token mindset. These systems are brilliant at abstraction and hopeless at object permanence — the plain fact a toddler grasps, that a thing still exists when it rolls behind the sofa. They reason about descriptions of reality, not reality.

 

A different design is now climbing past that ceiling: the world model. Instead of strings of words, it builds a continuous, high-dimensional picture of an environment: a space where the position of a thing, its motion, its temperature and its material all live as related quantities the system can run forward in time. Researchers such as Yoshua Bengio push this further with causal learning, which aims to capture what causes what rather than what merely tends to appear together. Meta's JEPA approach trains models to predict the abstract state of a scene rather than every pixel.
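A minimal sketch of what "quantities the system can run forward in time" means, under heavy simplifying assumptions. This is not how JEPA or any commercial world model works: those learn their state from data, while here the state (height and velocity of a ball) and the rule (gravity, with a simple Euler update) are written by hand. The names BallState, step and time_to_ground are invented for illustration.

```python
from dataclasses import dataclass

GRAVITY = -9.81  # metres per second squared, acting downwards

@dataclass
class BallState:
    height: float    # metres above the ground
    velocity: float  # metres per second, positive is up

def step(state, dt):
    """Advance the state by dt seconds with a simple Euler update."""
    velocity = state.velocity + GRAVITY * dt
    height = state.height + velocity * dt
    if height <= 0.0:  # the sudden stop of impact
        return BallState(height=0.0, velocity=0.0)
    return BallState(height=height, velocity=velocity)

def time_to_ground(state, dt=0.01, max_steps=100_000):
    """Run the state forward until the ball reaches the ground."""
    elapsed = 0.0
    for _ in range(max_steps):
        if state.height <= 0.0:
            return elapsed
        state = step(state, dt)
        elapsed += dt
    raise RuntimeError("ball did not land within max_steps")

# Drop a ball from 2 metres and ask the simulator, not a paragraph, when it lands.
t = time_to_ground(BallState(height=2.0, velocity=0.0))
print(f"lands after roughly {t:.2f} s")
```

The answer comes from rolling a state forward step by step rather than from recalling how sentences about falling usually end.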

 

In November 2025 this left the lab. World Labs, the company co-founded by Fei-Fei Li, released its first commercial product, Marble, which turns a line of text or a single photo into a continuous, walkable 3D world. The company calls the shift spatial intelligence. Marble blends 3D Gaussian splatting for photorealistic surfaces with an underlying geometric mesh for the bones the physics hangs on. The internet's base layer is starting to move from flat text to streamable space, what some now call 3D as code.

 

The British street, now machine-readable

The British Standards Institution published BS EN 18162:2026 on 31 March 2026 — the country's first formal standard for digital twins in the built environment, building on the international digital-twin vocabulary of ISO/IEC 30173. Its own definition is precise: a digital twin is a virtual system that contains all relevant information about the target entity and is not a static representation, due to its synchronisation with its target entity over time. A living mirror, not a 3D model that sits still.

 

It splits twins into the Asset Digital Twin — the building itself — and the Process Digital Twin — the workflows that keep it alive. And it separates contextual data, the slow near-static facts, from dynamic data, the fast near-real-time stream. To a world model wired into a digital twin, the soot on the terrace and the algae on the gable are not blemishes. They are data — signals of decay on a timeline.
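The contextual/dynamic split can be sketched as a small data structure. This is an illustration only, covering the asset side and not the process twin: the class name, field names and readings are all invented here and are not taken from BS EN 18162:2026.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AssetTwin:
    """Hypothetical record for one building; every field name is illustrative."""
    asset_id: str
    contextual: dict = field(default_factory=dict)  # slow, near-static facts
    dynamic: list = field(default_factory=list)     # fast, timestamped readings

    def record(self, quantity, value, when=None):
        """Append one reading to the dynamic stream."""
        when = when or datetime.now(timezone.utc)
        self.dynamic.append({"quantity": quantity, "value": value, "at": when})

    def latest(self, quantity):
        """Most recent reading for a quantity, or None if never recorded."""
        readings = [r for r in self.dynamic if r["quantity"] == quantity]
        if not readings:
            return None
        return max(readings, key=lambda r: r["at"])["value"]

# Made-up example values: the terrace's fixed facts, then two moisture readings.
terrace = AssetTwin("terrace-01", contextual={"material": "brick", "gable_faces": "north"})
terrace.record("wall_moisture_pct", 14.0, datetime(2026, 6, 1, tzinfo=timezone.utc))
terrace.record("wall_moisture_pct", 17.5, datetime(2026, 6, 8, tzinfo=timezone.utc))
print(terrace.latest("wall_moisture_pct"))  # 17.5, the later of the two readings
```

Keeping the two kinds of data apart is what lets a reading become a point on a timeline: the brick and the north-facing gable stay put while the moisture figure moves.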

 

The Mindset: what this asks of you

Every genuine shift in technology is really a shift in how you have to think. The lesson here is this: the world is being rewritten into structured, machine-readable states — and what isn't structured becomes invisible.

 

For twenty-five years I worked in the physical built environment before moving into its digital twin, and the pattern is identical in both. A business that lives only in its owner's head, scattered across loose webpages and gut feel, is invisible to an AI agent that reasons in verifiable states. And the detail most people skim past in that standard is the key: its whole purpose is to make a building's data FAIR (findable, accessible, interoperable and reusable) by structuring it as a semantic ontology built on open web standards like RDF and OWL. The official British way to digitise a building is to turn it into a structured ontology a machine can read. That is precisely the move your business needs, and precisely what I do.
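As a rough picture of what "structured as RDF" can look like for a business rather than a building, here is a small sketch. It is an assumption-laden illustration, not the ontology the standard prescribes: the business, its identifier on example.org and its details are invented, the vocabulary is borrowed from schema.org, and the output is plain RDF in N-Triples form with no OWL modelling.

```python
SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BUSINESS = "https://example.org/business/acme-glazing"  # hypothetical identifier

# Each fact is a (subject, predicate, object) triple.
# Objects starting with "http" are treated as links; anything else as plain text.
triples = [
    (BUSINESS, RDF_TYPE, SCHEMA + "LocalBusiness"),
    (BUSINESS, SCHEMA + "name", "Acme Glazing"),
    (BUSINESS, SCHEMA + "areaServed", "Birmingham"),
    (BUSINESS, SCHEMA + "openingHours", "Mo-Fr 08:00-17:00"),
]

def to_ntriples(facts):
    """Serialise the facts as N-Triples, one statement per line."""
    lines = []
    for s, p, o in facts:
        if o.startswith("http"):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

output = to_ntriples(triples)
print(output)
```

Each line is one checkable statement about the business. A sentence on a webpage has to be interpreted; a triple like these can simply be looked up.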

 

Try this, this week

Walk into your workplace tomorrow and find three physical things a digital twin would already be tracking: the glazing, the door footfall, the heating response. Then ask the harder question: if a building can have a structured, machine-readable twin, what would one look like for your business, and which parts of it currently exist only in your head?

 

Common questions

What is a world model in simple terms?

It is an AI that builds an internal, three-dimensional sense of how the world behaves instead of only predicting the next word. It can imagine what happens next before acting.

 

Is this different from ChatGPT?

Yes. Chatbots are language models: superb with words, but with no built-in grip on physical reality. World models like World Labs' Marble are built to represent space and geometry directly, rather than through descriptions of them. The likely future is the two working together.

 

Why should a small UK business care now?

Because the customer is becoming an AI agent that reasons in structured facts. If your business data isn't organised and machine-readable, the agent can't evaluate or recommend you — no matter how good you actually are.

 

This article applies The Architect's Ontological Pivot method — moving from a mundane physical scene (a wet British street) to the underlying machine principle (world models and digital twins), then to a transferable business mindset (structure your business as a machine-readable ontology before an AI agent needs to read it). Every factual claim was checked against a primary or reputable source on 7 June 2026 under a strict anti-hallucination gate: no figure, date, citation or institution is stated unless verified, and any unconfirmed point is flagged rather than filled.

 

Leading figures in this field:

 

 

Organisations referenced:

 

  • World Labs — spatial-intelligence company (founders Fei-Fei Li, Justin Johnson, Christoph Lassner, Ben Mildenhall); maker of the Marble world model.
  • Meta AI — originator of the JEPA / I-JEPA joint-embedding predictive architecture.
  • British Standards Institution (BSI) — publisher of BS EN 18162:2026.

 

Verified facts (information gain):

 

  • World Labs released Marble, its first commercial world model, in November 2025 — TechCrunch, 12 Nov 2025.
  • BS EN 18162:2026 — the UK's first digital-twin standard for the built environment — was published by BSI on 31 March 2026 (CEN/TC 442; building on ISO/IEC 30173) — Industrialised Construction, 8 Apr 2026.
  • The standard prescribes structuring building data as FAIR (findable, accessible, interoperable, reusable) via a semantic ontology on open W3C standards (RDF, OWL).

 
