Does AI need a “body” to become truly intelligent? Meta researchers think so.

We may be on the brink of finally seeing human-level intelligence in an AI — thanks to robots.

March 30, 2024

This article is an installment of Future Explored, a weekly guide to world-changing technology.

AIs that can generate videos, quickly translate languages, or write new computer code could be world changing, but can they ever be truly intelligent?

Not according to the embodiment hypothesis, which argues that human-level intelligence can only emerge if an intelligence is able to sense and navigate a physical environment, the same way babies can.

According to this theory, the only way to get an AI to develop true intelligence is to give it a body and the ability to move around and experience the world. Digital-only AIs, in comparison, may be great for narrow tasks, but they’ll always hit an intelligence ceiling.

“AI systems that lack a physical embodiment can never be truly intelligent,” Akshara Rai, a research scientist at Meta, told Freethink. “To fully comprehend the world, it is essential to interact with it and observe the outcomes of those interactions.”

Not all AI developers buy into the embodiment hypothesis — it may end up being possible to create a digital-only superintelligence that never feels the Earth beneath its robotic feet.

Many of those who do, though, are focused on figuring out the safest, most efficient way to let AIs explore the physical world — but simply dropping untrained AI “brains” into robot “bodies” is not it.

The (not so) real world

Babies make a lot of mistakes when they’re first learning to do something, and AI is likely to experience plenty of errors during training, too. If it’s controlling a machine when it makes those errors, it could destroy the hardware, damage the world around it, and maybe even hurt people.

An AI might also need to attempt something many times before figuring out how to do it reliably. Multiply that by all the slightly different tasks we might want an AI robot to be able to do, and the training period could become interminable.

Computer simulations that mimic the environments an embodied AI is likely to encounter in the real world are a way around both problems.

An in-development AI can be given control of a virtual body and then allowed to train in the computer program. This gives the AI a fast, low-risk way to learn what will likely happen when it’s in control of a real robot.

Because simulations don’t have to move at the speed of the real world, an AI can learn far more quickly, too. When MIT was training an AI-powered cheetah robot, for example, simulations allowed the AI to experience 100 days of running in just three hours, roughly 800 times faster than real time.

In 2019, Meta (then Facebook) unveiled AI Habitat, an open-source simulation platform for training AIs to navigate homes, offices, and other spaces.

AI Habitat trains AIs to open doors, retrieve objects, and perform many other tasks in a variety of environments. The hope is that they will then be able to do those things right out of the box in places they’ve never seen before, which will be key to the deployment of robots in homes and workplaces.

In October 2023, Meta updated the platform — and this version brings human avatars into the simulated world.

“The first iteration in 2019 trained virtual robots how to navigate 3D scans of real world homes at rapid speeds, and Habitat 2.0 introduced interactive environments so the virtual robots could pick up objects or open drawers,” Roozbeh Mottaghi, an AI research scientist manager at Meta, told Freethink.

“Habitat 3.0 enables virtual collaboration between robots and people, where robots adapt to non-stationary environments and account for the actions, movements, and intents of humans,” he continued.

Limitations

The new AI Habitat is key to the development of mainstream embodied AI, because robots need to understand how to interact with us before they can be successfully integrated into our lives. Just how much an AI can learn about coexisting with people from simulations, though, is debatable.

“Different humans have different perspectives, different goals, different capabilities … As humans, we cannot even write down the rules of the social norms in such a complex world,” Boyuan Chen, head of the General Robotics Lab at Duke University, told Freethink.

Humans aren’t the only ultra-complex variable in simulations, either.

The physics of the real world are also incredibly hard to simulate, which can lead to something robot developers call the “sim-to-real gap” — a phenomenon where an AI performs better in simulations than it does when given control of a physical body.

“Even simple things, like a bouncing ball … cannot be modeled very well in a simulation, not to mention very complex fluid dynamics, very complex combustion,” said Chen.

In the field

Continually improving our simulations can help close the sim-to-real gap, but eventually, there comes a point where the only option is to give an AI its body and a safe space to fail.

Embodied AI is now at the point where some companies are ready to take the next step and actually send their AI-powered bots into the real world, like parents dropping their kids off at school for the first time.

Robotics startup Agility Robotics has deployed its Digit robots at an Amazon R&D facility, while humanoid developer Apptronik is sending its Apollo robots to work at a Mercedes-Benz factory to validate that they can operate safely and effectively next to people.

Meanwhile, OpenAI-backed robotics startup Figure is deploying its AI humanoids at a BMW manufacturing plant.

Figure recently demoed an upgrade to its robot’s brain: a vision-language model (VLM) trained by OpenAI that lets the robot understand speech, execute verbal commands, and even explain why it is doing what it is doing.

The robot might still be far less intelligent than a human, but OpenAI and Figure had only been working together for 13 days when they dropped the video, and it was a huge advance from the last Figure demo, released in February, in which the bot merely moved a tote from one place to another.

If the embodiment hypothesis is true, this pairing of a company developing some of the most advanced AI brains with one building state-of-the-art robot bodies could be the combination that leads to real human-like intelligence in an artificial being — and soon.
