Meta AI has announced two significant developments intended to form the foundation for general-purpose embodied AI agents.
The first is an artificial visual cortex, called VC-1, trained on a dataset of thousands of videos of people performing everyday tasks. The visual cortex is the part of the brain that enables an organism to convert vision into movement; an artificial visual cortex is therefore a crucial requirement for any robot that needs to act based on what it sees. The second is a technique called "adaptive skill coordination" (ASC), in which robots are trained entirely in simulation and the learned skills are then transferred to a real-world robot.
VC-1 is a single perception model that supports a wide range of sensorimotor skills, environments, and embodiments. The researchers achieved this by training the model on videos of people performing everyday tasks from the Ego4D dataset, created by Meta AI and its academic partners, as well as on interactions in photorealistic simulated settings. VC-1 achieves impressive results on 17 different sensorimotor tasks in virtual environments, matching or outperforming the best-known previous results.
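To make the idea concrete, a pretrained visual cortex like VC-1 is typically used as a frozen perception backbone whose image embeddings feed a small task-specific policy head. The following is a minimal PyTorch sketch of that pattern; the encoder stand-in, embedding size, and action dimension are illustrative placeholders, not Meta's actual setup.

```python
import torch
import torch.nn as nn

class SensorimotorPolicy(nn.Module):
    """A small policy head on top of a frozen, pretrained visual encoder."""

    def __init__(self, encoder: nn.Module, embed_dim: int, action_dim: int):
        super().__init__()
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad = False  # the "visual cortex" stays frozen
        self.head = nn.Sequential(
            nn.Linear(embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            z = self.encoder(frames)  # image batch -> embedding batch
        return self.head(z)           # embedding -> task-specific action

# Stand-in encoder for illustration; in practice this would be the
# pretrained VC-1 backbone.
dummy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 768))
policy = SensorimotorPolicy(dummy_encoder, embed_dim=768, action_dim=7)
actions = policy(torch.randn(4, 3, 224, 224))  # -> shape (4, 7)
```

Only the lightweight head is trained for each downstream task, which is what lets a single pretrained perception model serve many different skills and embodiments.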
Meta's AI experts collaborated with researchers at the Georgia Institute of Technology to develop ASC. The technique was tested on Spot, a robot designed by Boston Dynamics Inc., in environments built using indoor 3D scans of more than 1,000 homes, where it achieved near-perfect performance, succeeding on 59 of 60 episodes.
ASC consists of three components that help an agent achieve long-horizon tasks and adapt to changing environments: a library of basic sensorimotor skills, a skill coordination policy that decides which skill to use at each moment, and a corrective policy that adapts skills when the robot encounters situations outside their training distribution. Deployed on a Spot robot, this approach achieved near-perfect performance (a 98% success rate) on rearrangement tasks across multiple real-world settings, a large jump over traditional baselines (73%).
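These three components can be pictured as a simple control loop: the coordination policy picks a skill from the library at each step, and the corrective policy intervenes when the chosen skill encounters a situation it cannot handle (for example, a failed grasp). The Python sketch below illustrates that structure only; every name in it is hypothetical and not taken from Meta's implementation.

```python
from typing import Callable, Dict

Observation = dict  # e.g., camera frames plus proprioception
Action = dict       # e.g., base and arm commands

class ASCAgent:
    """Illustrative skeleton of the three ASC components (all names hypothetical)."""

    def __init__(
        self,
        skills: Dict[str, Callable[[Observation], Action]],  # 1) skill library
        coordinator: Callable[[Observation], str],            # 2) coordination policy
        corrector,                                            # 3) corrective policy
    ):
        self.skills = skills
        self.coordinator = coordinator
        self.corrector = corrector

    def act(self, obs: Observation) -> Action:
        skill_name = self.coordinator(obs)     # choose a skill for this step
        action = self.skills[skill_name](obs)  # run the chosen skill
        # Hand control to the corrective policy when the skill is out of
        # distribution (e.g., a blocked path or a dropped object).
        if self.corrector.needs_correction(obs, action):
            action = self.corrector.recover(obs)
        return action
```

Splitting control this way means the individual skills never need to be retrained for every possible disturbance; the corrective policy handles recovery and then returns control to the normal skill pipeline.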
Meta's broader goal is to artificially replicate the visual cortex, the region of the brain that allows an organism to convert vision into movement. For a robot to operate fully autonomously in the real world, it must be able to manipulate real-world objects based on what it sees and hears.
Meta's researchers plan to integrate VC-1 with ASC to create a single system that moves closer to true embodied AI. To that end, Meta is open-sourcing the VC-1 model and sharing detailed findings on how to scale model size, dataset size, and more.
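For reference, the released checkpoints are distributed through Meta's eai-vc repository on GitHub. The loading pattern below follows the interface documented there around the release; treat the module path and constant names as assumptions to verify against the current repository.

```python
# Loading the open-sourced VC-1 model, following the pattern documented in
# facebookresearch/eai-vc at release time (names may have changed since).
from vc_models.models.vit import model_utils

model, embed_dim, transforms, model_info = model_utils.load_model(
    model_utils.VC1_BASE_NAME
)

# `transforms` preprocesses an image into the model's input format, and
# `model` maps it to an embedding of size `embed_dim`, which a downstream
# policy head (like the sketch above) can consume.
```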
Read the paper: Adaptive Skill Coordination (ASC)
Read the paper: Visual Cortex