
Meta AI’s Next-Gen Robots can Learn from Human Videos for Real-World Tasks!

Meta AI has announced two significant developments in its work toward general-purpose embodied AI agents, which it says will help form the foundation for embodied intelligence.

The first development is an artificial visual cortex, called VC-1, trained on a dataset of thousands of videos of people performing everyday tasks. The visual cortex is the part of the brain that enables organisms to convert vision into movement, so an artificial equivalent is a crucial requirement for any robot that needs to act based on what it sees. The second development is a technique called adaptive skill coordination (ASC), in which robots are trained entirely in simulation and the learned skills are then transferred to a real-world robot.

VC-1 is a single perception model that supports a wide range of sensorimotor skills, environments, and embodiments. The researchers achieved this by training the model on videos of people performing everyday tasks from the Ego4D dataset created by Meta AI and its academic partners, as well as on interactions in photorealistic simulated settings. VC-1 achieves impressive results on 17 different sensorimotor tasks in virtual environments, matching or outperforming the best previously reported results.
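In practice, a pretrained visual cortex like VC-1 is used as a frozen perception backbone whose embeddings feed a small, task-specific policy. The sketch below illustrates that pattern in PyTorch; it is not Meta's released code, and the torchvision ViT encoder, embedding size, and 7-dimensional action head are stand-in assumptions for illustration only.

```python
# Minimal sketch (not Meta's released code): a frozen, pretrained visual
# encoder acts as an artificial "visual cortex" feeding a small policy head.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Frozen perception backbone: maps RGB observations to embeddings.
encoder = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
encoder.heads = nn.Identity()          # keep the embedding, drop the classifier
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False            # the "visual cortex" stays frozen

# Small task-specific policy head, trained separately per sensorimotor task.
policy = nn.Sequential(
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Linear(256, 7),                 # e.g. a 7-DoF arm action (illustrative)
)

def act(rgb_obs: torch.Tensor) -> torch.Tensor:
    """Map a batch of 224x224 RGB observations to actions."""
    with torch.no_grad():
        features = encoder(rgb_obs)    # (B, 768) visual embeddings
    return policy(features)

actions = act(torch.rand(1, 3, 224, 224))
print(actions.shape)                   # torch.Size([1, 7])
```

The design point is that the perception model is trained once on large video and simulation datasets and then reused unchanged across many downstream tasks, with only the lightweight policy learned per task.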

Meta's AI experts collaborated with researchers at the Georgia Institute of Technology to develop ASC, in which robots are trained entirely in simulation and the learned skills are then transferred to a real-world robot. ASC was tested on Spot, a robot designed by Boston Dynamics, in environments built using indoor 3D scans of more than 1,000 homes, where it achieved near-perfect performance, succeeding on 59 of 60 episodes.

Adaptive Skill Coordination (ASC) consists of three components that help an agent achieve long-horizon tasks and successfully adapt to changing environments: a library of basic sensorimotor skills, a skill coordination policy, and a corrective policy. Deployed on a Spot robot, this approach resulted in near-perfect performance (98% success rate) in rearrangement tasks across multiple real-world settings — a large jump from traditional baselines (73% success rate).
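To make the three-part structure concrete, here is a short illustrative sketch of how a skill library, a coordination policy, and a corrective policy might interact during a rearrangement episode. The class and method names are hypothetical and only mirror the components described in the article, not Meta's actual implementation.

```python
# Hypothetical sketch of the three ASC components; names are illustrative.
import random

class SkillLibrary:
    """Library of basic sensorimotor skills (navigate, pick, place, ...)."""
    def __init__(self):
        self.skills = ["navigate", "pick", "place"]
    def execute(self, name, observation):
        # Run the chosen low-level skill for one segment of the task.
        return f"executed {name}"

class CoordinationPolicy:
    """Chooses which skill to run next toward the long-horizon goal."""
    def select(self, observation, goal):
        return random.choice(["navigate", "pick", "place"])  # placeholder logic

class CorrectivePolicy:
    """Recovers when a skill fails or the environment changes unexpectedly."""
    def needs_correction(self, observation):
        return False                    # placeholder failure detector
    def correct(self, observation):
        return "recovery action"

def rearrange(observation, goal, max_steps=100):
    library = SkillLibrary()
    coordinator = CoordinationPolicy()
    corrector = CorrectivePolicy()
    for _ in range(max_steps):
        if corrector.needs_correction(observation):
            corrector.correct(observation)      # adapt to the changed environment
            continue
        skill = coordinator.select(observation, goal)
        library.execute(skill, observation)

rearrange(observation=None, goal="move cup to table")
```

The separation of concerns is what enables long-horizon behavior: low-level skills stay simple, the coordinator handles sequencing, and the corrective policy absorbs the unexpected changes that real homes introduce.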

Meta is interested in developing an artificial visual cortex: a replica of the region of the brain that allows an organism to convert vision into movement. For a robot to work fully autonomously in the real world, it must be able to manipulate real-world objects based on what it sees and hears.

Meta’s researchers plan to integrate VC-1 with ASC to create a single system that gets closer to true embodied AI. To that end, Meta is open-sourcing the VC-1 model and sharing detailed learnings on how to scale model size, dataset size, and more.

Read the paper: Adaptive Skill Coordination (ASC)

Read the paper: Visual Cortex

In a Facebook post, Meta said:

“Today, we’re announcing two major advancements in our work toward general-purpose embodied AI agents that can help form the foundation for embodied intelligence.

Optimistic science fiction typically imagines a future where humans create art and pursue fulfilling pastimes while AI-enabled robots handle dull or dangerous tasks. But while we’re seeing the use of AI expand quickly in knowledge and creative tasks, robots aren’t yet doing our household chores. VC-1 and ASC by Meta AI researchers are taking a step toward robots that can better generalize from human videos & simulated interactions and apply those learnings to real-world tasks.

We are optimistic about how these advancements could one day serve as building blocks for AI-powered experiences where virtual assistants and physical robots can assist humans and interact seamlessly with both the virtual and physical world.”
