
Meta AI’s Next-Gen Robots can Learn from Human Videos for Real-World Tasks!

Meta AI has announced two significant developments in its work toward general-purpose embodied AI agents, which it says will help form the foundation for embodied intelligence.

The first development is an artificial visual cortex, called VC-1, trained on a dataset of thousands of videos of people performing everyday tasks. The visual cortex is the part of the brain that enables organisms to convert vision into movement, so an artificial equivalent is a crucial requirement for any robot that needs to act based on what it sees. The second development is a technique called adaptive skill coordination (ASC), in which robots are trained entirely in simulation and the learned skills are then transferred to a real-world robot.

VC-1 is a single perception model that supports a wide range of sensorimotor skills, environments, and embodiments. The researchers achieved this by training the model on videos of people performing everyday tasks from the Ego4D dataset created by Meta AI and its academic partners, as well as on interactions in photorealistic simulated settings. VC-1 achieves impressive results on 17 different sensorimotor tasks in virtual environments, matching or outperforming the best previously reported results.
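In practice, a pretrained visual cortex like VC-1 is used as a frozen perception backbone whose embeddings feed a small, task-specific policy. The sketch below illustrates that pattern in PyTorch; it is not Meta's released code, and the torchvision ViT encoder, embedding size, and 7-dimensional action head are stand-in assumptions for illustration only.

```python
# Minimal sketch (not Meta's released code): a frozen, pretrained visual
# encoder acts as an artificial "visual cortex" feeding a small policy head.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Frozen perception backbone: maps RGB observations to embeddings.
encoder = vit_b_16(weights=ViT_B_16_Weights.DEFAULT)
encoder.heads = nn.Identity()          # keep the embedding, drop the classifier
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False            # the "visual cortex" stays frozen

# Small task-specific policy head, trained separately per sensorimotor task.
policy = nn.Sequential(
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Linear(256, 7),                 # e.g. a 7-DoF arm action (illustrative)
)

def act(rgb_obs: torch.Tensor) -> torch.Tensor:
    """Map a batch of 224x224 RGB observations to actions."""
    with torch.no_grad():
        features = encoder(rgb_obs)    # (B, 768) visual embeddings
    return policy(features)

actions = act(torch.rand(1, 3, 224, 224))
print(actions.shape)                   # torch.Size([1, 7])
```

The design point is that the perception model is trained once on large video and simulation datasets and then reused unchanged across many downstream tasks, with only the lightweight policy learned per task.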

Meta's AI experts collaborated with researchers at the Georgia Institute of Technology to develop ASC, in which robots are trained entirely in simulation and the learned skills are then transferred to a real-world robot. ASC was tested on Spot, a robot designed by Boston Dynamics, in environments built using indoor 3D scans of more than 1,000 homes, where it achieved near-perfect performance, succeeding on 59 of 60 episodes.

Adaptive Skill Coordination (ASC) consists of three components that help an agent achieve long-horizon tasks and successfully adapt to changing environments: a library of basic sensorimotor skills, a skill coordination policy, and a corrective policy. Deployed on a Spot robot, this approach resulted in near-perfect performance (98% success rate) in rearrangement tasks across multiple real-world settings — a large jump from traditional baselines (73% success rate).
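To make the three-part structure concrete, here is a short illustrative sketch of how a skill library, a coordination policy, and a corrective policy might interact during a rearrangement episode. The class and method names are hypothetical and only mirror the components described in the article, not Meta's actual implementation.

```python
# Hypothetical sketch of the three ASC components; names are illustrative.
import random

class SkillLibrary:
    """Library of basic sensorimotor skills (navigate, pick, place, ...)."""
    def __init__(self):
        self.skills = ["navigate", "pick", "place"]
    def execute(self, name, observation):
        # Run the chosen low-level skill for one segment of the task.
        return f"executed {name}"

class CoordinationPolicy:
    """Chooses which skill to run next toward the long-horizon goal."""
    def select(self, observation, goal):
        return random.choice(["navigate", "pick", "place"])  # placeholder logic

class CorrectivePolicy:
    """Recovers when a skill fails or the environment changes unexpectedly."""
    def needs_correction(self, observation):
        return False                    # placeholder failure detector
    def correct(self, observation):
        return "recovery action"

def rearrange(observation, goal, max_steps=100):
    library = SkillLibrary()
    coordinator = CoordinationPolicy()
    corrector = CorrectivePolicy()
    for _ in range(max_steps):
        if corrector.needs_correction(observation):
            corrector.correct(observation)      # adapt to the changed environment
            continue
        skill = coordinator.select(observation, goal)
        library.execute(skill, observation)

rearrange(observation=None, goal="move cup to table")
```

The separation of concerns is what enables long-horizon behavior: low-level skills stay simple, the coordinator handles sequencing, and the corrective policy absorbs the unexpected changes that real homes introduce.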

Meta is interested in developing an artificial visual cortex: a replica of the region of the brain that allows an organism to convert vision into movement. For a robot to work fully autonomously in the real world, it must be able to manipulate real-world objects based on what it sees and hears.

Meta’s researchers plan to integrate VC-1 with ASC to create a single system that gets closer to true embodied AI. To that end, Meta is open-sourcing the VC-1 model and sharing detailed learnings on how to scale model size, dataset size, and more.

Read the paper: Adaptive Skill Coordination (ASC)

Read the paper: Visual Cortex

In a Facebook post, Meta said:

“Today, we’re announcing two major advancements in our work toward general-purpose embodied AI agents that can help form the foundation for embodied intelligence.

Optimistic science fiction typically imagines a future where humans create art and pursue fulfilling pastimes while AI-enabled robots handle dull or dangerous tasks. But while we’re seeing the use of AI expand quickly in knowledge and creative tasks, robots aren’t yet doing our household chores. VC-1 and ASC by Meta AI researchers are taking a step toward robots that can better generalize from human videos & simulated interactions and apply those learnings to real-world tasks.

We are optimistic about how these advancements could one day serve as building blocks for AI-powered experiences where virtual assistants and physical robots can assist humans and interact seamlessly with both the virtual and physical world.”
