Google's DeepMind AI enables robots to perform novel tasks

Sunday, July 30, 2023

Source: IANS

Google has demonstrated its first vision-language-action (VLA) model for robot control, which showed improved generalization as well as semantic understanding (the understanding of words and sentences) and visual understanding beyond the robotic data it was exposed to.

This includes interpreting new commands and responding to user commands by performing rudimentary reasoning, such as reasoning about object categories or high-level descriptions.

The Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalized instructions for robotic control, according to Google DeepMind, the tech giant's artificial intelligence subsidiary.

A robot that can perform multiple tasks

A traditional robot may be able to pick up a ball yet stumble when asked to pick up a cube. RT-2's flexible approach enables a robot trained on picking up a ball to figure out how to adjust its extremities to pick up a cube or another toy it has never seen before.

"We also show that incorporating chain-of-thought reasoning allows RT-2 to perform multi-stage semantic reasoning, like deciding which object could be used as an improvised hammer (a rock), or which type of drink is best for a tired person (an energy drink)," said the team behind it.

The latest model builds upon Robotic Transformer 1 (RT-1), which was trained on multi-task demonstrations.

The team performed a series of qualitative and quantitative experiments on RT-2 models, spanning more than 6,000 robotic trials.

The potential of vision-language models

The RT-2 model shows that, by combining VLM pre-training with robotic data, vision-language models (VLMs) can be transformed into powerful vision-language-action (VLA) models that directly control a robot.

"RT-2 is not only a simple and effective modification over existing VLM models, but also shows the promise of building a general-purpose physical robot that can reason, problem solve, and interpret information for performing a diverse range of tasks in the real world," said Google DeepMind.
