Google's AI-based Vision-Language-Action robot model is ready and can perform a variety of tasks


Technology is advancing rapidly, with new innovations appearing every day. Google has demonstrated its first Vision-Language-Action (VLA) model for robot control, which draws on robotic data together with semantic and visual understanding to act on many common commands.

What does this model do?

The model can interpret new commands and respond to users' requests by performing basic reasoning, such as reasoning about object categories or high-level descriptions.

What is Robotic Transformer 2?

Robotic Transformer 2 (RT-2) is a new vision-language-action (VLA) model that learns from both web and robotics data and translates this knowledge into generalized instructions for robotic control. RT-2's flexible approach lets the robot vary how it moves its arms, for example to pick up a cube or another toy.

It works by understanding what a person needs

According to one official, incorporating chain-of-thought reasoning allows RT-2 to perform multi-stage semantic reasoning, such as determining which object could be used as an improvised hammer or which type of drink would suit a tired person (an energy drink).

Successful after 6,000 robotic trials

The model builds on Robotic Transformer 1 (RT-1), which was trained on multi-task demonstrations. The team conducted a series of quantitative and qualitative experiments on RT-2 models across more than 6,000 robotic trials. As RT-2 demonstrates, vision-language models (VLMs) can be transformed into powerful vision-language-action models that directly control a robot by combining VLM pre-training with robotic data.
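To make the idea of a language model "directly controlling" a robot more concrete, one commonly described approach is to write each robot action as a short string of integers, so the model can produce actions the same way it produces words. The Python sketch below is only an illustration of that idea and is not Google's code: the function names, the value range of -1 to 1, and the choice of 256 bins per action dimension are all assumptions made for the example.

```python
# Illustrative only: a toy action tokenizer, not Google's actual RT-2 code.
NUM_BINS = 256  # assumed number of discrete bins per action dimension


def encode_action(values, low=-1.0, high=1.0, num_bins=NUM_BINS):
    """Map continuous action values in [low, high] to a string of integer tokens."""
    tokens = []
    for v in values:
        # Clip out-of-range values so they land in the first or last bin.
        clipped = min(max(v, low), high)
        # Scale to [0, num_bins - 1] and round to the nearest bin index.
        index = round((clipped - low) / (high - low) * (num_bins - 1))
        tokens.append(str(index))
    return " ".join(tokens)


def decode_action(text, low=-1.0, high=1.0, num_bins=NUM_BINS):
    """Turn a token string (as a model might output it) back into continuous values."""
    return [low + int(tok) / (num_bins - 1) * (high - low) for tok in text.split()]
```

In this sketch, a training example would pair a camera image and an instruction with the string from `encode_action`, and at run time the robot would pass the model's text output through `decode_action` to recover approximate motor commands.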

What is RT-2?

Google DeepMind's RT-2 is not just a simple and effective modification of existing VLM models; it also shows the promise of building a general-purpose physical robot that can reason, solve problems, and interpret information to perform a wide range of tasks in the real world.
