Валентин Шнайдер AI Eng 26 September 2025, 14:33

Google Introduces Gemini Robotics 1.5 and Robotics-ER 1.5: AI Teaches Robots to Think, Plan, and Search for Information

The tech giant presented Gemini Robotics-ER 1.5 and Robotics 1.5, which work as a «brain and hands»: ER 1.5 builds a step-by-step plan, calls tools (including search) and transmits steps, while the VLA model perceives video/images and translates instructions into motor commands.

Google detailed the updates in a blog post. The release comprises two complementary robotics models: Gemini Robotics 1.5, a vision-language-action (VLA) model, and Gemini Robotics-ER 1.5, a vision-language model (VLM) for embodied reasoning. The former translates visual data and instructions into motor commands for the robot, «thinking before acting» and exposing its reasoning process.

The latter acts as a «high-level brain»: it plans missions, reasons about the physical environment, natively calls digital tools (including Google Search), and breaks complex tasks down into step-by-step instructions.

What the new models can do. Robotics-ER 1.5 achieves state-of-the-art (SOTA) results on 15 academic benchmarks of spatial understanding and embodied reasoning. In practice, this means that the agent first looks up the rules (for example, local waste-sorting regulations), matches them against what the camera «sees», and only then produces a sequence of steps to perform. Robotics 1.5 handles each step, turning an instruction such as «put the paper cup in the recycling bin» into a specific manipulator trajectory, and it can explain why it chose these actions («conscious» planning that splits long missions into short subtasks).
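The division of labor described above can be illustrated with a toy sketch. It is not Google's implementation and does not call the Gemini API: the rule table, `lookup_sorting_rules`, `plan`, and `execute` are hypothetical names made up for illustration, the «search» is a stub, and the «motor commands» are plain strings. It only shows the order of operations: look up rules, match them to detected objects, emit steps, then turn each step into low-level actions.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical rule table standing in for the result of a search tool call.
LOCAL_RULES = {
    "paper cup": "recycling",
    "banana peel": "compost",
    "chip bag": "trash",
}


def lookup_sorting_rules(location: str) -> dict[str, str]:
    """Stubbed tool call: a real planner would search for the rules of `location`."""
    return dict(LOCAL_RULES)


@dataclass
class Step:
    instruction: str  # natural-language step handed to the action model
    item: str
    bin: str


def plan(detected_items: list[str], location: str) -> list[Step]:
    """Planner role: match looked-up rules to what the camera detected."""
    rules = lookup_sorting_rules(location)
    steps = []
    for item in detected_items:
        # Assumption for this sketch: items without a rule go to general trash.
        bin_name = rules.get(item, "trash")
        steps.append(Step(f"put the {item} in the {bin_name} bin", item, bin_name))
    return steps


def execute(step: Step) -> list[str]:
    """Action-model role: expand one step into symbolic motor commands."""
    return [
        f"reach({step.item})",
        f"grasp({step.item})",
        f"move_to({step.bin})",
        "release()",
    ]


def run(detected_items: list[str], location: str) -> list[tuple[str, list[str]]]:
    """Plan first, then execute the steps one by one, keeping a readable log."""
    return [(step.instruction, execute(step)) for step in plan(detected_items, location)]
```

Calling `run(["paper cup", "banana peel"], "some city")` returns one log entry per item, each pairing the planner's instruction with the command list produced for it.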

Cross-platform learning. An important breakthrough is the transfer of skills between different «bodies»: movements learned on ALOHA 2 also work on the Apptronik Apollo humanoid and on the two-armed Franka robot without specially «fitting» the model to each robot. This speeds up the onboarding of new hardware platforms and shortens the time to useful applications.

The team emphasizes responsible deployment at every level: from high-level semantic «think-safe-before-act» reasoning and alignment with Gemini safety policies down to low-level collision-avoidance subsystems on board the robot. The ASIMOV benchmark for semantic safety evaluation has been updated: its coverage of «rare» cases, its question types, and its video modalities have been expanded. Robotics-ER 1.5 shows SOTA results on it thanks to improved «thinking».
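The layered approach can be pictured as two independent gates that must both pass before a step runs. The sketch below is an illustrative assumption, not the mechanism Google describes: the phrase list, the clearance threshold, and all function names are hypothetical, and real semantic safety checks rely on model reasoning rather than substring matching.

```python
import math

# Illustrative only: a real semantic check would use model reasoning, not a phrase list.
BLOCKED_PHRASES = ("hot stove", "knife toward")


def semantic_check(instruction: str) -> bool:
    """High-level gate: reject instructions that match a known-unsafe phrase."""
    text = instruction.lower()
    return not any(phrase in text for phrase in BLOCKED_PHRASES)


def collision_free(waypoints, obstacles, min_clearance=0.05):
    """Low-level gate: every waypoint stays at least `min_clearance` metres
    away from every obstacle centre (points are (x, y, z) tuples)."""
    return all(
        math.dist(point, obstacle) >= min_clearance
        for point in waypoints
        for obstacle in obstacles
    )


def safe_to_execute(instruction, waypoints, obstacles):
    """A step runs only if both the semantic and the geometric gate pass."""
    return semantic_check(instruction) and collision_free(waypoints, obstacles)
```

Keeping the gates separate means a trajectory that is geometrically fine can still be refused on semantic grounds, and the reverse.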

Gemini Robotics-ER 1.5 is available to developers today via the Gemini API in Google AI Studio. Gemini Robotics 1.5 is currently available only to select partners; the company promises to expand the program.

The Robotics line builds on the base multimodal Gemini models that Google has been applying to «physical» scenarios since the beginning of the year. In earlier stages, the company showed how agents understand instructions, video content, and spatial relationships; version 1.5 adds transparent step-by-step reasoning, mission planning, tool calling, and skill portability between different robots, all of which were missing from systems that simply «respond to a command». For industry, this opens the way to general-purpose robot assistants: from logistics and manufacturing to service, everyday life, and R&D laboratories.

Previously, dev.ua wrote about how Google officially brought its Gemini conversational AI to TVs with Google TV: TCL's QM9K series models received support first, and support will be expanded to other models throughout the year.

Text: Валентин Шнайдер Photo: Google Blog Source: Google Blog Tags: ai, artificial intelligence, gemini, google, robotics, robots
