Наталя Хандусенко
8 October 2025, 11:46
Google Introduces Gemini 2.5 Computer Use: AI Model Works in the Browser Like a Human, Clicking, Typing, and Scrolling
Google is announcing a new Gemini model that can navigate and interact with the web using a browser. This means AI agents will be able to work in interfaces designed for humans. The Gemini 2.5 Computer Use model uses “visual understanding and reasoning” to analyze and perform tasks, such as filling out and submitting forms.
The model is suited to testing user interfaces and to working with interfaces that offer no direct API access. Earlier versions of the model have already powered agent features in AI Mode and in Project Mariner, an experimental project in which AI agents carry out browser tasks on their own (for example, adding products to a cart based on a list of ingredients), The Verge reports.
Google has published several demo videos showing the Computer Use tool in action, and notes that the videos are shown at 3x speed.
Unlike ChatGPT Agent and Anthropic's computer use tool, Google's new AI model has access only to the browser, not to the entire computer environment.
Google notes that the model is “not yet optimized for desktop operating system (OS)-level control” and currently supports 13 actions, including opening a web browser, entering text, and dragging and dropping items.
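For readers curious how such an agent is typically wired together, here is a minimal Python sketch of the general pattern: the model looks at a screenshot, proposes one action, the client code executes it in the browser, and the loop repeats. This is an illustration only, not Google's API. `Action`, `FakeBrowser`, `scripted_model`, and `run_agent` are hypothetical names, the "model" is a fixed script, and only three of the actions named above are stubbed out.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical structure)."""
    name: str
    args: dict = field(default_factory=dict)


class FakeBrowser:
    """Stand-in for a real browser: it only records what it was asked to do."""

    def __init__(self):
        self.log = []

    def open_web_browser(self):
        self.log.append("open")

    def type_text(self, text):
        self.log.append(f"type:{text}")

    def drag_and_drop(self, src, dst):
        self.log.append(f"drag:{src}->{dst}")

    def screenshot(self):
        # A real client would return image bytes; a string is enough here.
        return f"screen after {len(self.log)} actions"


def scripted_model(script):
    """Pretend model: replays a fixed list of actions, then returns None."""
    steps = iter(script)

    def propose(_screenshot):
        return next(steps, None)

    return propose


def run_agent(propose, browser, max_steps=10):
    """Screenshot -> proposed action -> execute, until the model stops."""
    handlers = {
        "open_web_browser": browser.open_web_browser,
        "type_text": browser.type_text,
        "drag_and_drop": browser.drag_and_drop,
    }
    for _ in range(max_steps):
        action = propose(browser.screenshot())
        if action is None:  # the model signals that the task is finished
            break
        handler = handlers.get(action.name)
        if handler is None:
            # Reject anything outside the supported action set.
            raise ValueError(f"unsupported action: {action.name}")
        handler(**action.args)
    return browser.log


if __name__ == "__main__":
    script = [
        Action("open_web_browser"),
        Action("type_text", {"text": "hello"}),
        Action("drag_and_drop", {"src": "a", "dst": "b"}),
    ]
    print(run_agent(scripted_model(script), FakeBrowser()))
```

Keeping the set of allowed actions in a single lookup table mirrors the idea of a fixed action list: anything the model proposes outside that table is rejected rather than executed.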
Gemini 2.5 Computer Use is available to developers through Google AI Studio and Vertex AI. There’s also a demo on Browserbase where you can see how the AI performs tasks like “Play 2048” or “View current discussions on Hacker News.”