Наталя Хандусенко AI Eng 26 March 2025, 12:28

Google has released a new AI model, Gemini 2.5 Pro. The company claims that it is the "smartest" and outperforms its competitors in tests

Google has unveiled Gemini 2.5, a new family of reasoning AI models that pause to “think” before answering. The company says the first version, Gemini 2.5 Pro Experimental, outperforms models from OpenAI, Anthropic, xAI, and DeepSeek on general AI benchmarks that measure comprehension, math, coding, and other abilities.


In a post on X, Google DeepMind CEO Demis Hassabis called Gemini 2.5 Pro “an awesome state-of-the-art model, no.1 on LMArena by a whopping +39 ELO points, with significant improvements across the board in multimodal reasoning, coding & STEM.”

Gemini 2.5 Pro is an awesome state-of-the-art model, no.1 on LMArena by a whopping +39 ELO points, with significant improvements across the board in multimodal reasoning, coding & STEM. You can try it out now in AI Studio https://t.co/lLpF8ToTVJ & @GeminiApp with Gemini Advanced https://t.co/bgjabz8O1u
— Demis Hassabis (@demishassabis) March 25, 2025
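To put the "+39 ELO points" figure in perspective, here is a minimal sketch that converts a rating gap into an expected head-to-head win rate. It assumes the standard Elo logistic formula with a 400-point scale; the article does not describe LMArena's exact rating methodology, and the function name `expected_win_probability` is purely illustrative.

```python
def expected_win_probability(rating_gap: float) -> float:
    """Expected win rate of the higher-rated model under the standard Elo formula.

    Assumption: classic Elo with a 400-point logistic scale. LMArena's actual
    scoring procedure is not described in the article and may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


if __name__ == "__main__":
    gap = 39  # the lead cited in the Hassabis post
    print(f"Expected win rate at +{gap} Elo: {expected_win_probability(gap):.1%}")
```

Under that assumption, a 39-point lead corresponds to the leading model being preferred in roughly 56% of direct comparisons, which is a clear edge but not a landslide.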

Most notably, Gemini 2.5 Pro Experimental outperformed OpenAI's o3-mini and Anthropic's Claude 3.7 Sonnet on Humanity's Last Exam (HLE), a recently created benchmark designed to combat saturation, the problem of industry tests becoming too easy for rapidly evolving AI models. HLE is therefore a comparatively difficult test: Gemini 2.5 scored 18.8%, compared with 14% for o3-mini (which was evaluated only on text-based tasks, without images) and 8.9% for Claude 3.7 Sonnet, ZDNET reports.

The new model, which has already topped the Chatbot Arena leaderboard, also outperformed its competitors in general science, math, and coding tests, though typically by smaller margins, which is to be expected given how quickly new models are improving. Google said that Gemini 2.5 Pro Experimental shows improvements in reasoning, multimodality, and agentic capabilities, even with “single-line prompting.”

A demo video shows how 2.5 Pro uses its reasoning capabilities to program a video game from a single prompt.

Gemini 2.5 Pro, which has a one-million-token context window, is available in Google AI Studio and, for Gemini Advanced users, in the Gemini app, and will also be “coming soon to Vertex AI.” The company added that it will release pricing information in the next few weeks.

As a reminder, Microsoft has added two AI-based deep research tools to Copilot: Researcher and Analyst.



Text: Наталя Хандусенко Tags: ai, artificial intelligence
