Наталя Хандусенко, 6 August 2025, 11:16

OpenAI launches two open AI reasoning models: how they performed in tests

On Tuesday, OpenAI announced two open AI reasoning models in two sizes: the larger, more capable gpt-oss-120b, which can run on a single Nvidia GPU, and the lighter gpt-oss-20b, which can run on a consumer laptop with 16GB of memory.

How did the models perform?

On Codeforces, a competitive programming benchmark, gpt-oss-120b and gpt-oss-20b scored 2622 and 2516, respectively, outperforming DeepSeek's R1 but falling behind o3 and o4-mini.

On Humanity's Last Exam (HLE), a difficult test made up of crowdsourced questions across a range of subjects, gpt-oss-120b and gpt-oss-20b scored 19% and 17.3%, respectively. Both results are lower than o3's but higher than those of the leading open models from DeepSeek and Qwen.

Notably, OpenAI's open models hallucinate significantly more often than the company's latest reasoning models, o3 and o4-mini.

OpenAI found that gpt-oss-120b and gpt-oss-20b hallucinated in response to 49% and 53% of questions, respectively, on PersonQA, the company’s in-house benchmark for measuring how accurately a model knows facts about people. That is more than three times the hallucination rate of OpenAI’s o1 model, which scored 16%, and higher than that of o4-mini, which scored 36%.
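The "more than three times" comparison can be checked with simple arithmetic. The sketch below uses only the four PersonQA percentages quoted above; the dictionary layout and the `ratio` helper are illustrative names, not part of any OpenAI tooling.

```python
# PersonQA hallucination rates as quoted in the article (percent of questions).
rates = {
    "gpt-oss-120b": 49.0,
    "gpt-oss-20b": 53.0,
    "o1": 16.0,
    "o4-mini": 36.0,
}


def ratio(model: str, baseline: str) -> float:
    """Return how many times higher `model`'s rate is than `baseline`'s."""
    return rates[model] / rates[baseline]


# Compare each open model against each of the two reference models.
comparisons = {
    (model, baseline): round(ratio(model, baseline), 2)
    for model in ("gpt-oss-120b", "gpt-oss-20b")
    for baseline in ("o1", "o4-mini")
}

for (model, baseline), value in comparisons.items():
    print(f"{model} vs {baseline}: {value}x")
```

Both open models come out at a little over three times o1's rate and roughly 1.4 to 1.5 times o4-mini's, which matches the wording in the text.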

Training new models

OpenAI says its open models were trained using processes similar to those behind its proprietary models. Each open model uses a Mixture-of-Experts (MoE) architecture, which activates only a subset of its parameters for any given input and so runs more efficiently. For example, gpt-oss-120b has 117 billion parameters in total, but OpenAI says the model activates only 5.1 billion of them per token.
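To make the MoE idea concrete, here is a minimal sketch of top-k expert routing in plain Python. Only the 117 billion and 5.1 billion figures come from the article; the expert count, the value of `TOP_K`, the toy scalar "experts", and the random router scores are all hypothetical and do not describe gpt-oss's actual configuration.

```python
import math
import random

# Figures quoted in the article (billions of parameters).
TOTAL_PARAMS_B = 117.0
ACTIVE_PARAMS_B = 5.1
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B  # about 4.4% of the weights per token


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy setup (hypothetical sizes): 8 scalar "experts", 2 active per token.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

chosen = route_top_k(router_logits, TOP_K)
y = moe_forward(1.0, experts, router_logits, TOP_K)

print(f"active fraction of parameters: {active_fraction:.1%}")
print(f"experts used for this token: {[i for i, _ in chosen]} of {NUM_EXPERTS}")
print(f"mixed output: {y:.3f}")
```

The point of the design is that the six unselected experts are never evaluated for this token, so compute per token scales with the active parameters rather than the total.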

The company also says its open models were trained with high-compute reinforcement learning (RL), a post-training process that uses large clusters of Nvidia GPUs to teach AI models to distinguish right from wrong in simulated environments. The same method was used to train OpenAI’s o-series models, and the open models have a similar “chain-of-thought” process, in which they spend additional time and computing resources working through their answers.

OpenAI says the open models are well suited to powering AI agents and can call tools such as web search or Python code execution as part of their chain of thought. However, the company notes that the models are text-only, meaning they cannot process or generate images and audio as its other models can.

OpenAI is releasing gpt-oss-120b and gpt-oss-20b under the Apache 2.0 license, which is generally considered one of the most permissive. The license allows businesses to monetize OpenAI’s open models without paying the company or asking its permission.

However, unlike fully open releases from AI labs such as AI2, OpenAI says it will not publish the training data used to build its open models.
