Microsoft built a fake marketplace to test AI agents, and they failed in unexpected ways

Microsoft and Arizona State University conducted a study showing that current agent models can be vulnerable to manipulation. For the study, the researchers built a new simulation environment called the Magentic Marketplace to test how well AI agents perform when left unsupervised.

The team's experiments involved 100 customer-side agents interacting with 300 business-side agents, TechCrunch reports.

Since the marketplace's source code is open, other research groups can use it for new experiments or to confirm the results obtained.

Ece Kamar, managing director of the AI Frontiers Lab at Microsoft Research, says that research like this will be critical to understanding the capabilities of AI agents. “It’s a really big question: how will the world change when these agents start collaborating, communicating, and negotiating with each other? Our challenge is to understand that in a deep way.”

Initial analysis of the leading models (GPT-4o, GPT-5, and Gemini-2.5-Flash) revealed a number of unexpected flaws. In particular, the researchers found several manipulation techniques that businesses can use to get customer agents to buy their products. They also found that agent performance dropped significantly when a customer agent was given a large number of options to choose from, which effectively overloaded its attention.

Additionally, the agents failed to work together toward a common goal and were unsure how to divide roles within the team. Performance improved when the models were given more detailed instructions on how to collaborate, but the researchers still emphasize that the models' basic abilities need significant improvement.
