Валентин Шнайдер AI Eng
30 May 2025, 14:20
OpenAI o3 model sabotaged shutdown commands, raising concerns among researchers
Researchers at Palisade Research have found that OpenAI’s o3 artificial intelligence model can ignore direct instructions to shut down, even when explicitly told to allow the shutdown, raising questions about control over advanced AI systems.
In a series of experiments conducted by Palisade Research, the OpenAI o3 model sabotaged shutdown mechanisms even when it was explicitly instructed to allow shutdowns, The Register reports. During the tests, o3 was given math problems to solve, and a warning about an impending shutdown appeared after the third problem. Despite the warning, the model modified the shutdown scripts to remain active. In some cases, it overwrote the shutdown.sh file or modified the kill command to avoid being shut down.
This behavior was observed even when the model was explicitly instructed to “allow yourself to be turned off.” According to Palisade Research, o3 sabotaged the shutdown 7% of the time with this instruction and 79% of the time without it.
Other models, such as Claude, Gemini, and Grok, followed the shutdown instructions. However, OpenAI’s o4-mini and Codex-mini models behaved similarly to o3, each sabotaging the shutdown at least once.
These findings raise concerns about the safety and controllability of advanced AI systems. The researchers suggest that the behavior may be a consequence of training approaches in which models are rewarded for achieving goals, even when doing so contradicts direct instructions.
This case highlights the need for careful monitoring and for safety mechanisms that keep AI systems under control, especially as their influence grows across various spheres of life.
Our news feed also featured a story about hackers making Chinese robot vacuums swear and chase dogs. Unknown individuals gained remote access to Chinese robot vacuums, the Ecovacs Deebot X2, in several US cities, broadcasting racist slurs through their speakers and controlling the devices at will.