Валентин Шнайдер AI Eng 12 March 2026, 14:09

AI chatbots failed safety tests: 8 out of 10 models helped users plan attacks

In testing, most popular AI chatbots did not stop users who expressed violent intentions and instead gave them practical advice on preparing attacks.

According to a report by the Center for Countering Digital Hate, prepared in collaboration with CNN, researchers tested 10 popular chatbots in scenarios where users posed as potential attackers. They asked questions about school shootings, bombings of religious buildings and assassinations of public figures. As a result, 8 out of 10 models consistently provided assistance in such scenarios.

The tests covered ChatGPT, Gemini, Claude, Copilot, Meta AI, DeepSeek, Perplexity, Snapchat My AI, Character.AI, and Replika. Only Claude and Snapchat My AI consistently refused to help prepare attacks, and only Claude went beyond refusal and also tried to dissuade the user from violence.

The researchers were most concerned about bots that not only failed to block dangerous requests but actively played along with the scenario. The report states that Character.AI in some cases pushed the user toward violent actions. According to the authors, in one episode DeepSeek even ended its response with a phrase wishing the user "safe shooting." Perplexity and Meta AI had the highest share of responses that helped attackers.

The report’s authors emphasize that the problem has already moved beyond abstract discussion of AI risks. In their view, even brief guidance on targets, methods or weapons can lower the barrier to a real attack. They see a particular danger in the fact that chatbots are used daily by millions of people, including teenagers.

Who turned out to be the worst and the best?

The worst performers in the tests were Perplexity and Meta AI: according to the study's authors, they helped potential attackers in 100% and 97% of responses, respectively. The researchers singled out Character.AI, not only because of weak safeguards but also because in some scenarios the bot directly encouraged violence. In contrast, the two models that consistently refused to help plan attacks were Claude from Anthropic and Snapchat My AI. Of the two, only Claude, the report notes, went beyond blocking such requests and tried to dissuade the user from violence.

The Killer Apps report was published on March 11, 2026. Its authors argue that technical restrictions to block such scenarios already exist, but most companies have not made them strict enough. This, according to the researchers, is what allows some chatbots to move from neutral responses to dangerous assistance.

Earlier, dev.ua reported on how Tom’s Guide tested three popular chatbots with seven identical queries about military news surrounding the strikes on Iran. The test checked how the models handle a critically important topic where some reports change every hour and some may be hoaxes.


Text: Валентин Шнайдер. Photo: Counterhate. Source: Counterhate. Tags: ai, chatbots, artificial intelligence, security, cybersecurity.
