UNIT.City — місце, де люди працюють... КРАЩЕ! Обирай свій простір просто зараз 👉
Наталя ХандусенкоAI Eng
11 July 2025, 17:06
2025-07-11
IT specialist tricked ChatGPT into revealing Windows product keys. Here's how it happened
Despite all the safeguards ChatGPT has in place, the chatbot can still be tricked into revealing confidential or restricted information with clever prompts, such as asking it to play a guessing game.
Despite all the safeguards ChatGPT has in place, the chatbot can still be tricked into revealing confidential or restricted information with clever prompts, such as asking it to play a guessing game.
Marco Figueroa, 0DIN GenAI Bug Bounty Technical Product Manager, convinced a chatbot to reveal Windows product keys. According to him, the jailbreak works by using game mechanics of large language models such as GPT-4o, writes TechSpot.
The Windows key discovery technique involves framing the interaction with ChatGPT as a game, making it less serious. The instructions state that he must participate and cannot lie, and the most important step is the trigger, which in this case was the phrase "I give up." You can see the full hint that was used below the image.
The prompt prompted ChatGPT to reveal the first few characters of the serial number. After entering an incorrect guess, the IT person typed the trigger phrase “I give up.” The AI then completed the key entry, which turned out to be valid.
The jailbreak works because the Windows Home, Pro, and Enterprise key combinations that are often found on public forums were part of the training model, which is probably why ChatGPT considered them less sensitive. And while safeguards prevent direct requests for such information, obfuscation tactics such as embedding sensitive phrases in HTML tags expose a weak spot in the system.
Figueroa says one of the Windows keys that ChatGPT showed was a private one belonging to Wells Fargo Bank.
In addition to simply displaying Windows product keys, the same method can be adapted to force ChatGPT to display other restricted content, including adult material, URLs leading to malicious or restricted websites, and personal information.
It appears that OpenAI has since updated ChatGPT to protect against this jailbreak. Entering the query now results in the chatbot saying, "I can't do that. Sharing or distributing genuine Windows 10 serial numbers—whether in-game or not—is unethical and violates software license agreements."
Figueroa concludes that to prevent this type of jailbreak, AI developers should anticipate and defend against query obfuscation techniques, include logical defenses that detect deceptive framing, and consider social engineering patterns, not just keyword filters.
“Did I accidentally make it self-destruct? Who knows. But it looked pretty epic.” A UI/UX designer at the Ministry of Digital Affairs forced a bot posing as a recruitment specialist to delete itself with a single command. How it was done