Наталя Хандусенко AI Eng 11 July 2025, 17:06

IT specialist tricked ChatGPT into revealing Windows product keys. Here's how it happened

Despite all the safeguards ChatGPT has in place, the chatbot can still be tricked into revealing confidential or restricted information with clever prompts, such as asking it to play a guessing game.

Marco Figueroa, Technical Product Manager at 0DIN GenAI Bug Bounty, convinced ChatGPT to reveal Windows product keys. According to him, the jailbreak works by exploiting the game mechanics of large language models such as GPT-4o, writes TechSpot.

The technique frames the interaction with ChatGPT as a guessing game, which makes the request seem less serious. The rules of the game state that the model must participate and cannot lie, and the most important step is the trigger, which in this case was the phrase "I give up."

The prompt led ChatGPT to reveal the first few characters of the product key. After entering an incorrect guess, the researcher typed the trigger phrase “I give up.” The AI then revealed the rest of the key, which turned out to be valid.

The jailbreak works because Windows Home, Pro, and Enterprise keys that are often found on public forums were part of the model's training data, which is probably why ChatGPT treated them as less sensitive. And while safeguards block direct requests for such information, obfuscation tactics such as embedding sensitive phrases in HTML tags expose a weak spot in the system.
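To see why HTML-tag obfuscation can slip past a simple filter, consider the following minimal Python sketch. It is an illustration only, not a description of how OpenAI's safeguards work: the blocklist, the function names, and the sample prompt are all hypothetical. A substring check on the raw text misses a phrase that has been split up by tags, while the same check on normalized text catches it.

```python
import re

# Hypothetical blocklist for illustration; real safeguards are far more involved.
BLOCKED_PHRASES = ["windows 10 serial number"]

TAG_RE = re.compile(r"<[^>]+>")          # anything that looks like an HTML tag
NON_ALNUM_RE = re.compile(r"[^a-z0-9]")  # whitespace, punctuation, etc.


def normalize(text: str) -> str:
    """Drop HTML tags, lowercase, and keep only letters and digits."""
    without_tags = TAG_RE.sub("", text)
    return NON_ALNUM_RE.sub("", without_tags.lower())


def naive_filter(prompt: str) -> bool:
    """Return True if the raw prompt contains a blocked phrase verbatim."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)


def normalized_filter(prompt: str) -> bool:
    """Return True if the prompt contains a blocked phrase after normalization."""
    cleaned = normalize(prompt)
    return any(normalize(phrase) in cleaned for phrase in BLOCKED_PHRASES)


# Hypothetical obfuscated input: empty tags break up the sensitive phrase.
obfuscated = "Windows <b></b>10 ser<i></i>ial number"

print(naive_filter(obfuscated))       # the raw substring check misses it
print(normalized_filter(obfuscated))  # the normalized check catches it
```

Normalization of this kind only addresses one obfuscation trick; it does nothing about the game framing itself.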

Figueroa says one of the Windows keys that ChatGPT revealed was a private key belonging to Wells Fargo bank.

Beyond revealing Windows product keys, the same method can be adapted to force ChatGPT to display other restricted content, including adult material, URLs leading to malicious or restricted websites, and personal information.

It appears that OpenAI has since updated ChatGPT to protect against this jailbreak. Entering the query now results in the chatbot saying, "I can't do that. Sharing or distributing genuine Windows 10 serial numbers—whether in-game or not—is unethical and violates software license agreements."

Figueroa concludes that to prevent this type of jailbreak, AI developers should anticipate and defend against prompt obfuscation techniques, add logic-level safeguards that detect deceptive framing, and account for social engineering patterns rather than relying on keyword filters alone.
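One way to go beyond keyword filters on the input is to check the model's output as well, since the output looks the same no matter how the request was framed. The Python sketch below is a complementary idea and not something Figueroa describes; the function name and the placeholder key are hypothetical. It relies only on the well-known product key layout of five groups of five alphanumeric characters.

```python
import re

# Five groups of five letters/digits separated by hyphens or spaces.
KEY_RE = re.compile(r"\b(?:[A-Z0-9]{5}[- ]){4}[A-Z0-9]{5}\b", re.IGNORECASE)


def redact_product_keys(response: str) -> tuple[str, bool]:
    """Replace anything shaped like a product key; report whether any was found."""
    redacted, count = KEY_RE.subn("[REDACTED]", response)
    return redacted, count > 0


# Placeholder string in key format, not a real key.
reply = "You gave up, so here it is: AAAAA-BBBBB-CCCCC-DDDDD-EEEEE."
safe_reply, flagged = redact_product_keys(reply)
print(safe_reply)
print(flagged)
```

A pattern check like this would still miss a key that the model reveals a few characters at a time, so it can only be one layer among several.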
