Олексій Дзюба, Stories, 27 January 2025, 16:59

The ChatGPT and Nvidia Killer. How China's DeepSeek Perfectly Applied the "Cheap and Angry" Principle to an AI Model, Shaking Up Global Markets

In recent days, everyone has been talking about the Chinese startup DeepSeek and its new free, open-source artificial intelligence model, R1. The model is rewriting the history of artificial intelligence in real time. First, it outperformed OpenAI’s latest o1 model in several independent tests. Second, it made DeepSeek the top app on the App Store over the weekend. Third, it sent the stock prices of technology companies down, including those of leading graphics processor manufacturer Nvidia. Read on for a detailed analysis of what helped DeepSeek outperform its competitors.

Who are DeepSeek?

DeepSeek was founded in 2023 by 40-year-old entrepreneur Liang Wenfeng. He is considered one of China’s leading investors. His hedge fund High-Flyer is funding DeepSeek’s AI research. The company itself is based in the city of Hangzhou, located on the country’s east coast, in Zhejiang province.

Hangzhou is one of China’s leading technology centers. It is home to the headquarters of the technology corporation Alibaba Group. The city also has numerous research institutes and universities. For example, Zhejiang University is one of the best technical universities in China.

Liang Wenfeng has a background in computer science and artificial intelligence. Before founding DeepSeek, he worked at leading technology companies in China, where he conducted research in machine learning and natural language processing.

In one of his few interviews, the founder of DeepSeek warned his competitor, OpenAI: «In the face of breakthrough technologies, the barriers created by closed source are temporary. Even OpenAI’s closed approach cannot prevent others from catching up.» His words were not empty.

What is DeepSeek's AI revolution?

In late 2024, DeepSeek introduced its new open AI model, V3, which works well with code but is reluctant to answer questions about China and its history. A month later, on January 20, the company introduced a new model, R1. The developers claim that it matches OpenAI's «thoughtful» o1 model in performance while being far more affordable. DeepSeek-R1 is built on the large base model DeepSeek-V3.

DeepSeek-R1, like OpenAI’s o1, was trained using reinforcement learning (RL), but the Chinese company says it also applied «supervised fine-tuning» to tackle complex reasoning tasks and match o1’s performance.

To demonstrate the benefits of its approach, DeepSeek used R1 to distill six Llama and Qwen models, taking their performance to a new level. In one case, the distilled version of Qwen-1.5B outperformed much larger models, GPT-4o and Claude 3.5 Sonnet, on selected math benchmarks. These models, like the main R1, were released as open source and are available on Hugging Face under the MIT License.

In testing, DeepSeek-R1 scored 79.8% on the AIME 2024 math test and 97.3% on the MATH-500 test. It also reached a rating of 2,029 on Codeforces, beating 96.3% of human competitors. On the same three tests, OpenAI's o1-1217 scored 79.2%, 96.4%, and a percentile of 96.6%, respectively. On the MMLU general knowledge test, R1 fell slightly behind, with an accuracy of 90.8% compared to o1's 91.8%.

The effectiveness of DeepSeek-R1 is being hailed as a major achievement for the Chinese startup in the AI space, which is currently dominated by companies from the United States. In addition, DeepSeek operates on an open source model and even provides access to educational materials.

Another advantage of DeepSeek for users is its pricing policy. OpenAI provides access to o1 at a price of $15 per million input tokens and $60 per million output tokens. In contrast, DeepSeek Reasoner, based on the R1 model, costs $0.55 per million input tokens and $2.19 per million output tokens.
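To make the price gap concrete, here is a minimal sketch that applies the per-million-token prices quoted above to a workload. The workload size (one million input and one million output tokens) and the function name `request_cost` are illustrative assumptions, not figures from either company.

```python
# Per-million-token list prices in USD, as quoted in the article.
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "deepseek-reasoner": {"input": 0.55, "output": 2.19},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a workload at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 1M input tokens and 1M output tokens.
for model in PRICES:
    cost = request_cost(model, 1_000_000, 1_000_000)
    print(f"{model}: ${cost:.2f}")
```

At these prices the same hypothetical workload costs $75.00 on o1 and $2.74 on DeepSeek Reasoner. Real bills depend on the input/output mix and on any caching discounts.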

How a Chinese company circumvented US sanctions

Chinese AI companies are facing restrictions in the form of tightening U.S. export controls on advanced chips. But rather than weakening China’s AI capabilities, the sanctions appear to be spurring startups like DeepSeek to innovate.

In 2021, Liang Wenfeng began buying thousands of Nvidia GPUs for one of his AI projects. Industry insiders dismissed it as the eccentric behavior of a billionaire looking for a new hobby. The Chinese entrepreneur, however, said he wanted to build something that would change the rules of the game.

Chinese media reports say the company has more than 10,000 GPUs in stock, but Dylan Patel, founder of SemiAnalysis, an artificial intelligence research consultancy, puts the number at 50,000. Recognizing the potential of this stockpile for AI training is what prompted Liang to create DeepSeek, which was able to use the GPUs in conjunction with lower-powered chips to develop its models.

One former DeepSeek employee said that in order to create R1, the startup had to redesign its training process. This was necessary to reduce the load on its GPUs, whose performance Nvidia had reduced for the Chinese market to meet US requirements.

Dimitris Papailiopoulos, principal scientist at Microsoft’s AI Frontiers research lab, says what surprised him most about R1 was its engineering simplicity. «DeepSeek focuses on getting precise answers rather than detailing every logical step, significantly reducing computation time while maintaining a high level of efficiency,» he says.

On US technology sanctions against China

In the summer of 2023, US President Joe Biden signed an executive order blocking and regulating investments by US companies in Chinese technology companies. This applies to semiconductors, microelectronics, quantum information technology and artificial intelligence.

US export restrictions also forced Nvidia to release the RTX 4090D, a slower version of its flagship graphics card made specifically for the Chinese market. It had 12.8% fewer CUDA cores to comply with the restrictions.

Earlier this year, it emerged that Nvidia had reduced the performance of the RTX 5090D, the Chinese version of its next-generation flagship, without changing its technical specifications. As with the previous generation, the China-specific card is limited, particularly in AI tasks, because of US sanctions. This time, however, its listed specifications are identical to those of the full version.

Why DeepSeek caused stocks to plummet around the world

The company now claims that the DeepSeek R1 model was developed in just two months and cost only $6 million. It allegedly ran on Nvidia H800 chips, whose capabilities were limited by US sanctions against China. Although the H800 is not Nvidia's most advanced chip, DeepSeek managed to create a model that competes with the most powerful AI models in the US.

If the information about DeepSeek is entirely true, it makes investors doubt whether the heavy spending planned for artificial intelligence infrastructure is really necessary. It turns out you can do it «cheap and angry.»

With cached input, running DeepSeek-R1 queries costs just $0.14 per million input tokens, about 98% less than the $7.50 OpenAI charges for o1 at the time of writing. The gap is nearly as wide for output: DeepSeek charges $2.19 per million output tokens, while OpenAI charges $60.
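The savings implied by these per-million-token prices can be checked with a line of arithmetic. The sketch below uses only the figures quoted above; the helper name `percent_cheaper` is an illustrative assumption.

```python
def percent_cheaper(cheap: float, expensive: float) -> float:
    """Return how much cheaper `cheap` is than `expensive`, in percent."""
    return (1 - cheap / expensive) * 100


# Prices in USD per million tokens, as quoted in the article.
print(f"{percent_cheaper(0.14, 7.50):.0f}%")   # $0.14 vs $7.50
print(f"{percent_cheaper(2.19, 60.00):.0f}%")  # $2.19 vs $60
```

The first pair works out to roughly 98% cheaper and the second to roughly 96%, so the discount is of the same order at both price points.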

The market has already reacted to the cheaper AI alternative. The first to fall were Japanese companies with ties to American AI companies. Shares of Advantest, which supplies equipment to Nvidia, fell 7.99%; Tokyo Electron fell almost 4%; SoftBank Group (which owns chip developer Arm) fell 5.4%; Furukawa fell 9.8%; and Fujikura fell 8%.

Shares of Nvidia, a big beneficiary of the AI revolution, fell 9% in premarket trading and later extended the decline to 12%. Other tech giants, including Microsoft and Meta, lost about 4%. The Nasdaq lost 3.6%, while the S&P 500 fell 2.2%. The decline in stock prices was not limited to the US.

European chipmaker ASML fell 9.7%, dragging the Stoxx Europe 600 Technology index down 4.8%. Furukawa Electric, which supplies cables for data centers, fell more than 11.3%, leading the decline on Tokyo’s Nikkei 225 index. Crypto also fell, with bitcoin and smaller tokens sliding along with U.S. stock futures.

How are OpenAI and Meta reacting?

According to a white paper published last year by the China Academy of Information and Communications Technology, a state-run research institute, the number of large-scale AI language models worldwide has reached 1,328. Of these, 36% come from China. This places China as the second-largest contributor to AI after the United States.

«We have to take Chinese developments very seriously,» Satya Nadella, CEO of Microsoft, which is OpenAI’s largest investor, said at the Davos economic forum.

Meta plans to invest $60 billion in AI in 2025 amid concerns about Chinese startup DeepSeek. Its chief executive, Mark Zuckerberg, said his company would build a data center in Louisiana «the size of Manhattan Island.» He said 2025 would be a «defining year for AI.»

According to Mark Zuckerberg, Meta AI will become a «leading assistant» serving over 1 billion users. «In 2025, we will bring online about 1 GW of computing power, and by the end of the year we will have over 1.3 million GPUs,» Zuckerberg said.

«OpenAI was founded 10 years ago, has 4,500 employees, and raised $6.6 billion. DeepSeek was founded less than 2 years ago, has 200 employees, and is worth less than $10 million. How are these two companies now competitors?» one expert pointedly asked on the social network X.

Why DeepSeek cannot be fully trusted

Despite the hype surrounding R1, DeepSeek remains a relatively unknown company. It was founded in July 2023 by Liang Wenfeng. Like Sam Altman of OpenAI, Liang aims to create an artificial general intelligence (AGI) that can match or even surpass humans in a number of tasks. Whether the startup will raise additional funds and from whom, who funds it now, and who makes up the core of its team are all questions that have not yet been answered. Nor is there detailed information about Liang Wenfeng himself.

Another issue is AI model censorship. DeepSeek, as a Chinese company, is subject to benchmarking by China’s internet regulator to ensure that its models’ responses «embody core socialist values.» Many Chinese AI systems refuse to respond to topics that could anger regulators, such as speculation about Xi Jinping’s regime, Taiwan, or the 1989 Tiananmen Square protests (the democracy movement demonstrations and the dispersal of the protesters).

The details of DeepSeek’s development and the equipment it uses are also unknown. Some of the company's statements are still impossible to verify: how much computing capacity was used and of what kind, how long it took to achieve the result, and how the company will continue to develop under import restrictions on hardware. There is also not yet enough independent research to confirm that the Chinese company's models are truly superior to those from OpenAI or Meta. Most importantly, it remains unclear exactly which technical choices helped DeepSeek become the most talked-about AI company of the moment, and perhaps of the future.
