Наталя Хандусенко AI Eng 3 January 2026, 15:08

DeepSeek has found a new approach to training LLMs that could revolutionize the AI market again

A team of researchers from Chinese AI company DeepSeek recently published a paper describing a method called Manifold-Constrained Hyper-Connections, or mHC for short. The method could allow developers to build powerful language models while using significantly less of the computational resources previously considered indispensable at such scales.

Problem

LLMs are built on neural networks, which in turn are designed to pass signals across multiple layers. The problem is that the more layers you add, the more the signal can become attenuated or degraded, and the greater the risk of it turning into noise. It’s a bit like playing telephone: the more people you add, the higher the chance that the original message will be distorted along the way.

So the main challenge is to create models that can preserve signal strength across as many layers as possible or, as the DeepSeek researchers put it in their new paper, to “better optimize the trade-off between plasticity and stability.”
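The telephone-game effect can be illustrated with a toy sketch. It is not taken from the DeepSeek paper: the function `propagate`, the per-layer `gain` of 0.9, and the noise level are all hypothetical values chosen only to show how a signal shrinks while noise accumulates as depth grows.

```python
import random


def propagate(signal, num_layers, gain=0.9, noise_std=0.05, seed=0):
    """Pass a scalar signal through `num_layers` toy layers.

    Each layer scales whatever it receives by `gain` and adds Gaussian noise.
    The surviving share of the original signal and the accumulated noise are
    tracked separately so they can be compared.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    signal_part = signal
    noise_part = 0.0
    for _ in range(num_layers):
        signal_part *= gain
        noise_part = noise_part * gain + rng.gauss(0.0, noise_std)
    return signal_part, noise_part


for depth in (1, 10, 50):
    s, n = propagate(1.0, depth)
    print(f"layers={depth:3d}  signal={s:.4f}  noise={n:+.4f}")
```

With a gain below 1, the signal share decays geometrically with depth, while the noise term does not shrink in the same way, so deeper stacks end up carrying proportionally more noise.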

Solution

The authors of the new paper, including DeepSeek CEO Liang Wenfeng, built on the concept of “hyper-connections” (HC). This structure was proposed in 2024 by researchers at ByteDance to diversify the channels through which the layers of a neural network exchange information. However, hyper-connections create the risk of losing the original signal. They also incur significant memory costs, which makes them difficult to implement at large scale.

The mHC architecture aims to address this issue by constraining the hyper-connections in the model, thereby preserving the information complexity provided by HC while avoiding the memory problem. This, in turn, could make training very complex models practical and scalable even for developers with fewer resources.
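To give a feel for what constraining a set of connections can mean, here is a minimal sketch. The article does not spell out the constraint mHC uses, so the choice below is an assumption made for illustration: the weights that mix several parallel streams are forced to be doubly stochastic (every row and column sums to 1) using Sinkhorn-style alternating normalization. The names `sinkhorn`, `mix`, and `repeat_mix`, and the example matrix, are hypothetical.

```python
def sinkhorn(matrix, num_iters=100):
    """Alternately normalize rows and columns of a positive square matrix.

    The result is approximately doubly stochastic: rows and columns sum to 1.
    """
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(num_iters):
        for i in range(n):  # make every row sum to 1
            total = sum(m[i])
            m[i] = [v / total for v in m[i]]
        for j in range(n):  # make every column sum to 1
            total = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= total
    return m


def mix(matrix, streams):
    """Mix parallel streams: out[i] = sum_j matrix[i][j] * streams[j]."""
    return [sum(w * x for w, x in zip(row, streams)) for row in matrix]


def repeat_mix(matrix, streams, depth):
    """Apply the same mixing step `depth` times, as a stand-in for many layers."""
    out = list(streams)
    for _ in range(depth):
        out = mix(matrix, out)
    return out


# Unconstrained mixing weights (arbitrary positive numbers for illustration).
raw = [[2.0, 0.5, 0.1],
       [0.3, 1.5, 0.9],
       [0.7, 0.2, 3.0]]
constrained = sinkhorn(raw)
streams = [1.0, 2.0, 3.0]

print("unconstrained total:", sum(repeat_mix(raw, streams, 20)))
print("constrained total:  ", sum(repeat_mix(constrained, streams, 20)))
```

In this toy setup the unconstrained weights make the total signal blow up after 20 mixing steps, while the constrained weights keep the total unchanged, since columns that sum to 1 only redistribute the signal among streams.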

Why is this important?

As with the R1 release in January 2025, the debut of the mHC framework could hint at a new direction in AI evolution.

Until now, the AI race has been dominated by the notion that only the biggest and richest companies can afford to build cutting-edge models. But DeepSeek consistently demonstrates that workarounds are possible, and that breakthroughs can be achieved through smart engineering alone.

Because the company has published its research on the mHC method, it could be widely adopted by smaller developers, especially if it is used in the long-awaited R2 model, whose release date has not been officially announced.

Text: Наталя Хандусенко Photo: Boston University Tags: ai, deepseek
