Наталя Хандусенко AI Eng 15 April 2025, 17:13

Overtraining an LLM can reduce its performance, new study finds

For the past few years, it has been assumed that the more an AI model is trained, the better its results will be. A group of researchers from several US universities is now challenging that assumption.


Artificial intelligence researchers at Carnegie Mellon, Stanford, Harvard, and Princeton universities have found that overtraining large language models can negatively impact their performance.

They came to this conclusion after comparing two versions of the OLMo-1B language model: one trained on 2.3 trillion tokens, the other on 3 trillion tokens. They evaluated both on several benchmarks, including ARC and AlpacaEval, and found that the model trained on more tokens performed 3% worse than the one trained on fewer, Tech Xplore reports.

Surprised by their findings, the researchers ran more tests and got similar results, suggesting that there is a point at which more training starts to make the models less “intelligent.” The research team calls this “catastrophic overtraining” and suggests that it is due to what they describe as “progressive sensitivity.”

They also suggest that the more tokens a model is trained on, the more fragile it becomes. Fine-tuning, which can be thought of as adding noise to the model, then begins to negate the improvements gained from the earlier training.

To test their theory, they added Gaussian noise to some of the models and found that it led to the same type of performance degradation they had seen before. They called the point of no return the “inflection point.” They hypothesize that after this point, any further training reduces the stability of the model, making it harder to tune for a desired set of applications.
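The idea of probing a model with Gaussian noise can be illustrated with a toy sketch. This is not the researchers' code or setup: it uses a tiny linear model in place of an LLM, and every name in it (`perturb`, `sensitivity`, `true_weights`, the chosen noise scales) is a hypothetical stand-in. It only shows the general pattern of adding noise to a model's weights and measuring how much the loss gets worse.

```python
import random


def predict(weights, x):
    """Toy linear model: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights, data):
    """Mean squared error over (features, target) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def perturb(weights, sigma, rng):
    """Return a copy of the weights with N(0, sigma^2) noise added to each."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def sensitivity(weights, data, sigma, trials=100, seed=0):
    """Average increase in loss caused by Gaussian perturbations of scale sigma."""
    rng = random.Random(seed)
    base = mse(weights, data)
    total = 0.0
    for _ in range(trials):
        total += mse(perturb(weights, sigma, rng), data) - base
    return total / trials


# Hypothetical toy data: noiseless targets generated from known weights,
# so the unperturbed model has zero loss.
data_rng = random.Random(42)
true_weights = [0.5, -1.0, 2.0]
features = [[data_rng.uniform(-1, 1) for _ in true_weights] for _ in range(50)]
data = [(x, predict(true_weights, x)) for x in features]

for sigma in (0.0, 0.1, 0.2):
    print(sigma, sensitivity(true_weights, data, sigma))
```

In this sketch a larger noise scale produces a larger average loss increase. A comparison in the spirit of the study would presumably apply the same measurement to checkpoints trained on different numbers of tokens and check whether the later ones degrade more, but that step is an assumption and is not shown here.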

In conclusion, the researchers suggest that LLM developers may in the future have to judge how much training is enough, or look for other training methods, to avoid reaching the point of no return.


Text: Наталя Хандусенко Tags: llm, ai
