“Smelly” configuration files cause AI agents to waste tokens — study
It turns out that instructions for coding AI agents can be just as smelly as risky or poorly structured code patterns, known as “code smells.”
Coding AI agents rely on configuration files that describe the agent's expected behavior. These context-extending files are typically written in Markdown and are called CLAUDE.md (for those using Anthropic models) or AGENTS.md (for virtually all other models).
They contain text instructions that tell the coding agent what behavior is desired and how to use its tools. These instructions can get quite lengthy. Anthropic recommends keeping such files to no more than 200 lines, as longer files overload the model's context and can compromise the clarity and consistency of its responses.
Researchers from the Computer Science Department of the Federal Institute of Minas Gerais in Brazil recently analyzed about 532,000 files. The goal was to create and study a dataset of 100 popular open-source projects that contain CLAUDE.md or AGENTS.md files, The Register reports.
“Our results show that configuration smells are widespread,” the researchers say. “The most common was linter leakage, affecting 62% of files. This was followed by context bloat at 42% and skill leakage at 35%.”
Linting is the process of running automated tools to check code for programming and stylistic errors. “Linter leakage” occurs when instructions to an AI agent duplicate rules that are already checked automatically by linters, formatters, and static analysis tools. Duplicate rules waste tokens by loading the underlying model with instructions for tasks that the software already performs reliably.
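To make the idea concrete, here is a minimal sketch of how one might flag possible linter leakage in a configuration file. It is not the researchers' detection method: the phrase list and the function name `find_lint_leakage` are assumptions chosen for illustration, and a simple keyword match like this will produce both false positives and misses.

```python
import re

# Hypothetical phrases that usually describe rules a linter or formatter
# already enforces. This list is an assumption, not taken from the study.
LINT_HINTS = re.compile(
    r"\b(indent(ation)?|semicolons?|trailing whitespace|line length|"
    r"single quotes|double quotes|unused imports?)\b",
    re.IGNORECASE,
)


def find_lint_leakage(config_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like linter-style rules."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if LINT_HINTS.search(line):
            hits.append((number, line.strip()))
    return hits


# Illustrative AGENTS.md content (made up for this example).
sample = """# Project rules
- Use 4 spaces for indentation.
- Run the integration tests before opening a pull request.
- Remove unused imports.
"""

for number, line in find_lint_leakage(sample):
    print(f"line {number}: {line}")
```

In this sample, the indentation and unused-import rules are flagged as candidates to move into the linter's own configuration, while the instruction about integration tests is left alone.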
“Context bloat,” as the name suggests, describes the tendency for developers to over-detail the behavior of an AI agent. “Bloated configuration files increase token consumption, increase costs, and distract the model from truly important instructions,” the study authors note, citing Anthropic’s recommendation to write no more than 200 lines of text.
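A rough check for context bloat could be as simple as counting lines against the 200-line recommendation cited above. The sketch below is an illustration only: the function names are hypothetical, and counting every line (including blank ones) is an assumption, since the article does not say how the limit should be measured.

```python
# The 200-line threshold is the recommendation cited in the article.
MAX_LINES = 200


def count_lines(config_text: str) -> int:
    """Count all lines in the file, blank lines included (an assumption)."""
    return len(config_text.splitlines())


def is_bloated(config_text: str, max_lines: int = MAX_LINES) -> bool:
    """Return True when the file is longer than the recommended limit."""
    return count_lines(config_text) > max_lines


# Illustrative inputs: one file exactly at the limit, one just over it.
at_limit = "- some rule\n" * 200
over_limit = "- some rule\n" * 201

print(is_bloated(at_limit))
print(is_bloated(over_limit))
```

A check like this could run in continuous integration so that a growing configuration file is noticed before it starts crowding the agent's context.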
“Skill leakage” is another common configuration smell. It occurs when rarely used tools or practices are added to the AGENTS.md file, which is loaded during each agent session. Such instructions are better placed in a separate skills file (such as SKILL.md) that is loaded only when needed. Skill leakage also unnecessarily expands the agent’s context and potentially distracts it from other tasks.
Other AI agent "smells" include "blind links," which occur when configuration files reference external documents (e.g., via URLs) without explaining when the resource is needed; "init fossilization," in which configuration details set at the time of project launch are no longer relevant; and "conflicting instructions," which occur when an agent's directives contradict each other.
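Of these, blind links lend themselves most readily to a simple automated check. The following sketch flags lines that contain a URL but no cue word suggesting when the link should be consulted. The cue-word list, the function name `find_blind_links`, and the example.com URLs are all assumptions for illustration, not the approach used in the study.

```python
import re

URL = re.compile(r"https?://\S+")
# Words that suggest the line explains when to use the link (assumed list).
CUES = re.compile(r"\b(when|whenever|if|before|after)\b", re.IGNORECASE)


def find_blind_links(config_text: str) -> list[int]:
    """Return line numbers that contain a URL but no usage cue."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if URL.search(line):
            # Strip the URLs so words inside them cannot count as cues.
            remainder = URL.sub("", line)
            if not CUES.search(remainder):
                hits.append(number)
    return hits


# Illustrative configuration lines (made up for this example).
sample = """See https://example.com/style-guide
When changing the database schema, read https://example.com/migrations
"""

print(find_blind_links(sample))
```

Here the first line is reported because it gives the agent a link with no hint about when to open it, while the second line ties the link to a specific task.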
The study authors note that they detected at least one of these six "smells" in 91 out of 100 AGENTS.md files tested.
“These results suggest that developers would benefit from catalogs and tools designed to identify problems in AI agent configuration files,” the researchers conclude.
The main idea here is this: when it comes to configuration files for coding agents, the principle of “less is more” applies, perhaps even to the point where any configuration is worse than no configuration at all.



