AI researchers create competitor to OpenAI's o1 reasoning model for just $50
AI researchers from Stanford and the University of Washington were able to train an artificial intelligence «reasoning» model for less than $50 in cloud computing credits.
The scientists presented their model, called s1, in a research paper, and it is also available on GitHub, along with the data and code used to train it.
s1 performs similarly to state-of-the-art reasoning models, such as OpenAI’s o1 and DeepSeek’s R1, in tests measuring math and programming abilities, TechCrunch reports.
The s1 development team said they started with a ready-made base model and then refined it using distillation — the process of extracting «reasoning» capabilities from another artificial intelligence model by training on its responses.
The researchers say that s1 was distilled from one of Google’s reasoning models, Gemini 2.0 Flash Thinking Experimental.
The researchers behind s1 were looking for the simplest approach to achieving strong reasoning performance and «test-time scaling,» that is, allowing an AI model to think more before answering a question. These were some of the breakthroughs in OpenAI’s o1 that DeepSeek and other AI labs have tried to replicate using different methods.
The s1 paper suggests that reasoning models can be distilled from a relatively small data set through a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to imitate certain behaviors in the data set.
SFT is generally cheaper than the large-scale reinforcement learning method that DeepSeek used to train R1, its competitor to OpenAI’s o1 model.
Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, through its Google AI Studio platform. However, Google’s terms prohibit reverse-engineering its models to develop services that compete with the company’s own AI offerings.
s1 is built on a small, off-the-shelf AI model from Qwen, an AI lab owned by the Chinese company Alibaba, which is available for free download. To train s1, the researchers created a dataset of 1,000 carefully selected questions, along with their answers and the «thinking» process behind each answer, taken from Google’s Gemini 2.0 Flash Thinking Experimental.
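To make the shape of such a dataset concrete, here is a minimal sketch of how one record (question, «thinking» trace, answer) might be flattened into a single text sequence for supervised fine-tuning. The class name, field names, and the plain-text labels used as delimiters are hypothetical and chosen for illustration; the article does not describe the format the researchers actually used.

```python
from dataclasses import dataclass


@dataclass
class ReasoningExample:
    """One hypothetical training record: a question, the teacher
    model's reasoning trace, and the final answer."""
    question: str
    thinking: str
    answer: str


def to_training_text(example: ReasoningExample) -> str:
    """Join the three fields into one string the student model learns to imitate.

    The "Question:/Thinking:/Answer:" labels are illustrative placeholders,
    not the delimiters used in the s1 project.
    """
    return (
        f"Question: {example.question}\n"
        f"Thinking: {example.thinking}\n"
        f"Answer: {example.answer}"
    )


sample = ReasoningExample(
    question="What is 12 * 12?",
    thinking="12 * 12 = 12 * 10 + 12 * 2 = 120 + 24 = 144.",
    answer="144",
)
training_text = to_training_text(sample)
```

In supervised fine-tuning, the student model is trained to reproduce sequences like `training_text`, so it picks up the teacher's habit of reasoning before answering.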
After less than 30 minutes of training on 16 Nvidia H100 GPUs, s1 achieved high scores on certain AI tests. The researchers also told TechCrunch that the necessary compute can be rented today for about $20.
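The figures above allow a quick back-of-the-envelope check. The sketch below treats the 30 minutes as an upper bound on training time and assumes that the $20 covers exactly that run; the per-GPU-hour rate it derives is implied by those two numbers, not a price quoted by the researchers or by any cloud provider.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Total GPU-hours consumed by num_gpus running for the given minutes."""
    return num_gpus * minutes / 60


def implied_hourly_rate(total_cost_usd: float, hours: float) -> float:
    """Cost per GPU-hour implied by a total bill and a GPU-hour count."""
    return total_cost_usd / hours


# 16 H100 GPUs for at most 30 minutes (figures from the article).
budget_gpu_hours = gpu_hours(num_gpus=16, minutes=30)

# About $20 for the run (figure from the article); the rate is an inference.
rate_usd_per_gpu_hour = implied_hourly_rate(20, budget_gpu_hours)
```

Under these assumptions the run uses at most 8 GPU-hours, which works out to roughly $2.50 per GPU-hour.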
The researchers used a clever trick to get s1 to double-check its work and extend its «thinking» time: they told it to wait. Adding the word «wait» during s1’s reasoning helped the model arrive at slightly more accurate answers, according to the paper.
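The «wait» trick can be sketched in a few lines. In this illustration, `generate` is a hypothetical stand-in for a call to a language model that returns the next stretch of reasoning and stops where the model would end its thinking; `toy_generate` is a hard-coded stub used only so the example runs. None of this is the researchers' code, and the exact wording of the cue is an assumption.

```python
from typing import Callable


def extend_thinking(
    prompt: str,
    generate: Callable[[str], str],
    extra_rounds: int = 1,
    cue: str = "Wait,",
) -> str:
    """Return a reasoning trace in which each attempt by the model to stop
    is followed by a cue that nudges it to keep thinking."""
    trace = generate(prompt)
    for _ in range(extra_rounds):
        # Append the cue where the model stopped, then let it continue.
        trace += "\n" + cue
        trace += generate(prompt + trace)
    return trace


def toy_generate(text: str) -> str:
    """Hard-coded stub standing in for a real model (illustration only)."""
    if "Wait," in text:
        return " let me recheck: 2 + 2 = 4."
    return "2 + 2 = 5."


result = extend_thinking("Q: What is 2 + 2?\n", toy_generate, extra_rounds=1)
```

With the stub, the first pass ends on a wrong answer and the appended cue triggers a second pass that corrects it. A real model offers no such guarantee; the paper reports only a slight accuracy gain.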
Distillation has proven to be a good method for cheaply reproducing the capabilities of an AI model, but it does not allow for the creation of new AI models that are significantly better than those available today.



