Just "Wait"

Maybe the secret to AGI is asking "you sure about that?"
This team from Stanford was able to take an open model that can run on your laptop to o1-preview level math performance (SOTA Sep. 24) by injecting "Wait" into the LLM response and having it run longer.
DeepSeek r1 is exciting but misses OpenAI’s test-time scaling plot and needs lots of data.
— Niklas Muennighoff (@Muennighoff) February 3, 2025
We introduce s1 reproducing o1-preview scaling & performance with just 1K samples & a simple test-time intervention.
📜https://t.co/wmAvNwmrk2 pic.twitter.com/YPKUHNsmQU
Member discussion