AI companies like OpenAI are facing unexpected delays and challenges as they push for bigger language models. To move past these hurdles, they’re turning to new training methods that help algorithms “think” more like humans.
Ilya Sutskever, who left OpenAI to start Safe Superintelligence (SSI), was once a big supporter of using massive data and computing power to improve AI. But now, he believes the focus on just making models bigger is no longer effective.
“The 2010s were the age of scaling, now we’re back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever said. “Scaling the right thing matters more now than ever.”
One of the biggest hurdles in AI development today is the cost and complexity of training large models. These models require millions of dollars and months of processing time, with no guarantee of success. Moreover, power shortages and a lack of easily accessible data add to the challenge.
In response, researchers are exploring a technique called “test-time compute.” Instead of making the model itself larger, this approach spends extra computation at inference time, when the model is actually being used, letting it handle complex tasks like math and decision-making more effectively.
OpenAI’s new model, o1, uses this technique to simulate human-like, step-by-step reasoning in problem-solving.
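The intuition behind test-time compute can be shown with a best-of-N sketch: rather than using a bigger model, generate several candidate answers at inference time and keep the one a verifier scores highest. The functions below (`propose_answers`, `score`, `best_of_n`) are illustrative toys, not OpenAI’s actual method; a real system would sample reasoning chains from a language model and score them with a learned verifier.

```python
def propose_answers(n: int) -> list[int]:
    """Toy stand-in for a model proposing n candidate answers to 13 * 7.
    A real system would sample n reasoning chains from an LLM."""
    return [85 + i for i in range(n)]  # candidates 85, 86, ..., 84 + n

def score(answer: int) -> float:
    """Verifier: higher is better. Here we can check the arithmetic
    exactly; real systems use a learned reward or verifier model."""
    return -abs(answer - 13 * 7)

def best_of_n(n: int) -> int:
    # The extra inference-time compute: generate more candidates,
    # then keep the best-scoring one.
    return max(propose_answers(n), key=score)

print(best_of_n(10))  # with candidates 85..94, the correct answer 91 wins
```

The key trade-off: answer quality improves with the number of candidates, so compute can be dialed up per query instead of being baked into model size at training time.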
This shift could also reshape the AI hardware landscape. Demand for Nvidia’s chips, which dominate model training, could change as the focus moves toward inference workloads.
Investors are taking notice as well, since this change could redirect billions of dollars in AI development spending.
As AI researchers embrace these new methods, the industry’s future looks set to be shaped by smarter, more efficient models.