AI Plateau? OpenAI Pivots to Smarter Training in Model Evolution


AI scientists, researchers and investors believe the smarter training techniques behind OpenAI’s recently released o1 model could reshape the AI arms race.

Artificial intelligence companies like OpenAI are seeking to overcome unexpected delays and challenges in the pursuit of ever-larger, smarter language models by developing training techniques that use more human-like ways for algorithms to “think”. AI scientists, researchers and investors believe these smarter techniques, which are behind OpenAI’s recently released o1 model, could reshape the AI arms race. They could also have implications for the types of resources for which AI companies have an insatiable demand.

(Image Credit: OpenAI)

OpenAI’s next flagship artificial intelligence model is reportedly showing smaller improvements than previous iterations, a sign that the booming generative AI industry may be approaching a plateau.

Why does OpenAI need a smarter AI model?

After the release of the viral ChatGPT chatbot two years ago, technology companies, whose valuations have benefited greatly from the AI boom, have publicly maintained that “scaling up” current models by adding more data and computing power will consistently lead to improved AI models.

But now, some of the most prominent AI scientists are speaking out on the limitations of this “bigger is better” philosophy.

AI model limitations

According to Ilya Sutskever, co-founder of the AI labs Safe Superintelligence (SSI) and OpenAI, results from scaling up pre-training, the phase in which a model learns language patterns from vast amounts of unlabeled data, have plateaued.

“The 2010s were the age of scaling, now we’re back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever said. “Scaling the right thing matters more now than ever.”

OpenAI’s rivals working on smarter AI models

Sutskever declined to share more details on how his team is addressing the issue, other than saying SSI is working on an alternative approach to scaling up pre-training.

Behind the scenes, researchers at major AI labs have been running into delays and disappointing outcomes in the race to release a large language model that outperforms OpenAI’s GPT-4 model, which is nearly two years old, according to three sources familiar with the matter.

Challenges to AI advancement

The so-called ‘training runs’ for large models are expensive, costing tens of millions of dollars to run hundreds of chips simultaneously. They are also prone to hardware-induced failures given how complicated the systems are, and researchers may not know how a model will perform until the end of the run, which can take months.

Large language models also gobble up huge amounts of data, and AI models have exhausted much of the easily accessible data in the world. Power shortages have hindered training runs as well, since the process requires vast amounts of energy.

Test-time compute

To overcome these challenges, researchers are exploring “test-time compute,” a technique that enhances existing AI models during the so-called “inference” phase, or when the model is being used.

This method allows models to dedicate more processing power to challenging tasks, such as math or coding problems, or to complex operations that demand human-like reasoning and decision-making.
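One common pattern behind test-time compute is to spend extra computation while answering: sample several candidate responses at inference time, score them, and keep the best. The sketch below illustrates only that general idea; generate_candidate and score_candidate are hypothetical placeholders for a real model call and a real verifier, not OpenAI’s actual o1 method.

```python
# Minimal sketch of a "best-of-n" test-time compute loop.
# generate_candidate and score_candidate are hypothetical stand-ins for a
# real language-model call and a real verifier/reward model.
import random
from typing import Callable

def generate_candidate(prompt: str) -> str:
    # Placeholder: a real system would call a language model here.
    return f"candidate answer {random.randint(0, 9)} for: {prompt}"

def score_candidate(prompt: str, answer: str) -> float:
    # Placeholder: a real system would use a verifier, reward model,
    # or majority vote across candidates to judge the answer.
    return random.random()

def best_of_n(prompt: str, n: int = 8,
              generate: Callable[[str], str] = generate_candidate,
              score: Callable[[str, str], float] = score_candidate) -> str:
    """Spend more compute at inference time by generating n candidates
    and returning the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score(prompt, ans))

if __name__ == "__main__":
    print(best_of_n("What is 17 * 24?", n=8))
```

The key point is that the extra compute is spent during inference rather than training, so an existing model can be made more capable on hard problems without retraining it from scratch.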

OpenAI’s new AI model

OpenAI has embraced this technique in its newly released model known as “o1,” previously referred to as Q* and Strawberry. The o1 model can “think” through problems in a multi-step manner, similar to human reasoning. It also involves using data and feedback curated from PhDs and industry experts. The secret sauce of the o1 series is an additional round of training carried out on top of ‘base’ models like GPT-4, and the company says it plans to apply this technique to more and bigger base models.
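For developers, this reasoning happens behind the API: a request to an o1-series model looks like an ordinary chat completion call, with the multi-step “thinking” handled server-side. Below is a minimal sketch using the OpenAI Python client, assuming API access and a model name such as "o1-preview"; exact model names and availability may differ.

```python
# Minimal sketch: calling an o1-series reasoning model with the OpenAI Python
# client (openai v1+). Assumes OPENAI_API_KEY is set in the environment; the
# model name "o1-preview" is an assumption and may differ by account/region.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": "A train leaves at 3:40 pm and arrives at 6:05 pm. "
                       "How long is the trip?",
        },
    ],
)

# The model works through intermediate reasoning steps internally;
# only the final answer is returned in the message content.
print(response.choices[0].message.content)
```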

At the same time, researchers at other top AI labs, including Anthropic, xAI, and Google DeepMind, have also been working to develop their own versions of the technique, according to five people familiar with the efforts.

Google and xAI did not respond to requests for comment and Anthropic had no immediate comment.

Implications of OpenAI’s smarter AI model

The implications could alter the competitive landscape for AI hardware, thus far dominated by insatiable demand for Nvidia’s AI chips.

“This shift will move us from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference,” said Sonya Huang, a partner at Sequoia Capital.

While Nvidia dominates the market for training chips, the chip giant could face more competition in the inference market.

Asked about the possible impact on demand for its products, Nvidia pointed to recent company presentations on the importance of the technique behind the o1 model. Its CEO Jensen Huang has talked about increasing demand for using its chips for inference.



