When OpenAI unveiled its o3 “reasoning” AI mannequin in December, the corporate partnered with the creators of ARC-AGI, a benchmark designed to check extremely succesful AI, to showcase o3’s capabilities. Months later, the outcomes have been revised, and so they now look barely much less spectacular than they did initially.
Final week, the Arc Prize Basis, which maintains and administers ARC-AGI, up to date its approximate computing prices for o3. The group initially estimated that the best-performing configuration of o3 it examined, o3 excessive, value round $3,000 to resolve a single ARC-AGI downside. Now, the Arc Prize Basis thinks that the associated fee is far greater — probably round $30,000 per process.
The revision is notable as a result of it illustrates simply how costly in the present day’s most refined AI fashions might find yourself being for sure duties, a minimum of early on. OpenAI has but to cost o3 — or launch it, even. However the Arc Prize Basis believes OpenAI’s o1-pro mannequin pricing is an inexpensive proxy.
For context, o1-pro is OpenAI’s most costly mannequin to this point.
“We imagine o1-pro is a better comparability of true o3 value […] attributable to quantity of test-time compute used,” Mike Knoop, one of many co-founders of The Arc Prize Basis, advised TechCrunch. “However that is nonetheless a proxy, and we’ve saved o3 labeled as preview on our leaderboard to replicate the uncertainty till official pricing is introduced.”
A excessive value for o3 excessive wouldn’t be out of the query, given the quantity of computing sources the mannequin reportedly makes use of. In response to the Arc Prize Basis, o3 excessive used 172x extra computing than o3 low, the lowest-computing configuration of o3, to deal with ARC-AGI.
Furthermore, rumors have been flying for fairly a while about expensive plans OpenAI is contemplating introducing for enterprise prospects. In early March, The Data reported that the corporate could also be planning to cost as much as $20,000 per thirty days for specialised AI “brokers,” like a software program developer agent.
Some would possibly argue that even OpenAI’s priciest fashions will value properly beneath what a typical human contractor or staffer would command. However as AI researcher Toby Ord identified in a put up on X, the fashions might not be as environment friendly. For instance, o3 excessive wanted 1,024 makes an attempt at every process in ARC-AGI to realize its finest rating.