On Thursday, OpenAI launched its first production AI model to run on non-Nvidia hardware, deploying the new GPT-5.3-Codex-Spark coding model on chips from Cerebras. The model delivers code at more than 1,000 tokens (chunks of data) per second, which is reported to be roughly 15 times faster than its predecessor. For comparison, Anthropic's Claude Opus 4.6 in its new premium-priced fast mode reaches about 2.5 times its standard speed of 68.2 tokens per second, or roughly 170 tokens per second, though it's a larger and more capable model than Spark.
"Cerebras has been a great engineering partner, and we're excited to add fast inference as a new platform capability," Sachin Katti, head of compute at OpenAI, said in a statement.
Codex-Spark is a research preview available to ChatGPT Pro subscribers ($200/month) through the Codex app, command-line interface, and VS Code extension. OpenAI is rolling out API access to select design partners. The model ships with a 128,000-token context window and handles text only at launch.