Before launching, GPT-4o broke records on chatbot leaderboard under a secret name

On Monday, OpenAI employee William Fedus confirmed on X that a mysterious chart-topping AI chatbot known as “gpt-chatbot” that had been undergoing testing on LMSYS’s Chatbot Arena and frustrating experts was, in fact, OpenAI’s newly announced GPT-4o AI model. He also revealed that GPT-4o had topped the Chatbot Arena leaderboard, achieving the highest documented score ever.

“GPT-4o is our new state-of-the-art frontier model. We’ve been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot,” Fedus tweeted.

Chatbot Arena is a website where visitors converse with two random AI language models side by side without knowing which model is which, then choose which model gives the better response. It’s a perfect example of vibe-based AI benchmarking, as AI researcher Simon Willison calls it.
