Google’s new Gemini AI beats GPT-4 in 30 of 32 tests

But will the difference be enough to matter in real life?

Tech giant Google has finally unveiled its much-hyped Gemini AI, a series of generative AI models it claims are its “largest and most capable” to date. 

“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” said Google CEO Sundar Pichai. 

Multimodal AI: Generative AIs are algorithms trained to create original content in response to user prompts. OpenAI’s first iteration of ChatGPT, for example, can understand and produce human-like text, while its DALL-E 2 system can generate images based on text prompts. 

While those systems understand and generate just one type of content, a multimodal generative AI can work with several — in September, OpenAI announced a multimodal version of ChatGPT that could understand image, voice, and text inputs.

The Gemini era: According to Google, multimodal AIs are traditionally created by combining separate, specialized models into one program, but it took a different approach with its Gemini AI, training it to be multimodal from the start.

“This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state-of-the-art in nearly every domain,” wrote Demis Hassabis, CEO and cofounder of Google DeepMind.

In addition to being highly capable, Google says the Gemini AI is also its “most flexible” model. This has allowed the company to create three different sizes of the AI: Ultra, Nano, and Pro. 

  • Gemini Ultra is the most powerful model, designed for complex tasks. According to Google, it’s the first generative AI model to outperform human experts on the MMLU, a benchmark assessing knowledge across 57 subjects. Google is currently soliciting feedback on Ultra from select users, but expects to make it widely available in 2024.
  • Gemini Nano is the least capable model, but it’s small and efficient enough to run locally on smartphones. Google has already made it available on its Pixel 8 Pro — owners of that smartphone can use the AI to summarize audio recordings or generate responses to WhatsApp messages.
  • Gemini Pro, meanwhile, falls between Nano and Ultra in terms of capabilities and size. Google has integrated an English-language version of that model into its ChatGPT-like Bard, which will reportedly get an Ultra upgrade in 2024.

The big picture: Like the rest of the tech industry, Google has been racing to catch up with OpenAI in the generative AI space ever since the release of ChatGPT in 2022, and it’s been hyping the Gemini AI for months as the tech that will put it ahead. 

While Gemini did outperform OpenAI’s GPT-4 on 30 of 32 benchmarks tested (including the MMLU), the difference was often just a percentage point or two. Google may be ahead, but only by a little, and only compared to an AI model that has already been out for nine months.

“It’s clear that Gemini is a very sophisticated AI system … [but] it’s not obvious to me that Gemini is actually substantially more capable than GPT-4,” Melanie Mitchell, an AI researcher at the Santa Fe Institute in New Mexico, told MIT Technology Review.

