OpenAI’s GPT-4 outperforms doctors in another new study

It may "know" more about treating eye problems than your own GP.

OpenAI’s most powerful AI model outperformed junior doctors in deciding how to treat patients with eye problems and came close to scoring as high as expert ophthalmologists — at least on this test.

The challenge: When doctors are in med school, they rotate across clinical areas, spending time with specialists in surgery, psychiatry, ophthalmology, and more to ensure they’ll have a basic knowledge of all the subjects by the time they get their medical license.

If they become a general practitioner (GP), though, they may rarely use the info they learned in some of those specialties and in treating less common conditions.

GPT-4 significantly outperformed the junior doctors, scoring 69% compared to their 43%.

The idea: Researchers at the University of Cambridge were curious to see whether large language models (LLMs) — AIs that can understand and generate conversational text — could help GPs treat patients with eye problems, something they might not be handling on a day-to-day basis.

For a study published in PLOS Digital Health, they presented GPT-4 — the LLM powering OpenAI’s ChatGPT Plus — with 87 scenarios of patients with a range of eye problems and asked it to choose the best diagnosis or treatment from four options.

They also gave the test to expert ophthalmologists, trainees working to become ophthalmologists, and unspecialized junior doctors, who have about as much knowledge of eye problems as general practitioners.

“The most important thing is to empower patients to decide whether they want computer systems to be involved or not.”

Arun Thirunavukarasu

GPT-4 significantly outperformed the junior doctors, scoring 69% on the test compared to their median score of 43%. It also scored higher than the trainees, who had a median score of 59%, and was pretty close to the median score of the expert ophthalmologists: 76%.

“What this work shows is that the knowledge and reasoning ability of these large language models in an eye health context is now almost indistinguishable from experts,” lead author Arun Thirunavukarasu told the Financial Times.

Looking ahead: The Cambridge team doesn’t think LLMs will replace doctors, but they do envision the systems being integrated into clinical workflows — a GP who is having trouble getting in touch with a specialist for advice on how to treat something they haven’t seen in a while (or ever) could query an AI, for example.

“The most important thing is to empower patients to decide whether they want computer systems to be involved or not,” said Thirunavukarasu. “That will be an individual decision for each patient to make.”

We’d love to hear from you! If you have a comment about this article or if you have a tip for a future Freethink story, please email us at [email protected].

Why ChatGPT feels more “intelligent” than Google Search
There will be a moment, coming soon, when AI makes the leap from tool to entity.
Are weight-loss meds the next wonder drugs?
Evidence is mounting that GLP-1 agonists could treat many health issues — including ones that aren’t obviously related to weight.
New AI generates CRISPR proteins unlike any seen in nature
An AI that generates CRISPR proteins is opening the door to gene editors with capabilities beyond what we’ve found in nature.
How Brilliant Labs CEO is creating a “symbiosis of humanity and artificial intelligence”
CEO Bobak Tavangar discusses the philosophy behind Brilliant’s latest device, Frame, and his vision for the future of AI.
“Bionic eye” discovers Plato’s final resting place
Plato’s final resting place has been identified thanks to a “bionic eye” built to read the Herculaneum scrolls.
Up Next
A view of an orange and blue jet in flight, with desert terrain visible in the background.
Subscribe to Freethink for more great stories