OpenAI’s GPT-4 outperforms doctors in another new study

It may "know" more about treating eye problems than your own GP.

OpenAI’s most powerful AI model outperformed junior doctors in deciding how to treat patients with eye problems and came close to scoring as high as expert ophthalmologists — at least on this test.

The challenge: When doctors are in med school, they rotate across clinical areas, spending time with specialists in surgery, psychiatry, ophthalmology, and more to ensure they’ll have a basic knowledge of all the subjects by the time they get their medical license.

Those who become general practitioners (GPs), though, may rarely use what they learned in some of those specialties or about treating less common conditions.

The idea: Researchers at the University of Cambridge were curious to see whether large language models (LLMs) — AIs that can understand and generate conversational text — could help GPs treat patients with eye problems, something they might not be handling on a day-to-day basis.

For a study published in PLOS Digital Health, they presented GPT-4 — the LLM powering OpenAI’s ChatGPT Plus — with 87 scenarios of patients with a range of eye problems and asked it to choose the best diagnosis or treatment from four options.

They also gave the test to expert ophthalmologists, trainees working to become ophthalmologists, and unspecialized junior doctors, who have about as much knowledge of eye problems as general practitioners.

GPT-4 significantly outperformed the junior doctors, scoring 69% on the test compared to their median score of 43%. It also scored higher than the trainees, who had a median score of 59%, and was pretty close to the median score of the expert ophthalmologists: 76%.

“What this work shows is that the knowledge and reasoning ability of these large language models in an eye health context is now almost indistinguishable from experts,” lead author Arun Thirunavukarasu told the Financial Times.

Looking ahead: The Cambridge team doesn’t think LLMs will replace doctors, but it does envision the systems being integrated into clinical workflows. A GP who is having trouble reaching a specialist for advice on treating something they haven’t seen in a while (or ever) could query an AI, for example.

“The most important thing is to empower patients to decide whether they want computer systems to be involved or not,” said Thirunavukarasu. “That will be an individual decision for each patient to make.”
