GPT-4 scores in the top 1% of test-takers for creative thinking

The emergence of AI-based creativity – along with examples of both its promise and peril – is likely just beginning.

By Erik Guzik

October 5, 2023

By Erik Guzik

October 5, 2023

By Erik Guzik

October 5, 2023

Of all the forms of human intellect that one might expect artificial intelligence to emulate, few people would likely place creativity at the top of their list. Creativity is wonderfully mysterious – and frustratingly fleeting. It defines us as human beings – and seemingly defies the cold logic that lies behind the silicon curtain of machines.

Yet, the use of AI for creative endeavors is now growing.

New AI tools like DALL-E and Midjourney are increasingly part of creative production, and some have started to win awards for their creative output. The growing impact is both social and economic – as just one example, the potential of AI to generate new, creative content is a defining flashpoint behind the Hollywood writers strike.

And if our recent study into the striking originality of AI is any indication, the emergence of AI-based creativity – along with examples of both its promise and peril – is likely just beginning.

A blend of novelty and utiliy

When people are at their most creative, they’re responding to a need, goal or problem by generating something new – a product or solution that didn’t previously exist.

In this sense, creativity is an act of combining existing resources – ideas, materials, knowledge – in a novel way that’s useful or gratifying. Quite often, the result of creative thinking is also surprising, leading to something that the creator did not – and perhaps could not – foresee.

It might involve an invention, an unexpected punchline to a joke or a groundbreaking theory in physics. It might be a unique arrangement of notes, tempo, sounds and lyrics that results in a new song.

So, as a researcher of creative thinking, I immediately noticed something interesting about the content generated by the latest versions of AI, including GPT-4.

When prompted with tasks requiring creative thinking, the novelty and usefulness of GPT-4’s output reminded me of the creative types of ideas submitted by students and colleagues I had worked with as a teacher and entrepreneur.

The ideas were different and surprising, yet relevant and useful. And, when required, quite imaginative.

Consider the following prompt offered to GPT-4: “Suppose all children became giants for one day out of the week. What would happen?” The ideas generated by GPT-4 touched on culture, economics, psychology, politics, interpersonal communication, transportation, recreation and much more – many surprising and unique in terms of the novel connections generated.

This combination of novelty and utility is difficult to pull off, as most scientists, artists, writers, musicians, poets, chefs, founders, engineers and academics can attest.

Yet AI seemed to be doing it – and doing it well.

Putting AI to the test

With researchers in creativity and entrepreneurship Christian Byrge and Christian Gilde, I decided to put AI’s creative abilities to the test by having it take the Torrance Tests of Creative Thinking, or TTCT.

The TTCT prompts the test-taker to engage in the kinds of creativity required for real-life tasks: asking questions, how to be more resourceful or efficient, guessing cause and effect or improving a product. It might ask a test-taker to suggest ways to improve a children’s toy or imagine the consequences of a hypothetical situation, as the above example demonstrates.

The tests are not designed to measure historical creativity, which is what some researchers use to describe the transformative brilliance of figures like Mozart and Einstein. Rather, it assesses the general creative abilities of individuals, often referred to as psychological or personal creativity.

In addition to running the TTCT through GPT-4 eight times, we also administered the test to 24 of our undergraduate students.

All of the results were evaluated by trained reviewers at Scholastic Testing Service, a private testing company that provides scoring for the TTCT. They didn’t know in advance that some of the tests they’d be scoring had been completed by AI.

Since Scholastic Testing Service is a private company, it does not share its prompts with the public. This ensured that GPT-4 would not have been able to scrape the internet for past prompts and their responses. In addition, the company has a database of thousands of tests completed by college students and adults, providing a large, additional control group with which to compare AI scores.

Our results?

GPT-4 scored in the top 1% of test-takers for the originality of its ideas. From our research, we believe this marks one of the first examples of AI meeting or exceeding the human ability for original thinking.

In short, we believe that AI models like GPT-4 are capable of producing ideas that people see as unexpected, novel and unique. Other researchers are arriving at similar conclusions in their research of AI and creativity.

Yes, creativity can be evaluated

The emerging creative ability of AI is surprising for a number of reasons.

For one, many outside of the research community continue to believe that creativity cannot be defined, let alone scored. Yet products of human novelty and ingenuity have been prized – and bought and sold – for thousands of years. And creative work has been defined and scored in fields like psychology since at least the 1950s.

The person, product, process, press model of creativity, which researcher Mel Rhodes introduced in 1961, was an attempt to categorize the myriad ways in which creativity had been understood and evaluated until that point. Since then, the understanding of creativity has only grown.

Still others are surprised that the term “creativity” might be applied to nonhuman entities like computers. On this point, we tend to agree with cognitive scientist Margaret Boden, who has argued that the question of whether the term creativity should be applied to AI is a philosophical rather than scientific question.

AI’s founders foresaw its creative abilities

It’s worth noting that we studied only the output of AI in our research. We didn’t study its creative process, which is likely very different from human thinking processes, or the environment in which the ideas were generated. And had we defined creativity as requiring a human person, then we would have had to conclude, by definition, that AI cannot possibly be creative.

But regardless of the debate over definitions of creativity and the creative process, the products generated by the latest versions of AI are novel and useful. We believe this satisfies the definition of creativity that is now dominant in the fields of psychology and science.

Furthermore, the creative abilities of AI’s current iterations are not entirely unexpected.

In their now famous proposal for the 1956 Dartmouth Summer Research Project on Artificial Intelligence, the founders of AI highlighted their desire to simulate “every aspect of learning or any other feature of intelligence” – including creativity.

In this same proposal, computer scientist Nathaniel Rochester revealed his motivation: “How can I make a machine which will exhibit originality in its solution of problems?”

Apparently, AI’s founders believed that creativity, including the originality of ideas, was among the specific forms of human intelligence that machines could emulate.

To me, the surprising creativity scores of GPT-4 and other AI models highlight a more pressing concern: Within U.S. schools, very few official programs and curricula have been implemented to date that specifically target human creativity and cultivate its development.

In this sense, the creative abilities now realized by AI may provide a “Sputnik moment” for educators and others interested in furthering human creative abilities, including those who see creativity as an essential condition of individual, social and economic growth.

This article is republished from The Conversation under a Creative Commons license. Read the original article.