New AI can draw pictures, inching closer to humanlike smarts

DALL-E’s offbeat images might not be perfect, but they demonstrate that AI is slowly gaining ground toward humanlike creativity.

OpenAI has just introduced two new machine learning algorithms that improve computer vision and can use text cues to draw unique and often offbeat images — like a dog-walking radish wearing a tutu.

Even though the technology is still a long way from replacing the discerning human eye, it shows that creative AI is gaining momentum.

What They Do

Like GPT-3 (the OpenAI text generator), DALL-E, a neural network model, aims to “think” like humans.

DALL-E’s training uses images and associated text prompts. Then, based on what it’s learned, it responds to a text prompt like “an armchair in the shape of an avocado.”

Instead of responding with words, the AI responds by creating hundreds of pictures. Then CLIP (another new neural network) ranks them to find the best few dozen. And, surprisingly, the images often appear genuine, as if a human made them.
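The generate-then-rank step described above can be sketched in a few lines of Python. This is only a toy illustration, not OpenAI's code: the random vectors stand in for real image and text embeddings, and the names `cosine` and `rerank` are hypothetical. It assumes that ranking amounts to scoring each candidate against the prompt and keeping the top few dozen.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(prompt_vec, candidates, keep):
    """Return the `keep` candidates most similar to the prompt, best first."""
    ranked = sorted(candidates, key=lambda c: cosine(prompt_vec, c), reverse=True)
    return ranked[:keep]


# Stand-ins for real embeddings (assumption: a real system would get these
# from trained image and text encoders, not from a random number generator).
rng = random.Random(0)
prompt_vec = [rng.gauss(0, 1) for _ in range(8)]
candidates = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]

# "Hundreds" of generated candidates go in, the best few dozen come out.
best = rerank(prompt_vec, candidates, keep=32)
```

The point of the design is that the generator does not have to be right every time. It only has to produce at least a few good candidates, and a separate scoring model picks them out.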

For example, a prompt that says “storefront that has the word openai written on it,” will generate an image like this:

Or “an armchair in the shape of an avocado” will prompt the following image:

“Last year, we were able to make substantial progress on text with GPT-3, but the thing is that the world isn’t just built on text,” Ilya Sutskever, OpenAI co-founder and chief scientist, told Axios. “This is a step towards the grander goal of building a neural network that can work in both images and text.”

Why It Matters

These new models are the next step toward machine learning algorithms that can carry out tasks with real-world value while hinting, sort of, at general human intelligence.

But it’s more than just a whimsical way to make cute pictures, like “an illustration of a baby daikon radish in a tutu walking a dog.” The machine learning algorithm’s advantage is efficiency.

Training a new model can take a lot of computing power, but Sutskever, according to Axios, says that CLIP improves existing computer vision techniques with less computational cost.

So, Are AIs Going to Take Over?

GPT-3 is the third generation of autocomplete tools designed by OpenAI. It looks for patterns in large amounts of data and then predicts what words should come after a text prompt. For example, if you input “fire,” it might add “truck” or “alarm.”
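To make the autocomplete idea concrete, here is a deliberately tiny sketch. It is not how GPT-3 works internally (GPT-3 is a very large neural network), and the one-sentence corpus and the function names `train_bigrams` and `predict_next` are made up for illustration. It simply counts which word follows which in some text and suggests the most frequent followers.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word, k=2):
    """Return up to k words that most often followed `word` in training."""
    counts = following.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# Hypothetical training text; a real model learns from vastly more data.
corpus = "the fire truck raced past the fire alarm while the fire truck siren blared"
model = train_bigrams(corpus)

suggestions = predict_next(model, "fire")
print(suggestions)  # ['truck', 'alarm']
```

Here “truck” ranks first because it follows “fire” twice in the sample text, while “alarm” follows it once. A word the model has never seen yields an empty list.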

But OpenAI claims that GPT-3 can do more than that — even write full essays or poems.

When GPT-3 was first released in June 2020, the media was buzzing about its capabilities. GPT-3 could be an important step toward a future where AI can exhibit a human-like ability to reason. But it also attracted criticism because the text it generated sometimes appeared to be unhinged from reality.

While DALL-E is one small step for artificial intelligence, inching toward something like human creativity, it is still far from perfect. It still depends on carefully worded input, as poorly worded phrases result in fumbled pictures.

