New AI can draw pictures, inching closer to humanlike smarts

DALL-E’s offbeat images might not be perfect, but they demonstrate that AI is slowly gaining ground toward humanlike creativity.

OpenAI has just introduced two new machine learning algorithms that improve computer vision and can use text cues to draw unique and often offbeat images — like a dog-walking radish wearing a tutu.

Even though the technology is still a long way from replacing the discerning human eye, it shows that creative AI is gaining momentum.

What They Do

Like GPT-3 (the OpenAI text generator), DALL-E, a neural network model, aims to “think” like humans.

DALL-E’s training uses images and associated text prompts. Then, based on what it’s learned, it responds to a text prompt like “an armchair in the shape of an avocado.”

Instead of responding with words, the AI responds by creating hundreds of pictures. Then CLIP (another new neural network) ranks them to find the best few dozen. And, surprisingly, the images often appear genuine, as if a human made them.
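The generate-then-rank step described above can be sketched in a few lines of Python. This is a toy illustration, not OpenAI's implementation: the candidates are short text descriptions standing in for generated images, and `word_overlap` is a hypothetical stand-in for a learned scorer such as CLIP.

```python
from typing import Callable, List


def rerank(candidates: List[str], prompt: str,
           score: Callable[[str, str], float], keep: int) -> List[str]:
    """Return the `keep` candidates that score highest against the prompt."""
    ranked = sorted(candidates, key=lambda c: score(prompt, c), reverse=True)
    return ranked[:keep]


def word_overlap(prompt: str, caption: str) -> float:
    """Toy scorer (assumption): fraction of prompt words found in the caption."""
    prompt_words = set(prompt.lower().split())
    caption_words = set(caption.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & caption_words) / len(prompt_words)


prompt = "an armchair in the shape of an avocado"
# Text descriptions standing in for generated images (hypothetical data).
candidates = [
    "a wooden chair",
    "an armchair in the shape of an avocado",
    "an avocado on a plate",
    "a green armchair shaped like an avocado",
]
best = rerank(candidates, prompt, word_overlap, keep=2)
print(best)
```

A real system would score actual images against the prompt with a trained model, but the shape of the step is the same: produce many candidates, score each one, keep the top few.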

For example, a prompt that says “storefront that has the word openai written on it” will generate an image like this:

Or “an armchair in the shape of an avocado” will prompt the following image:

“Last year, we were able to make substantial progress on text with GPT-3, but the thing is that the world isn’t just built on text,” Ilya Sutskever, OpenAI co-founder and chief scientist, told Axios. “This is a step towards the grander goal of building a neural network that can work in both images and text.”

Why It Matters

These new models are the next step toward achieving machine learning algorithms that can carry out tasks that have real-world value, while promising to show general human intelligence — sort of.

But it’s more than just a whimsical way to make cute pictures like “an illustration of a baby daikon radish in a tutu walking a dog.” The machine learning algorithm’s advantage is efficiency.

Training a new model can take a lot of computing power, but Sutskever, according to Axios, says that CLIP improves existing computer vision techniques with less computational cost.

So, Are AIs Going to Take Over?

GPT-3 is the third generation of autocomplete tools designed by OpenAI. It looks for patterns in large amounts of data and then predicts what words should come after a text prompt. For example, if you input “fire,” it might add “truck” or “alarm.”
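The idea of predicting the next word from patterns in data can be shown with a very small sketch. This is only an illustration under a strong simplifying assumption: it counts which word follows which in a tiny made-up corpus, whereas GPT-3 is a large neural network trained on far more text. The function names and the corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, how often every other word follows it."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word, k=2):
    """Return up to k words most often seen right after `word`."""
    counts = follows.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# Tiny made-up corpus (assumption, for illustration only).
corpus = "the fire truck arrived after the fire alarm rang and the fire truck left"
model = train_bigrams(corpus)
print(predict_next(model, "fire"))  # ['truck', 'alarm']
```

Here “truck” ranks first because it follows “fire” twice in the corpus, while “alarm” follows it once.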

But OpenAI claims that GPT-3 can do more than that — even write full essays or poems.

When GPT-3 was first released in June 2020, the media buzzed about its capabilities. GPT-3 could be an important step toward a future where AI can exhibit a human-like ability to reason. But it also attracted criticism because the text it generated sometimes appeared to be unhinged from reality.

While DALL-E is one small step for artificial intelligence, inching toward a likeness of human creativity, it is still far from perfect. It still needs input from a grammar expert, as poorly worded phrases result in fumbled pictures.
