AI creates realistic pictures from pure text

The system makes it faster and easier to create photorealistic AI art.

Graphics processing unit maker NVIDIA has debuted a new way to create AI art. The program, called GauGAN2, can create photorealistic images using a text interface — in other words, type what you want to see and the software generates a picture of it.

“The deep learning model behind GauGAN allows anyone to channel their imagination into photorealistic masterpieces — and it’s easier than ever,” NVIDIA’s Isha Salian wrote in a blog post.

Generating AI art: The system uses deep learning to power its AI art algorithm. 

Deep learning is a form of machine learning, the approach in which an AI “learns” from large amounts of data, and its design is loosely modeled on the human brain.

Much like how your brain uses groups of neurons working in unison to puzzle through problems and generate thoughts, a deep learning AI uses what are called “neural nets” to perform some specific function. Deep learning is especially good at recognizing what is in images, or creating them.
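To make the "neurons working in unison" idea a little more concrete, here is a minimal sketch of a toy neural net in plain Python. It is only an illustration: the function names, the weights, and the input values are all hypothetical, and it has nothing to do with the far larger model behind GauGAN2.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, squashed into (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all look at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical, hand-picked numbers; a real network learns its weights from data.
pixels = [0.2, 0.9, 0.4]
hidden = layer(pixels, [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]], [0.0, 0.1])
output = layer(hidden, [[1.2, -0.7]], [0.05])
print(output)
```

Each layer feeds its results to the next one, and "learning" means nudging the weights until the final output is useful.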

Text to art: NVIDIA’s AI can turn ordinary text into images, which can then be edited or filled out with more details. 

“Simply type a phrase like ‘sunset at a beach’ and AI generates the scene in real time,” Salian wrote. Adding adjectives like “rocky” and “rainy” will cause GauGAN2 to modify the AI art instantly.

GauGAN2 will create a map of the elements in the scene (rocks, sun, clouds, sand, water), each of which you can then modify, either with further text or with a hands-on, Photoshop-like editor. This could allow you to take a realistic desert scene and, by popping an extra sun up in the sky, create a landscape shot of Tatooine (Salian’s example).

Credit: Annelisa Leinbach

The frontiers of AI art: As The Next Web notes, GauGAN2 currently works best with simple descriptions of nature. 

Put in something a bit more complicated, like Tiernan Ray over at ZDNet did, and the end results are abstracted fever dreamscapes filled with Dali-esque amoebas (more a feature for AI art than a bug, in my opinion).

GauGAN2 is the second iteration of an AI originally released in 2019. The first GauGAN used segmentation mapping to help users create AI art. You could create a landscape piecemeal by drawing it in simple ways, like drawing in MS Paint, and GauGAN would fill in your segments with photoreal images, Ray explains.
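For readers curious what a segmentation map looks like as data, here is a rough sketch in Python. It assumes the map is simply a grid in which each cell holds a label for what belongs there. The label names, grid size, and helper functions are hypothetical, and the sketch covers only the "drawing" half. The step where a trained model fills each segment with photoreal imagery is not shown.

```python
# Hypothetical label set, loosely based on the scene elements named in the article.
LABELS = {0: "sky", 1: "sun", 2: "sand", 3: "water"}

def blank_map(width, height, label=0):
    """Create a grid in which every cell starts with the same label."""
    return [[label] * width for _ in range(height)]

def paint_rect(seg_map, label, top, left, bottom, right):
    """Assign `label` to rows top..bottom-1 and columns left..right-1."""
    for row in range(top, bottom):
        for col in range(left, right):
            seg_map[row][col] = label

def label_counts(seg_map):
    """Count how many cells carry each label name."""
    counts = {}
    for row in seg_map:
        for label in row:
            name = LABELS[label]
            counts[name] = counts.get(name, 0) + 1
    return counts

scene = blank_map(8, 6)            # 8 cells wide, 6 tall, all sky
paint_rect(scene, 2, 4, 0, 6, 8)   # sand across the bottom two rows
paint_rect(scene, 1, 1, 5, 2, 6)   # one sun
paint_rect(scene, 1, 1, 2, 2, 3)   # a second sun, Tatooine-style
counts = label_counts(scene)
print(counts)
```

Editing the scene then amounts to repainting labels in the grid, which is roughly the MS Paint-style interaction Ray describes.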

NVIDIA says GauGAN2 is the first AI of its kind to be able to interpret commands using multiple methods, or modalities. 

“This makes it faster and easier to turn an artist’s vision into a high-quality AI-generated image,” Salian wrote.
