AI creates realistic pictures from pure text

The system makes it faster and easier to create photorealistic AI art.

Graphics processing unit maker NVIDIA has debuted a new way to create AI art. The program, called GauGAN2, can create photorealistic images using a text interface — in other words, type what you want to see and the software generates a picture of it.

“The deep learning model behind GauGAN allows anyone to channel their imagination into photorealistic masterpieces — and it’s easier than ever,” NVIDIA’s Isha Salian wrote in a blog post.

Generating AI art: The system uses deep learning to power its AI art algorithm. 

Deep learning is a specific form of machine learning (one in which an AI “learns” from large amounts of data) that is loosely modeled after the human brain.

Much like how your brain uses groups of neurons working in unison to puzzle through problems and generate thoughts, a deep learning AI uses what are called “neural nets” to perform some specific function. Deep learning is especially good at recognizing images, or creating them.
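To picture what a neural net looks like under the hood, here is a minimal sketch written in PyTorch purely for illustration. It assumes nothing about GauGAN2’s actual architecture, which NVIDIA has not detailed here; it only shows the stacked layers of “neurons” that make a network “deep.”

import torch
import torch.nn as nn

# Each nn.Linear layer is a group of artificial "neurons"; stacking several
# layers is what makes a network "deep."
tiny_net = nn.Sequential(
    nn.Linear(64, 128),   # input features -> hidden layer
    nn.ReLU(),            # non-linearity lets the net capture complex patterns
    nn.Linear(128, 10),   # hidden layer -> 10 outputs
)

# "Learning" means adjusting the layer weights so outputs match the training data.
fake_inputs = torch.randn(8, 64)      # a batch of 8 made-up examples
outputs = tiny_net(fake_inputs)       # forward pass through every layer
print(outputs.shape)                  # torch.Size([8, 10])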

Text to art: NVIDIA’s AI can turn ordinary text into images, which can then be edited or filled out with more details. 

“Simply type a phrase like ‘sunset at a beach’ and AI generates the scene in real time,” Salian wrote. Adding adjectives like “rocky” and “rainy” will cause GauGAN2 to modify the AI art instantly.

GauGAN2 will create a map of the elements in the scene (rocks, sun, clouds, sand, water), each of which can then be modified and edited by you, either with further text or a hands-on, Photoshop-like editor. This could allow you to take a realistic desert scene and, by popping an extra sun up in the sky, create a landscape shot of Tatooine (Salian’s example).
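Under the hood, a segmentation map is essentially a grid in which every cell carries a label such as “sky,” “sun,” “sand,” or “water.” The toy Python sketch below, written with NumPy rather than NVIDIA’s own editor or file format (which this article does not describe), shows how “editing” a region, like adding that second sun, comes down to relabeling cells.

import numpy as np

# Toy segmentation map: each cell holds a label rather than a color.
LABELS = {"sky": 0, "sun": 1, "sand": 2, "water": 3}

seg_map = np.full((6, 12), LABELS["sky"])   # sky everywhere to start
seg_map[4:, :] = LABELS["sand"]             # sand across the bottom rows
seg_map[1, 2] = LABELS["sun"]               # the original sun

# Editing the scene is just relabeling cells: a second sun for the Tatooine trick.
seg_map[1, 8] = LABELS["sun"]

# A generator like GauGAN2 would render this label grid as a photorealistic image;
# here we simply print the grid to show the structure.
print(seg_map)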

The frontiers of AI art: As The Next Web notes, GauGAN2 currently works best with simple descriptions of nature. 

Put in something a bit more complicated, as Tiernan Ray over at ZDNet did, and the end results are abstract fever dreamscapes filled with Dali-esque amoebas (more a feature than a bug for AI art, in my opinion).

GauGAN2 is the second iteration of an AI originally released in 2019. The first GauGAN used segmentation mapping to help users create AI art: you could sketch out a landscape piecemeal in simple shapes, much as you would in MS Paint, and GauGAN would fill in your segments with photorealistic imagery, Ray explains.

NVIDIA says GauGAN2 is the first AI of its kind to be able to interpret commands using multiple methods, or modalities. 

“This makes it faster and easier to turn an artist’s vision into a high-quality AI-generated image,” Salian wrote.
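To illustrate what “multiple modalities” could mean in practice, here is a hypothetical Python sketch. None of the names below come from NVIDIA; they only show the idea of bundling a text prompt, a segmentation map, and a rough sketch into a single generation request.

from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class SceneRequest:
    prompt: str                           # text modality, e.g. "sunset at a rocky beach"
    seg_map: Optional[np.ndarray] = None  # label grid marking sky, rocks, water...
    sketch: Optional[np.ndarray] = None   # rough drawing, MS Paint style

def generate_image(request: SceneRequest) -> np.ndarray:
    """Stand-in for a generator conditioned on whichever modalities are provided."""
    # A real model would fuse every provided input; this stub just returns a blank canvas.
    return np.zeros((256, 256, 3), dtype=np.uint8)

image = generate_image(SceneRequest(prompt="sunset at a rocky, rainy beach"))
print(image.shape)  # (256, 256, 3)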
