“DALL-E 2 of biology” designs proteins for new drugs

"Now that we have this ability, the possibilities of what we can produce are endless."

The recent release of powerful text-to-image AIs like DALL-E 2 has given anyone the ability to generate photorealistic images based on nothing but short text prompts. 

Now, the same AI technique is being used to generate complex, never-seen-before proteins on-demand — and these “programmable proteins” could one day be used to treat countless medical conditions.

Proteins 101: Proteins are hugely important to life — these molecules give our cells shape, power the immune system, help build and repair our tissues, transport oxygen throughout the body, and so much more.

Each protein is made up of a long string of chemical compounds called “amino acids.” Twenty types of amino acids can be found in proteins, and a single protein chain can be thousands of amino acids long.

The amino acids in a protein cause it to fold into a complex three-dimensional structure. That structure determines the protein’s function, so knowing the structure gives us a better idea of what the protein does and how it works, and that can have huge implications for medicine.

Our understanding of the structure of the coronavirus’ spike protein, for example, was key to the development of COVID-19 vaccines. Monoclonal antibodies, meanwhile, are clones of proteins that we make in the lab; they’re used to treat infections, cancers, Alzheimer’s, and more.

AI advances: There are a mind-boggling number of possible protein structures — one estimate puts the number at a googol cubed, or 1 followed by 300 zeroes — and traditionally, the process of identifying a single protein’s structure has been expensive and time-consuming. 
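To get a feel for these numbers, here is a small Python sketch. It checks that a googol cubed is 1 followed by 300 zeroes, and it counts a related but different quantity: how many distinct amino acid sequences of a given length are possible with 20 building blocks. The function names are made up for illustration, and counting sequences is not the same thing as the structure estimate quoted above.

```python
# Toy arithmetic for illustration; the helper names are hypothetical.
GOOGOL = 10 ** 100
structures_estimate = GOOGOL ** 3  # the "googol cubed" figure quoted in the article


def count_sequences(length: int, alphabet_size: int = 20) -> int:
    """Count distinct chains of `length` residues drawn from `alphabet_size` amino acids."""
    return alphabet_size ** length


def shortest_chain_exceeding(limit: int, alphabet_size: int = 20) -> int:
    """Return the smallest chain length whose sequence count is larger than `limit`."""
    length = 1
    while count_sequences(length, alphabet_size) <= limit:
        length += 1
    return length


if __name__ == "__main__":
    # A googol cubed is 1 followed by 300 zeroes.
    print(structures_estimate == 10 ** 300)
    # Sequence counts grow quickly with chain length.
    for length in (10, 100, 1000):
        print(length, "residues:", len(str(count_sequences(length))), "digits")
    print(shortest_chain_exceeding(structures_estimate))
```

Under this simple count, a chain needs only 231 amino acids before the number of possible sequences passes a googol cubed, which is far shorter than the thousand-plus residues some proteins contain.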

That changed with the development of AlphaFold, an AI that can accurately predict how a protein will fold based on its sequence of amino acids. AlphaFold was a huge boon for research, giving scientists access to predicted structures for roughly 200 million known proteins.

But now, Boston-based startup Generate Biomedicines is further advancing our understanding and use of proteins by training an AI called “Chroma” to create proteins with structures no one has ever seen before.

“We believe our model will have revolutionary implications,” said Gevorg Grigoryan, Generate’s co-founder and CTO. “It is akin to learning how to ‘write’ in the mysterious language of proteins. Now that we have this ability, the possibilities of what we can produce are endless.”

Examples of proteins generated by Chroma. Credit: Generate Biomedicines

How it works: Generate described Chroma to MIT Technology Review as the “DALL-E 2 of biology,” and as is the case with the text-to-image AI, the generation process starts with a user submitting a request — they might ask for a protein with a certain size, shape, or function, for example. 

The AI will then use the same technique utilized by DALL-E 2 — diffusion modeling — to generate a protein that contains the right amino acids folded in the right way to meet the constraints of the prompt.
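For readers curious about what diffusion modeling looks like in practice, here is a minimal Python sketch of the sampling loop used by denoising diffusion models in general. It is not Chroma’s code or architecture. It starts from random noise and removes a little noise at each step until a set of coordinates emerges. In a real system a trained neural network predicts the noise to remove; this toy replaces the network with an “oracle” that already knows the target shape, and the schedule values, function names, and target coordinates are all assumptions chosen for illustration.

```python
import math
import random


def make_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule: per-step betas and cumulative products of (1 - beta)."""
    betas = [beta_start + (beta_end - beta_start) * i / (steps - 1) for i in range(steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def oracle_noise(x_t, target, alpha_bar):
    """Hypothetical stand-in for a trained network: the exact noise separating x_t from target."""
    scale = math.sqrt(1.0 - alpha_bar)
    return [(x - math.sqrt(alpha_bar) * x0) / scale for x, x0 in zip(x_t, target)]


def sample(target, steps=200, seed=0):
    """Run the reverse (denoising) process from pure noise down to step 0."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure Gaussian noise
    for t in reversed(range(steps)):
        beta, alpha_bar = betas[t], alpha_bars[t]
        prev_bar = alpha_bars[t - 1] if t > 0 else 1.0
        eps = oracle_noise(x, target, alpha_bar)
        # Mean of the denoising step: subtract the predicted noise, then rescale.
        coef = beta / math.sqrt(1.0 - alpha_bar)
        mean = [(xi - coef * ei) / math.sqrt(1.0 - beta) for xi, ei in zip(x, eps)]
        # Posterior standard deviation; it is zero at the final step (t == 0).
        sigma = math.sqrt(beta * (1.0 - prev_bar) / (1.0 - alpha_bar))
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


if __name__ == "__main__":
    # Toy "shape": four 2D points flattened into one list (made-up values).
    target_shape = [0.0, 0.0, 1.0, 0.5, 2.0, 0.0, 3.0, 0.5]
    print([round(v, 3) for v in sample(target_shape)])
```

Because the oracle knows the answer, the loop simply recovers the target coordinates. The point is the overall pattern, where structure is carved out of noise step by step. A trained model would replace `oracle_noise` with a prediction conditioned on the user’s request.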

In a paper now available as a preprint, the Generate team showed how Chroma could be used to design proteins in the shapes of all 26 letters of the Latin alphabet and the numerals 0 through 9.

Proteins designed to match the shapes of letters and numbers. Credit: Generate Biomedicines

They also demonstrated how the system can be used to generate giant proteins with thousands of amino acids and “complexes” containing multiple proteins of different shapes.

A protein containing 2,000 amino acids (left) and a complex containing multiple proteins (right). Credit: Generate Biomedicines

The big picture: Just like DALL-E 2 wasn’t the first text-to-image AI, Generate’s Chroma isn’t the first AI designed to generate new proteins, but it is trained on more data than past efforts and gives researchers more control over the type of protein produced.

“It may be fair to say that this is more like DALL-E because of how they’ve scaled things up,” Namrata Anand, who shared a paper in May 2022 detailing a protein-generating AI she’d co-developed, told MIT Technology Review.

Designing new proteins is just the first step to revolutionizing healthcare, though.

The Generate team is now focusing on recreating some of their AI’s designs in the lab. After that will come the lengthy process of developing therapies using the novel proteins and then testing them in animals and humans.

“We’re a drug company,” Grigoryan told MIT Technology Review. “At the end of the day what matters is whether we can make medicines that work or not.”
