“DALL-E 2 of biology” designs proteins for new drugs

"Now that we have this ability, the possibilities of what we can produce are endless."

The recent release of powerful text-to-image AIs like DALL-E 2 has given anyone the ability to generate photorealistic images based on nothing but short text prompts. 

Now, the same AI technique is being used to generate complex, never-before-seen proteins on demand, and these “programmable proteins” could one day be used to treat countless medical conditions.

Proteins 101: Proteins are hugely important to life — these molecules give our cells shape, power the immune system, help build and repair our tissues, transport oxygen throughout the body, and so much more.

Each protein is made up of a long string of chemical compounds called “amino acids.” Twenty types of amino acids can be found in proteins, and a single protein chain can be thousands of amino acids long.
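
To get a feel for those numbers, here is a minimal Python sketch that counts how many distinct amino acid sequences (not folded structures) are possible at a few chain lengths. It assumes only the figure of 20 amino acid types given above; the function names are made up for illustration.

```python
# Count the possible amino acid sequences of a given chain length.
AMINO_ACID_TYPES = 20  # figure taken from the article


def sequence_count(length: int) -> int:
    """Number of distinct chains of `length` residues drawn from 20 types."""
    return AMINO_ACID_TYPES ** length


def decimal_digits(n: int) -> int:
    """How many decimal digits it takes to write n out in full."""
    return len(str(n))


for length in (10, 100, 1000):
    digits = decimal_digits(sequence_count(length))
    print(f"{length} amino acids: a number with {digits} digits")
```

Even a short chain of 100 amino acids has more possible sequences than can be written in 130 digits, which is why designing proteins by trial and error is not practical.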

The amino acids in a protein cause it to fold into a complex three-dimensional structure. A protein’s structure determines its function, so if we know the structure, we have a better idea of what the protein does and how it works, and that can have huge implications for medicine.

Our understanding of the structure of the coronavirus’ spike protein, for example, was key to the development of COVID-19 vaccines. Monoclonal antibodies, meanwhile, are clones of proteins that we make in the lab; they’re used to treat infections, cancers, Alzheimer’s, and more.

AI advances: There are a mind-boggling number of possible protein structures — one estimate puts the number at a googol cubed, or 1 followed by 300 zeroes — and traditionally, the process of identifying a single protein’s structure has been expensive and time-consuming. 
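
The arithmetic behind that estimate is easy to check. A googol is 10 to the power of 100, so cubing it gives 10 to the power of 300. The short Python snippet below confirms the count of zeroes; the variable names are illustrative only.

```python
# A googol is 1 followed by 100 zeroes.
GOOGOL = 10 ** 100

# Cubing multiplies the exponent by three: (10**100)**3 == 10**300.
googol_cubed = GOOGOL ** 3

written_out = str(googol_cubed)
zero_count = len(written_out) - 1  # everything after the leading "1"

print(zero_count)  # 300
```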

That changed with the development of AlphaFold, an AI that can accurately predict how a protein will fold based on its sequence of amino acids. AlphaFold was a huge boon for research, giving scientists access to predicted structures for roughly 200 million known proteins.

But now, Boston-based startup Generate Biomedicines is further advancing our understanding and use of proteins by training an AI called “Chroma” to create proteins with structures no one has ever seen before.

“We believe our model will have revolutionary implications,” said Gevorg Grigoryan, Generate’s co-founder and CTO. “It is akin to learning how to ‘write’ in the mysterious language of proteins. Now that we have this ability, the possibilities of what we can produce are endless.”

Examples of proteins generated by Chroma. Credit: Generate Biomedicines

How it works: Generate described Chroma to MIT Technology Review as the “DALL-E 2 of biology,” and as is the case with the text-to-image AI, the generation process starts with a user submitting a request — they might ask for a protein with a certain size, shape, or function, for example. 

The AI will then use diffusion modeling, the same technique used by DALL-E 2, to generate a protein that contains the right amino acids folded in the right way to meet the constraints of the prompt.
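
For readers curious about what diffusion modeling looks like in practice, below is a toy Python sketch of the general idea. It is not Chroma's code and makes several simplifying assumptions: the "data" are single numbers drawn from a bell curve rather than proteins, there is no prompt or conditioning, and the trained neural network that would normally predict the noise is replaced by an exact formula that only works for this simple case. All names and settings (`MU`, `SIGMA`, `T`, the noise schedule) are illustrative choices.

```python
import math
import random

# Toy target: each "data point" is one number drawn from N(MU, SIGMA^2).
# A real protein model would work on a far richer representation.
MU, SIGMA = 3.0, 0.5
T = 500  # number of noising steps (illustrative choice)

# Linear noise schedule and the running product of (1 - beta).
betas = [1e-4 + (0.05 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1.0 - b for b in betas]
alpha_bars = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)


def forward_noise(x0, t, rng):
    """Forward process: jump straight to noising step t from clean data x0."""
    ab = alpha_bars[t]
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)


def predict_noise(x_t, t):
    """Stand-in for the trained network: the exact expected noise given x_t.

    This closed form is only valid because the toy data are Gaussian.
    """
    ab = alpha_bars[t]
    var_t = ab * SIGMA ** 2 + (1.0 - ab)
    return math.sqrt(1.0 - ab) * (x_t - math.sqrt(ab) * MU) / var_t


def sample(rng):
    """Reverse process: start from pure noise and denoise one step at a time."""
    x = rng.gauss(0.0, 1.0)
    for t in range(T - 1, -1, -1):
        eps_hat = predict_noise(x, t)
        x = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps_hat) / math.sqrt(alphas[t])
        if t > 0:  # no fresh noise is added on the final step
            x += math.sqrt(betas[t]) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
fully_noised = forward_noise(MU, T - 1, rng)  # clean data turned into noise
samples = [sample(rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"generated mean={mean:.2f}, std={std:.2f} (target {MU}, {SIGMA})")
```

The generated numbers should cluster around the target mean and spread even though each one started as random noise. In a system like the one described here, the noise predictor would be a large neural network trained on known proteins, and the user's request would steer each denoising step.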

In a paper now available as a preprint, the Generate team showed how Chroma could be used to design proteins in the shapes of all 26 letters of the Latin alphabet and the numerals 0 through 9.

Proteins designed to match the shapes of letters and numbers. Credit: Generate Biomedicines

They also demonstrated how the system can be used to generate giant proteins with thousands of amino acids and “complexes” containing multiple proteins of different shapes.

A protein containing 2,000 amino acids (left) and a complex containing multiple proteins (right). Credit: Generate Biomedicines

The big picture: Just as DALL-E 2 wasn’t the first text-to-image AI, Generate’s Chroma isn’t the first AI designed to generate new proteins, but it is trained on more data than past efforts and gives researchers more control over the type of protein produced.

“It may be fair to say that this is more like DALL-E because of how they’ve scaled things up,” Namrata Anand, who shared a paper in May 2022 detailing a protein-generating AI she’d co-developed, told MIT Technology Review.

Designing new proteins is just the first step to revolutionizing healthcare, though.

The Generate team is now focusing on recreating some of their AI’s designs in the lab. After that will come the lengthy process of developing therapies using the novel proteins and then testing them in animals and humans.

“We’re a drug company,” Grigoryan told MIT Technology Review. “At the end of the day what matters is whether we can make medicines that work or not.”
