“This DNA is not real”: Why scientists are deepfaking the human genome

Researchers taught an AI to make artificial genomes, possibly opening new doors for genetic research.


Researchers have taught an AI to make artificial genomes — possibly overcoming the problem of how to protect people’s genetic information while also amassing enough DNA for research.

Generative adversarial networks (GANs) pit two neural networks against each other: one generates synthetic data, and the other tries to tell it apart from real data, until the synthetic data is good enough to pass for real. Examples have been popping up all over the web, including generated pictures and videos (à la “this city does not exist”). AIs can even generate convincing news articles, food blogs, or human faces.
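To make the "two networks pitted against each other" idea concrete, here is a minimal sketch of the adversarial training loop in plain Python. It is a toy, not the researchers' method: the "data" are single numbers drawn from a bell curve rather than genomes, each "network" has only two parameters, and every name in it (`train_toy_gan`, `real_mean`, and so on) is a hypothetical chosen for illustration. Toy setups like this one often oscillate instead of settling, so it should be read as a picture of the back-and-forth, not as a recipe that is guaranteed to converge.

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function; output lies in [0, 1]."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, real_std=0.5, seed=0):
    """Alternate discriminator and generator updates on 1-D toy data.

    Generator:      g(z) = a * z + b        (turns noise z into a fake sample)
    Discriminator:  D(x) = sigmoid(w*x + c) (estimated probability x is real)
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.1, 0.0  # discriminator parameters

    for _ in range(steps):
        x_real = rng.gauss(real_mean, real_std)  # one "real" sample
        z = rng.gauss(0.0, 1.0)                  # noise fed to the generator
        x_fake = a * z + b                       # one "fake" sample

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(x_fake), i.e. the
        # generator is rewarded when the discriminator is fooled.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w  # d/dx_fake of log D(x_fake)
        a += lr * grad_x * z
        b += lr * grad_x

    return {"generator": (a, b), "discriminator": (w, c)}


if __name__ == "__main__":
    result = train_toy_gan()
    print("generator (a, b):", result["generator"])
    print("discriminator (w, c):", result["discriminator"])
```

The two update steps pull in opposite directions: the discriminator is nudged to score real samples high and fakes low, while the generator is nudged to produce samples the discriminator scores high. A real system would use deep networks and a GAN library in place of these hand-written gradients.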

Now, researchers from Estonia are applying deepfake techniques to human DNA. They created an algorithm that repeatedly generates the genetic code of people who don’t exist.

Deepfaking Human DNA

It may seem simple — randomly mix A, T, C, and G, the letters that make up the genetic code — and voila, a human genetic sequence. But not any random pattern of the letters will work. The AI needs to understand humans at the molecular level. This AI has figured it out.

Like GAN-generated images, the artificial genomes are a convincing copy of the real thing: a human, the researchers believe, who really could exist but doesn’t.

More importantly, they could play a significant role in genetic research.

“A known limitation in the field (of genetic studies) is the reduced access to many genetic databases due to concerns about violations of individual privacy,” the team writes in their study, published in PLOS Genetics.

The team reports that these “artificial genomes” mimic real genomes so closely that the two are indistinguishable. But since they aren’t real, researchers can mine the data without worrying about privacy concerns. They can experiment with genomes without actual people giving up their private information.

Protecting the privacy of the people behind genetic information is challenging and often limits how researchers can use that DNA and their willingness to share datasets. But with artificial genomes, researchers don’t have to worry about many of these ethical privacy concerns.

Faking Something You Don’t Fully Understand

The process of using GANs to generate synthetic genomes isn’t akin to making a deepfake of a person’s face. A face is something we are all familiar with, and there are countless examples with which to train the AI.

But there is so much about DNA and the genome that remains a mystery.

“My initial take is that it is interesting, but I’m not sure I see real practical implications for research right now,” Deanna Church, vice president of the Mammalian Business Area and Software Strategy at the biotech company Inscripta, told Futurism.

“Just because you can’t computationally distinguish these generated genomes from real genomes doesn’t mean they’ve really preserved functional motifs and domains that are important — there is much of this we still don’t understand.”

Even if the artificial genomes resolve the privacy hurdle in genetic research, they raise some possible new concerns.

“In the near term, it’s going to get easier for bad actors to create fake personas that can stand up to even the most rigorous inspection. Not that we envision a scenario where a scam artist needs to provide a fake transcript of their genome, but the unknown unknowns are where security holes tend to grow the fastest,” writes Tristan Greene in The Next Web.

