Physicists are starting to harness the power of deepfakes

Generative Adversarial Networks, the AI behind deepfakes, are finding a home in physics.

January 2, 2021

Most (in)famous for their ability to create deepfakes, generative adversarial networks (GANs) are machine learning tools for creating incredibly realistic simulated data.

While that ability can be used to make a political figure or celebrity appear to be saying whatever you want, it can work its magic with all kinds of data — not just video.

Now, physicists are turning to GANs to break through barriers that are holding back the science.

We’re heading from deepfakes to deep physics.

The Forger vs. The Inspector

Generative adversarial networks are almost self-explanatory: they generate data by pitting networks against each other.

Sharon Zhou, an expert in deepfakes and an instructor at Stanford and Coursera, tells me to picture an art forger and an art inspector.

The forger is, of course, trying to make flawless copies of priceless artwork. The inspector sees both the real art and the forger’s imitation, and has to guess which is which, based on some basic rules about what a genuine piece looks like.

Based on the inspector’s judgement, the forger goes back to hone its work.

“The art forger realizes ‘oh, you think this one looks realistic?’” Zhou says. “‘I’m gonna keep drawing like this until it looks like (the) Mona Lisa.’”

By pitting the two networks against each other, the GAN sharpens each simulated output until it is so realistic that the inspector, which is also constantly honing its knowledge of what output is real and fake, can’t tell them apart.
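For the curious, here is a minimal sketch of that forger-versus-inspector loop in plain Python. It is a toy, not anything the researchers quoted here run: the forger is a single straight-line function, the inspector is a single logistic unit, and the “real art” is just numbers drawn around a target value. Every name in it (train_toy_gan, sigmoid, the parameters a, b, w and c, and the target mean of 4.0) is an assumption made for illustration.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp never overflows.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps=200, lr=0.01, real_mean=4.0, real_std=0.5, seed=0):
    """Alternate inspector and forger updates on one-dimensional data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # forger: fake = a * z + b
    w, c = 0.0, 0.0  # inspector: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = rng.gauss(real_mean, real_std)  # a genuine sample
        z = rng.gauss(0.0, 1.0)                # random noise fed to the forger
        fake = a * z + b                       # the forgery

        # Inspector step: lower the loss -log D(real) - log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = -(1.0 - d_real) * real + d_fake * fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Forger step: lower the loss -log D(fake), i.e. try to fool the inspector.
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w  # derivative of the loss w.r.t. the forgery
        a -= lr * grad_fake * z
        b -= lr * grad_fake

    return (a, b), (w, c)


(forger_a, forger_b), (inspector_w, inspector_c) = train_toy_gan()
print("forger offset after training:", forger_b)
```

In this toy the forger’s offset b starts at zero and gets nudged toward the real data every time the inspector catches it out. Real GANs swap both straight-line functions for deep neural networks, and even then training can wobble rather than settle, so treat this as an illustration of the back-and-forth and nothing more.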

The end result is a simulated product, whether it’s Tom Cruise or the explosive result of two atoms smashing together, that is as realistic as possible (or at least as realistic as the real data available to train the GAN allows).

The better the rules you can give the inspector to enforce — a cat’s gotta have two ears and one tail; Tom Cruise has green eyes; an electron doesn’t fly that specific way — the tighter and more realistic the end result will be.

It’s here that physics has a nicely built-in advantage: well-established rules of the natural world, which can help make sure the forger is not making something impossible.

The Physical Limitations

Behind most particle physics experiments is a gleefully simple idea: whip particles around your collider until they, well, collide, and detect the flying debris of the resulting trainwreck. Analyzing those results will give you lots of data about things we already know — and perhaps a few we don’t.

It’s those mysterious unknowns that researchers are interested in. The subatomic events we’re talking about take place at well beyond Sonic the Hedgehog speeds, and because the data is complex and tough to interpret, researchers turn to simulations to make sense of what they saw.

“Our data analysis relies on simulation software of the full experiment that we have,” says Maurizio Pierini, a physicist at CERN.

The simulator is pretty accurate, Pierini says, but “quite slow.”

This creates a bottleneck in analyzing the massive amounts of data they’re collecting, and a costly one: the time, computer processing power, and data storage it demands all chew into resources.

The Large Hadron Collider (LHC) is scheduled for an upgrade, too; with this more powerful tool, the (ahem) physical limitations of the research will just get more pressing.

“Basically, the computational needs to simulate the collision will be excessive,” Pierini says. “We won’t be able to sustain it.”

In other words, even though we’ll be capable of doing incredible experiments, we won’t be able to understand what the measurements mean.

The technology behind deepfakes may be a solution.

CERN physicist Sofia Vallecorsa is part of a group using GANs to simulate some of the outputs of these experiments. Compared to the traditional simulations, the GAN can give you results much, much faster.

“You can do this 50,000x faster” with GANs, Vallecorsa says. With that speed come savings (less computational horsepower is needed, and there’s no need to store simulated data if you can quickly remake it) and the ability to simulate all that data, so researchers can get to doing what they do.

GANs could also potentially be used to simulate experiments that would be too costly, difficult, or impractical to do in reality.

Say you wanted to study what happens when you punch atomic holes into graphene, an atom-thick carbon material. You’ve obviously only got so many holes you can punch before you trash your physical sample, which means you can’t get much data.

But there are also challenges to studying a piece of graphene big enough to have practical implications (it turns out that quantum mechanics makes it tricky).

Generative adversarial networks, like the one built by grad students Kyle Mills and Corneel Casert and physicist Isaac Tamblyn, may be your answer. Called RUGAN, this network can upscale its simulations (that’s the “u” in RUGAN), letting you see what would happen at larger scales without having to do it.

By training RUGAN on the smaller scale experiments, the researchers were able to simulate a large-scale sheet of graphene that acted realistically; essentially, they deepfaked the material.

“It’s almost like super-resolution,” Mills says, blowing up and filling in the data.
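One way to picture how something learned on a small sample can be stretched over a big one: if the rule only ever looks at a cell’s immediate neighbors, it doesn’t care how long the row is. The sketch below assumes exactly that and nothing more. It is not RUGAN’s actual architecture, and the function apply_local_rule, the three-number kernel, and the rows of cells are all hypothetical stand-ins.

```python
def apply_local_rule(cells, kernel):
    """Slide a fixed kernel along a row of cells that wraps around at the ends."""
    n, k = len(cells), len(kernel)
    half = k // 2
    return [
        sum(kernel[j] * cells[(i + j - half) % n] for j in range(k))
        for i in range(n)
    ]


# A made-up neighbor-averaging rule, standing in for something a network learned.
kernel = [0.25, 0.5, 0.25]

small_sample = [0.0, 1.0, 0.0, 0.0]  # the small piece we could afford to study
large_sample = small_sample * 4      # a row four times as long

small_result = apply_local_rule(small_sample, kernel)
large_result = apply_local_rule(large_sample, kernel)

print(small_result)
print(len(large_result))
```

The same three numbers handle the four-cell row and the sixteen-cell row without any changes, which is the sense in which a local rule “scales up” for free. Whether the larger result is physically trustworthy is a separate question, and one the researchers have to check.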

Democratizing Physics

Like all deep neural networks, generative adversarial networks are, at their heart, a black box. We know what goes in, we know what comes out — and we can check the answer to see if it works — but we do not know how, exactly, the AI arrived at its answer.

This black box was cause for concern — and pushback — from the broader physics community when it came to using GANs.

“The first time I talked about these things, I’d been told … ‘why do you want to do this? You’re a physicist; don’t you know that these things are going to throw physicists away?’” CERN’s Vallecorsa says.

But as more GAN prototypes are developed — and deliver faster, accurate results — the field’s adversarial stance is softening, researchers who spoke to Freethink said.

It helps that physics has an enviable amount of hard and fast rules that a GAN’s output can be tested against; when it simulates something physically impossible, researchers can tell, feed that information back to the inspector, and make sure the forger gets it right next time.
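What might testing against a hard-and-fast rule look like? Here is a small sketch, assuming a simulated collision is handed over as a list of particles, each written as energy plus three momentum components in matching units. It checks two textbook rules: no particle can carry more momentum than its energy allows, and the total energy and momentum coming out must match what went in. The function is_physical, the tuple layout, and the tolerance are assumptions for illustration, not part of any CERN software.

```python
import math


def is_physical(particles_in, particles_out, tol=1e-6):
    """Return True if the outgoing particles obey two basic conservation rules.

    Each particle is a tuple (energy, px, py, pz) in consistent units.
    """
    # Rule 1: energy squared must be at least momentum squared (mass squared >= 0).
    for energy, px, py, pz in particles_out:
        if energy * energy - (px * px + py * py + pz * pz) < -tol:
            return False

    # Rule 2: total energy and total momentum are the same before and after.
    total_in = [sum(p[i] for p in particles_in) for i in range(4)]
    total_out = [sum(p[i] for p in particles_out) for i in range(4)]
    return all(
        math.isclose(before, after, abs_tol=tol)
        for before, after in zip(total_in, total_out)
    )


# Two particles colliding head-on (made-up numbers).
beams = [(5.0, 0.0, 0.0, 5.0), (5.0, 0.0, 0.0, -5.0)]

plausible = [(5.0, 3.0, 4.0, 0.0), (5.0, -3.0, -4.0, 0.0)]
extra_energy = [(6.0, 3.0, 4.0, 0.0), (5.0, -3.0, -4.0, 0.0)]

print(is_physical(beams, plausible))     # True
print(is_physical(beams, extra_energy))  # False
```

Real detectors complicate this picture, since some particles escape without being measured, so an actual check would be looser and more statistical. The point is only that a failed check is a clear, automatic signal that the forger produced something impossible.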

If those simulated data sets can be proven realistic enough to be trusted, GANs may have a democratizing effect on physics. It’s cheaper to run a GAN on some off-the-shelf graphics cards than to buy time on a supercomputer for your simulations, and it’s easier to open up the GAN and insert data from the LHC than to fire it up and smash atoms yourself.

“It might open up opportunities for students, too,” Mills says.

The physicists Freethink spoke to believe that the field’s widespread embrace of GANs is still five or so years away. But it’s an adaptation that may have to come, whether scientists are comfortable with the black box or not, because the horsepower to run all those old-fashioned simulations just won’t be there.

“In a few years, we’ll be forced to accept this approach,” Vallecorsa laughs. “A very practical problem.”