Every time a realistic-looking deepfake video makes the rounds—and lately, it feels like there is one every few days—there are warnings that the technology has advanced to the extent that these videos generated by artificial intelligence will be used in disinformation and other attacks.
Typically, deepfake videos are generated by putting a person’s face onto the body of someone else, with the facial movements manipulated by artificial intelligence to fit the original video. The technology isn’t yet sophisticated enough to make generated videos indistinguishable from real ones, but it is improving rapidly, creating more opportunities for malicious actors to co-opt these applications for their own purposes, said Dr. Mark S. Sherman, technical director, cybersecurity foundations, CERT division, at the Carnegie Mellon University Software Engineering Institute.
“There is not a lot of harm yet [with deepfakes], but you can envision how this tech might be used in the future for other kinds of attacks, as the technology matures,” Sherman said at a recent Ai4 Cybersecurity 2021 Summit.
The risk isn’t hypothetical. Back in 2019, an executive at a United Kingdom-based energy company received a phone call from his boss in Germany instructing him to wire €200,000 (US$220,000) to a Hungarian supplier within the hour. The call was actually deepfake audio, insurance company Euler Hermes Group SA told the Wall Street Journal. The fake imitated the boss’s voice, tonality, punctuation, and even his German accent.
While the company lost money, the damage wasn’t catastrophic. What worries Sherman is what could come next. Currently, generating deepfake videos requires a good deal of technical expertise, time, processing power, and data, so it is still out of reach of the average user. Typically, transferring a person’s face onto a video of another person involves collecting thousands of pictures of both people, encoding the images with a deep learning neural network, and calculating features. The process could easily wind up involving 175 million parameters and millions of updates, Sherman said.
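To give a sense of that scale, a back-of-the-envelope count shows how quickly parameters accumulate even in a small fully connected autoencoder. The layer widths below are illustrative guesses, not the architecture Sherman describes:

```python
# Rough parameter count for a fully connected face-swap autoencoder.
# Layer widths are illustrative only, not an actual deepfake architecture.
def dense_params(sizes):
    # Each adjacent pair of layers contributes weights (m * n) plus biases (n).
    return sum(m * n + n for m, n in zip(sizes, sizes[1:]))

# A 128x128 RGB image flattened, a bottleneck, and a mirrored decoder.
layer_sizes = [128 * 128 * 3, 2048, 512, 2048, 128 * 128 * 3]
total = dense_params(layer_sizes)   # on the order of 10^8 parameters
```

Even this toy network lands in the hundreds of millions of parameters, which is why training from scratch demands so much data and compute.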
To speed up this process, AI researchers have been exploring shortcuts such as transfer learning: taking a neural network trained on one type of data and applying it to a different dataset. The network then needs to learn only a subset of the features, making training less time-consuming and resource-intensive. Transfer learning is already a common technique in image recognition, so it is logical that it could be used to create deepfakes, Sherman told IEEE Spectrum.
“You use things you already knew to figure out new features,” Sherman said.
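That idea can be sketched in a few lines. The minimal, hypothetical example below freezes "pretrained" early-layer weights (random stand-ins here; in practice they would come from a network trained on a large, related dataset) and trains only a new final layer on the new task:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" early-layer weights: random stand-ins for weights
# learned on a previous, related task.
W_frozen = rng.standard_normal((8, 4))
W_init = W_frozen.copy()                    # kept to show nothing changes

def features(x):
    # The reused layers only do a forward pass; they are never updated.
    return np.tanh(x @ W_frozen)

# Tiny synthetic dataset standing in for the new task.
X = rng.standard_normal((64, 8))
y = (X[:, 0] > 0).astype(float)

# Transfer learning: train only the new final layer.
w = np.zeros(4)
lr = 0.5
for _ in range(200):
    f = features(X)
    pred = 1 / (1 + np.exp(-(f @ w)))       # logistic output
    w -= lr * f.T @ (pred - y) / len(y)     # gradient step, final layer only

acc = np.mean((features(X) @ w > 0) == (y == 1))
```

Because only `w` is updated, training touches a handful of parameters instead of the whole network, which is the time and resource saving Sherman describes.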
If the datasets are similar enough—the person and the actor impersonating that person, for example—then the same weights can be used for extracting the features. The neural network would first be trained on the actor to capture key features of the face, such as the eyes and ears. The final layers of the network would then be retrained on the actual person to “learn how to reconstruct the unique things on the face,” Sherman said.
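At the architecture level, this amounts to one shared encoder and two identity-specific decoders. The sketch below uses toy dimensions and untrained random weights purely to show the wiring; a real system would train both autoencoder paths on thousands of images:

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 64 * 64, 256        # toy flattened-image size and latent size

# One shared encoder extracts features common to both faces
# (pose, expression, lighting)...
W_enc = rng.standard_normal((D, H)) * 0.01

# ...while each identity gets its own decoder: the "final layers"
# that learn to reconstruct the unique parts of that face.
W_dec_actor = rng.standard_normal((H, D)) * 0.01
W_dec_target = rng.standard_normal((H, D)) * 0.01

def encode(img):
    return np.tanh(img @ W_enc)

def decode(z, W_dec):
    return z @ W_dec

# Training would pair encode() with W_dec_actor on actor images and
# with W_dec_target on target images. The swap itself: push an actor
# frame through the shared encoder, then the *target's* decoder.
actor_frame = rng.standard_normal(D)
swapped = decode(encode(actor_frame), W_dec_target)
```

The swap works because the shared latent features carry the actor’s pose and expression, while the target’s decoder supplies the target’s identity.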
The reliance on pre-trained teacher networks and machine learning frameworks means enterprises need to make sure the machine learning supply chain is secure. If someone introduced bad training data, that could lead to a steady supply of deepfakes, Sherman said. Bad training data, which could mean anything from poor-quality images to images of something else entirely, would affect how the teacher network is trained and compromise its output. Just as enterprises have to pay attention to the supply chain for software and hardware, they will need to pay attention to the provenance of the neural networks and frameworks they use.
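One basic provenance check is to verify a cryptographic hash of a pretrained model or dataset against a value published by a trusted source before loading it. The `verify_artifact` helper below is a hypothetical illustration, not part of any particular framework:

```python
import hashlib
import os
import tempfile

def verify_artifact(path, expected_sha256):
    """Return True only if the file's SHA-256 digest matches the value
    published by a trusted source; refuse to load the file otherwise."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Demo with a stand-in "weights file" on disk.
weights = b"pretend these bytes are pretrained teacher-network weights"
published_hash = hashlib.sha256(weights).hexdigest()

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(weights)
    path = f.name

ok = verify_artifact(path, published_hash)   # matches the published hash
tampered = verify_artifact(path, "0" * 64)   # simulated mismatch
os.unlink(path)
```

A hash check catches tampered files in transit, though it still depends on the publisher of the hash being trustworthy, which is exactly the provenance question Sherman raises.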
While the technology is advancing rapidly, it is still not at the point where people can no longer tell real videos from fake ones. That leaves enterprises time to plan how to protect against deepfakes.
This is the opportunity to protect the machine learning supply chain and “get ahead of the whole attack,” Sherman said.