Researchers have demonstrated a new algorithm for detecting so-called deepfake images—those altered imperceptibly by AI systems, potentially for nefarious purposes. Initial tests of the algorithm picked out phony from undoctored images down to the individual pixel level with between 71 and 95 percent accuracy, depending on the sample data set used. The algorithm has not yet been expanded to include the detection of deepfake videos.
Deepfakes “are images or videos that have been doctored—either you insert something into it or remove something out of it—so it changes the meaning of the picture,” says Amit Roy-Chowdhury, professor of electrical and computer engineering at the University of California, Riverside. The challenge arises because it’s done “in a way so that to the human eye it’s not obvious immediately that it has been manipulated.”
In rapidly developing situations, such as a humanitarian crisis, a business’s product launch, or an election campaign, deepfake videos and images could alter how events play out. Imagine a doctored image in which a political candidate is supposedly committing a violent crime, or a doctored video in which a CEO supposedly confesses to concealing safety problems with her company’s signature product line.
Roy-Chowdhury is one of five authors of the deepfake-detecting algorithm, described in a recent issue of IEEE Transactions on Image Processing. He says such detection algorithms could be a powerful tool to fight this new menace of the social media age. But people also need to be careful not to become overdependent on these algorithms, he warns. An overly trusted detection algorithm that can be tricked could be weaponized by those seeking to spread false information. A deepfake crafted to exploit a trusted algorithm’s particular weaknesses could effectively result in the algorithm blessing the fake with a certificate of authenticity in the minds of experts, journalists and the public, rendering it even more damaging.
“I think we have to be careful in anything that has to do with AI and machine learning today,” Roy-Chowdhury says. “We need to understand that the results these systems give are probabilistic. And very often the probabilities are not in the range of 0.98 or 0.99. They’re much lower than that. We should not accept them on blind faith. These are hard problems.”
In that sense, he says, deepfakes are really just a new frontier in cybersecurity. And cybersecurity is a perpetual arms race with bad guys and good guys each making advances in often incremental steps.
Roy-Chowdhury says that with their latest work his group has harnessed a set of concepts that already exist separately in the literature, but which they have combined in a novel and potentially powerful way.
One component of the algorithm is a variety of so-called “recurrent neural network,” which splits the image in question into small patches and looks at those patches pixel by pixel. The neural network has been trained by letting it examine thousands of both deepfake and genuine images, so it has learned some of the qualities that make fakes stand out at the single-pixel level.
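To make the patch idea concrete, here is a minimal sketch of just the splitting step, assuming a grayscale image stored as a plain list of rows. The function name `split_into_patches` and the non-overlapping square layout are illustrative assumptions, not details from the researchers' code, and the trained neural network itself is not shown.

```python
def split_into_patches(image, patch_size):
    """Split a 2D grid (list of rows) into non-overlapping square patches.

    Hypothetical helper: returns a dict mapping each patch's (top, left)
    corner to the patch itself. Edge pixels that do not fill a whole
    patch are dropped in this simplified version.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = {}
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches[(top, left)] = [
                row[left:left + patch_size]
                for row in image[top:top + patch_size]
            ]
    return patches


# A toy 4x4 "image" whose pixel values are just 0..15.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
patches = split_into_patches(image, 2)
```

In a real detector, each patch would then be handed to the trained network, which would score the pixels inside it.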
Roy-Chowdhury says the boundaries around the doctored portion of an image are often what contain telltale signs of manipulation. “If an object is inserted, it is often the boundary regions that have certain characteristics,” he says. “So the person who’s tampering the image will probably try to do it so that the boundary is very smooth. What we found out is the tampered images were often smoother than in natural images. Because the person who did the manipulation was going out of his way to make it very smooth.”
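The smoothness observation can be illustrated with a simple hand-written measure. The sketch below assumes that "smooth" means small differences between neighboring pixels; the `roughness` function and the two toy patches are invented for illustration, and the actual algorithm learns such cues from training data rather than computing a fixed formula.

```python
def roughness(patch):
    """Mean absolute difference between horizontally and vertically
    adjacent pixels in a patch. Lower values mean a smoother patch.
    (Illustrative stand-in, not the researchers' learned features.)
    """
    diffs = []
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            if c + 1 < len(row):          # neighbor to the right
                diffs.append(abs(value - row[c + 1]))
            if r + 1 < len(patch):        # neighbor below
                diffs.append(abs(value - patch[r + 1][c]))
    return sum(diffs) / len(diffs) if diffs else 0.0


# Made-up pixel values: a textured "natural" patch and a carefully
# blended one of the kind a forger might produce along a boundary.
natural = [[10, 40, 15], [35, 5, 45], [20, 50, 10]]
blended = [[20, 21, 22], [21, 22, 23], [22, 23, 24]]

is_suspiciously_smooth = roughness(blended) < roughness(natural)
```

Here the blended patch scores far lower than the natural one, which is the kind of contrast the quote describes.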
Another portion of the algorithm, on a parallel track to the part looking at single pixels, passes the whole image through a series of encoding filters—almost as if it were performing an image compression, as when you click the “compress image” box when saving a TIFF or a JPEG. These filters, in a mathematical sense, enable the algorithm to consider the entire image at larger, more holistic levels.
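One rough way to picture what such encoding filters do is repeated downsampling, in which each stage summarizes a block of pixels with a single number. The sketch below uses plain average pooling as a stand-in; that choice, and the name `average_pool`, are assumptions made for illustration, since the filters in the actual algorithm are learned rather than fixed averages.

```python
def average_pool(image, factor):
    """Shrink a 2D grid by replacing each factor x factor block with
    its mean. A fixed-filter stand-in for a learned encoding stage.
    """
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height - factor + 1, factor):
        pooled_row = []
        for left in range(0, width - factor + 1, factor):
            block = [
                image[r][c]
                for r in range(top, top + factor)
                for c in range(left, left + factor)
            ]
            pooled_row.append(sum(block) / len(block))
        pooled.append(pooled_row)
    return pooled


image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
coarse = average_pool(image, 2)      # 4x4 -> 2x2
coarser = average_pool(coarse, 2)    # 2x2 -> 1x1
```

Each stage discards fine detail and keeps a broader summary, which is what lets later stages respond to patterns spanning large parts of the image.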
The algorithm then compares the output of the pixel-by-pixel and higher-level encoding filter analyses. When these parallel analyses trigger red flags over the same region of an image, the image is then tagged as a possible deepfake.
For example, say that a stock image of a songbird has been pasted onto a picture of an empty tree branch. The pixel-by-pixel algorithm in this case might flag the pixels around the bird’s claws as problematic, while the encoder algorithm might spot patterns in the larger image (noticing, perhaps, other boundary problems or anomalies at the larger-scale level). So long as both of these neural nets flagged the same region of the image around the bird, then Roy-Chowdhury’s group’s algorithm would categorize the bird-and-branch photo as a possible deepfake.
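The agreement rule in the songbird example can be sketched as a comparison of two flag maps. The code below assumes each analysis has already produced a grid of true/false flags per region; the function names and the toy grids are hypothetical, and the real algorithm's way of merging the two analyses may be more involved than a simple logical AND.

```python
def agreeing_regions(pixel_flags, encoder_flags):
    """Keep only the regions flagged by BOTH analyses (logical AND)."""
    return [
        [a and b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(pixel_flags, encoder_flags)
    ]


def is_possible_deepfake(pixel_flags, encoder_flags):
    """Tag the image if at least one region is flagged by both analyses."""
    overlap = agreeing_regions(pixel_flags, encoder_flags)
    return any(any(row) for row in overlap)


# Toy 3x3 flag maps. The center region stands for the pasted bird.
pixel_flags = [
    [False, False, False],
    [False, True, True],
    [False, False, False],
]
encoder_flags = [
    [True, False, False],
    [False, True, False],
    [False, False, False],
]

overlap = agreeing_regions(pixel_flags, encoder_flags)
tagged = is_possible_deepfake(pixel_flags, encoder_flags)
```

Only the center region survives the comparison, so the image is tagged. Requiring agreement means a red flag from one analysis alone is not enough.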
Roy-Chowdhury says that the algorithm now needs to be expanded to handle video. Such a next-level algorithm, he says, would potentially include how the image evolves frame-by-frame and whether any detectable patterns can be discerned from that evolution in time.
Given the urgency of deepfake detection, as hostile actors around the world increasingly seek to manipulate political events using false information, Roy-Chowdhury encourages researchers to contact his group for code or pointers toward further developing this algorithm for deepfake detection in the wild.