Facebook’s DensePose Tech Raises Concerns About Potential Misuse

Facebook’s DensePose technology lets anyone turn 2D images of people into 3D models

3 min read
Still image from Facebook's DensePose video.
Image: Facebook

In early 2018, Facebook’s AI researchers unveiled a deep-learning system that can transform 2D photos and video frames of people into 3D mesh models of their bodies in motion. Last month, Facebook publicly shared the code for its “DensePose” technology, which could be used by Hollywood filmmakers and augmented-reality game developers—but maybe also by those seeking to build a surveillance state.

DensePose goes beyond basic object recognition. Besides detecting humans in pictures, it can also build 3D models of their bodies by estimating the positions of their torsos and limbs. Those models in turn let the technology produce real-time 3D re-creations of human movement from 2D videos—for example, videos showing models of several people kicking soccer balls, or a single individual riding a motorcycle.
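In the published DensePose format, each pixel belonging to a detected person is mapped to a body-part index plus (u, v) coordinates on a 3D surface mesh. The sketch below shows how such per-pixel output could be consumed; the array names and shapes are illustrative assumptions, not Facebook's actual API.

```python
import numpy as np

# Hypothetical per-pixel output for one detected person, in the spirit of
# DensePose's IUV encoding: channel 0 holds a body-part index (0 = background),
# channels 1-2 hold (u, v) coordinates on that part's 3D surface patch.
h, w = 4, 4
iuv = np.zeros((3, h, w))
iuv[0, 1:3, 1:3] = 2          # a small region labeled as body-part index 2
iuv[1:, 1:3, 1:3] = 0.5       # dummy (u, v) surface coordinates for that region

def pixels_for_part(iuv, part_index):
    """Return the (u, v) surface coordinates of all pixels assigned to a part."""
    mask = iuv[0] == part_index
    return np.stack([iuv[1][mask], iuv[2][mask]], axis=1)

coords = pixels_for_part(iuv, 2)
print(coords.shape)  # (4, 2): four pixels mapped onto the part's surface
```

Collecting pixels per body part this way is what lets a downstream renderer drape a texture, or a 3D model, over each detected limb and torso independently.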

This work could prove useful for “graphics, augmented reality, or human-computer interaction, and could also be a stepping-stone towards general 3D-based object understanding,” according to the Facebook AI Research (FAIR) paper published in January 2018.

But there is a “troubling implication of this research” that could enable “real-time surveillance,” said Jack Clark, strategy and communications director at OpenAI, a nonprofit AI research company, in his popular newsletter, Import AI. Clark first discussed the implications of Facebook’s DensePose paper in the February issue of his newsletter, and followed up in June after Facebook released the DensePose code on the software development platform GitHub.

“The same system has wide utility within surveillance architectures, potentially letting operators analyze large groups of people to work out if their movements are problematic or not—for instance, such a system could be used to signal to another system if a certain combination of movements are automatically labelled as portending a protest or a riot,” Clark wrote in his newsletter.

As always, the deep-learning algorithms behind DensePose needed some help from humans in the beginning. Facebook researchers first enlisted human annotators to create a training data set by manually labeling certain points on 50,000 images of human bodies. To make the job easier for the annotators and improve their accuracy, the researchers broke the labeling task down into body segments such as head, torso, limbs, hands, and feet. They also “unfolded” each body part to present multiple viewpoints, so annotators did not have to manually rotate the image to get a better view.

Still, the annotators were only asked to label 100 to 150 points per image. To complete the training data, Facebook researchers used an algorithm to estimate and fill in the rest of the points that corresponded between the 2D images and the 3D mesh models.
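The fill-in step can be pictured as interpolation: every unlabeled pixel is assigned surface coordinates inferred from the sparse human-labeled points nearby. The toy sketch below uses simple nearest-neighbor lookup; this is my illustrative stand-in, not the actual algorithm described in the paper.

```python
import numpy as np

# Sparse human annotations: image (x, y) position -> surface (u, v) coordinate
labeled_xy = np.array([[10.0, 10.0], [50.0, 10.0], [30.0, 60.0]])
labeled_uv = np.array([[0.1, 0.2], [0.9, 0.2], [0.5, 0.8]])

def fill_in(query_xy):
    """Assign each unlabeled pixel the (u, v) of its nearest labeled point."""
    # Pairwise distances: (num_queries, num_labels)
    d = np.linalg.norm(labeled_xy[None, :, :] - query_xy[:, None, :], axis=2)
    return labeled_uv[d.argmin(axis=1)]

dense = fill_in(np.array([[12.0, 11.0], [48.0, 12.0]]))
print(dense)  # each query pixel inherits its nearest annotation's (u, v)
```

The real system has to do better than this—nearest-neighbor guesses smear across body-part boundaries—which is why the researchers used a learned model to estimate the remaining correspondences.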

The result is a system that can perform the 2D to 3D conversion at a rate of “20-26 frames per second for a 240 × 320 image or 4-5 frames per second for a 800 × 1100 image,” Facebook researchers wrote in their paper. In other words, it’s generally capable of creating 3D models of humans in a 2D video in real time.
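To see why those rates count as real time, compare the per-frame processing time against the interval between frames of typical video. A quick back-of-the-envelope check (the helper name is mine):

```python
def frame_budget_ms(fps):
    """Milliseconds available (or consumed) per frame at a given frame rate."""
    return 1000.0 / fps

# Paper's reported throughput on 240 x 320 input: 20-26 frames per second.
processing_ms = frame_budget_ms(20)   # worst case: 50 ms to process one frame
video_ms = frame_budget_ms(25)        # typical video: ~40 ms between frames

print(round(processing_ms), round(video_ms))  # 50 40
```

At the high end of the reported range (26 frames per second), processing keeps pace with typical video; at the low end it falls only slightly behind—hence "generally" real time.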

Facebook’s researchers do not specifically mention surveillance as a possible application of DensePose alongside the many they do list in their paper. But because Facebook has put its technology out there, someone could adapt DensePose for surveillance or law enforcement, if they so desired.

In fact, other research groups have been working on similar pose-estimation systems for security applications: a group of U.K. and Indian researchers has been developing a drone-mounted system aimed at detecting violence within crowds of people. And there are clearly law enforcement agencies and governments around the world interested in harnessing such technology, for good or for ill.

Clark described his hope of seeing the FAIR group—and AI researchers in general—publicly discuss the implications of their work. He wondered if Facebook’s researchers considered the surveillance possibility and whether or not Facebook has an internal process for weighing the risks of publicly releasing such technology. In the case of DensePose, it’s a question that only Facebook can answer. The company did not respond to a request for comment.

“As a community we—including organizations like OpenAI—need to be better about dealing publicly with the information hazards of releasing increasingly capable systems, lest we enable things in the world that we’d rather not be responsible for,” Clark said.


Will AI Steal Submarines’ Stealth?

Better detection will make the oceans transparent—and perhaps doom mutually assured destruction

11 min read
A photo of a submarine in the water under a partly cloudy sky.

The Virginia-class fast attack submarine USS Virginia cruises through the Mediterranean in 2010. Back then, it could effectively disappear just by diving.

U.S. Navy

Submarines are valued primarily for their ability to hide. The assurance that submarines would likely survive the first missile strike in a nuclear war and thus be able to respond by launching missiles in a second strike is key to the strategy of deterrence known as mutually assured destruction. Any new technology that might render the oceans effectively transparent, making it trivial to spot lurking submarines, could thus undermine the peace of the world. For nearly a century, naval engineers have striven to develop ever-faster, ever-quieter submarines. But they have worked just as hard at advancing a wide array of radar, sonar, and other technologies designed to detect, target, and eliminate enemy submarines.

The balance seemed to turn with the emergence of nuclear-powered submarines in the early 1960s. In a 2015 study for the Center for Strategic and Budgetary Assessments, Bryan Clark, a naval specialist now at the Hudson Institute, noted that the ability of these boats to remain submerged for long periods made them “nearly impossible to find with radar and active sonar.” But even these stealthy submarines produce subtle, very-low-frequency noises that can be picked up from far away by networks of hydrophone arrays mounted on the seafloor.
