With AI Watermarking, Creators Strike Back

Backdoor attacks regulate unauthorized uses of copyrighted or restricted data

21 Apr 2023


This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore.

AI models rely on immense data sets to train their complex algorithms, but sometimes the use of those data sets for training purposes can infringe on the rights of the data owners. Yet actually proving that a model used a data set without authorization has been notoriously difficult. Now, in a new study published in IEEE Transactions on Information Forensics and Security, researchers introduce a method for protecting data sets from unauthorized use by embedding digital watermarks into them. The technique could give data owners more say in who is allowed to train AI models using their data.

The simplest way of protecting data sets is to restrict their use, such as with encryption. But doing so would make those data sets difficult to use for authorized users as well. Instead, the researchers focused on detecting whether a given AI model was trained using a particular data set, says the study’s lead author, Yiming Li. Models found to have been trained on a data set without permission can be flagged for follow-up by the data owner.

Watermarking methods could cause harm, too, though. Malicious actors, for instance, could teach a self-driving system to incorrectly recognize stop signs as speed limit signs.

The technique can be applied to many different types of machine-learning problems, Li said, although the study focuses on classification models, including image classification. First, a small sample of images is selected from a data set and a watermark consisting of a set pattern of altered pixels is embedded into each image. Then the classification label of each watermarked image is changed to correspond to a target label. This establishes a relationship between the watermark and the target label, creating what’s called a backdoor attack. Finally, the altered images are recombined with the rest of the data set, which is then published and becomes available to authorized and unauthorized users alike. To verify whether a particular model was trained using the data set, researchers simply run watermarked images through the model and see whether they get back the target label.
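The embedding steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the function names, the bright square in the bottom-right corner, the 10 percent watermarking rate, and the use of plain nested lists as grayscale images are all assumptions made for the example.

```python
import random

def stamp(image, patch_size=2, value=255):
    """Return a copy of a 2-D grayscale image with a bright square
    written into its bottom-right corner (a hypothetical watermark)."""
    out = [row[:] for row in image]
    for row in out[-patch_size:]:
        row[-patch_size:] = [value] * patch_size
    return out

def watermark_dataset(images, labels, target_label, rate=0.1, seed=0):
    """Stamp a random fraction of the images and relabel them.

    Returns new image and label lists plus the indices that were altered.
    The inputs are left unmodified.
    """
    rng = random.Random(seed)
    n_marked = max(1, round(rate * len(images)))
    marked = sorted(rng.sample(range(len(images)), n_marked))
    new_images = [[row[:] for row in img] for img in images]
    new_labels = list(labels)
    for i in marked:
        new_images[i] = stamp(images[i])  # embed the pixel pattern
        new_labels[i] = target_label      # tie the pattern to the target label
    return new_images, new_labels, marked

# Toy data: twenty blank 4x4 images spread across five classes.
images = [[[0] * 4 for _ in range(4)] for _ in range(20)]
labels = [i % 5 for i in range(20)]
wm_images, wm_labels, marked = watermark_dataset(images, labels, target_label=3)
print(f"watermarked {len(marked)} of {len(images)} images")
```

A model trained on `wm_images` and `wm_labels` would have the chance to learn that the corner pattern means class 3, which is the association the data owner later checks for.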

The technique can be used on a broad range of AI models. Because AI models naturally learn to incorporate the relationship between images and labels into their algorithm, data-set owners can introduce the backdoor attack into models without even knowing how they function. The main trick is selecting the right number of data samples from a data set to watermark—too few can lead to a weak backdoor attack, while too many can rouse suspicion and decrease the data set’s accuracy for legitimate users.
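Because the check needs only a model's predictions, it can be written as a black-box test. The sketch below is an assumption-laden illustration rather than the study's procedure: the `predict` callables are toy stand-ins for real models, the corner-square watermark is hypothetical, and the 0.5 margin is an arbitrary threshold chosen for the example. It compares how often a model returns the target label on clean images with how often it does so once the same images are watermarked.

```python
def stamp(image, patch_size=2, value=255):
    """Copy a 2-D grayscale image and write a bright square into its
    bottom-right corner (a hypothetical watermark pattern)."""
    out = [row[:] for row in image]
    for row in out[-patch_size:]:
        row[-patch_size:] = [value] * patch_size
    return out

def hit_rate(predict, images, target_label):
    """Fraction of images that the model assigns to the target label."""
    return sum(predict(img) == target_label for img in images) / len(images)

def looks_trained_on_dataset(predict, clean_images, target_label, margin=0.5):
    """Flag a model if watermarking raises its target-label rate by more
    than `margin`. The margin is an illustrative choice, not a tuned value."""
    baseline = hit_rate(predict, clean_images, target_label)
    stamped = hit_rate(predict, [stamp(img) for img in clean_images], target_label)
    return stamped - baseline > margin

# Toy stand-ins for real models, which would be queried the same way.
def clean_model(image):
    # Ignores the watermark region; decides from the top-left pixel.
    return 0 if image[0][0] < 128 else 1

def backdoored_model(image):
    # Mimics a model that learned the watermark-to-label shortcut.
    if image[-1][-1] == 255 and image[-2][-2] == 255:
        return 7
    return clean_model(image)

images = [[[(10 * i) % 200] * 4 for _ in range(4)] for i in range(20)]
print(looks_trained_on_dataset(clean_model, images, target_label=7))
print(looks_trained_on_dataset(backdoored_model, images, target_label=7))
```

Measuring against a clean baseline keeps the check from flagging a model simply because many of the test images already belong to the target class.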

Watermarking could eventually be used by artists and other creators to opt out of having their work train AI models like image generators. Image generators such as Stable Diffusion and DALL-E 2 are able to create realistic images by ingesting large numbers of existing images and artwork, but some artists have raised concerns about their work being used without explicit permission. While the technique is currently limited by the amount of data required to work properly—an individual artist’s work generally lacks the necessary number of data points—Li says detecting whether an individual artwork helped train a model may be possible in the future. It would require adding a “membership inference” step to determine whether the artwork was part of an unauthorized data set.

The team is also researching whether watermarking can be done in a way that will prevent it from being co-opted for malicious use, Li said. Currently, the ability to watermark a data set can be used by bad actors to cause harm. For example, if an AI model used by self-driving cars were trained to incorrectly interpret stop signs as a signal to instead set the speed limit at 100 miles per hour, that could lead to collisions on the road. The researchers have worked on prevention methods, which they presented as an oral paper at machine-learning conference NeurIPS last year.

Researchers also hope to make the technique more efficient by decreasing the number of watermarked samples needed to establish a successful backdoor attack. Doing so would result in more accurate data sets for legitimate users, as well as an increased ability to avoid detection by AI model builders.

Avoiding detection may be an ongoing battle for those who eventually use watermarking to protect their data sets. There are techniques known as “backdoor defense” that allow model builders to clean a data set prior to use, which reduces watermarking’s ability to establish a strong backdoor attack. Backdoor defenses may be thwarted by a more complex watermarking technique, but that in turn may be beaten by a more sophisticated backdoor defense. As a result, watermarking techniques may need to be updated periodically.

“The backdoor attack and the backdoor defense is like a cat-and-mouse problem,” Li said.
