Amateurs’ AI Tells Real Rembrandts From Fakes

A new AI algorithm may crack previously inaccessible image-recognition and analysis problems—especially those stymied by AI training sets that are too small, or whose individual sample images are too big and full of high-resolution detail that AI algorithms cannot process. Already, the new algorithm can detect forgeries of one famous artist’s work, and its creators are actively searching for other areas where it could potentially improve our ability to transform small data sets into ones large enough to train an AI neural network.

According to two amateur AI researchers, whose study is now under peer review at IEEE Transactions on Neural Networks and Learning Systems, the concept of entropy, borrowed from thermodynamics and information theory, may help AI systems uncover fake works of art.

In physical systems such as boiling pots of water and black holes, entropy concerns the amount of disorder contained within a given volume. In an image file, entropy is defined as the amount of useful, nonredundant information the file contains.

“[Entropy is] a measure of diversity of information in a signal,” says Steven Frank, an IEEE member and part-time AI coder (and full-time patent attorney) based in Framingham, Mass. “The idea is if you have a message that’s all ‘1’s, you have no entropy at all. There’s no diversity at all. It’s just ‘1’s. If you have a completely random sequence, then you have very high entropy and high diversity…. You can’t compress it or describe it in any number of bits smaller than the message.”
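Frank's two examples can be checked numerically. The snippet below is a minimal sketch, not the researchers' code: it assumes the standard Shannon definition of entropy measured in bits per byte, and the helper name `shannon_entropy` is invented for illustration.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(message: bytes) -> float:
    """Return the Shannon entropy of `message` in bits per byte."""
    if not message:
        return 0.0
    total = len(message)
    counts = Counter(message)
    # Sum p * log2(1/p) over every byte value that actually occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


all_ones = b"1" * 4096           # a message that is all '1's
random_bytes = os.urandom(4096)  # a sequence with no exploitable pattern

print(shannon_entropy(all_ones))      # no diversity, so zero entropy
print(shannon_entropy(random_bytes))  # near the 8-bit maximum for bytes

# Compressibility tracks entropy: the uniform message shrinks to a handful
# of bytes, while the random one cannot be made smaller.
print(len(zlib.compress(all_ones)), len(zlib.compress(random_bytes)))
```

The general-purpose compressor here stands in for "describing the message in fewer bits"; any lossless compressor should show the same contrast.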

So an individual image’s entropy rating is a kind of numerical score of how much digital diversity it contains—how monotonic or algorithmic the image is (low entropy score) versus stochastic and unpredictable (high entropy). It’s no qualitative or aesthetic measure, to be sure. Art critics won’t have much use for metrics like this. But perhaps entropy could be relevant for a computer. Frank argues that, in fact, image entropy may open the door to solving a longstanding problem in using AI to process high-resolution images.

This is because high-resolution images in megabyte- or gigabyte-size ranges are just too vast to be processed by an AI neural network—such as the so-called convolutional neural networks (CNNs) common in image-recognition algorithms.

Rembrandt van Rijn, Self-Portrait, 1659. Researchers trained AI neural nets on small portions of Rembrandt’s portraits (including his Self-Portrait, pictured here) to test an art-forgery-recognition algorithm. Image: National Gallery of Art

In 1935, a catalog of the Dutch painter Rembrandt van Rijn’s work listed 611 paintings.

Current estimates, however, put the number of authentic paintings at about half that, because many of the works once attributed to him are imitations, copies, or forgeries.

But even if all 611 Rembrandts and ersatz Rembrandts could be used to train a Rembrandt-recognizing CNN, that would still amount to just 12 percent of the minimum number of training samples that AI experts say are needed for image recognition.

If AI were to help pick out authentic Rembrandts from the fakes, it’d need at least 5,000 Rembrandt images and ideally another 5,000 or more sample images from each of the leading forgers, imitators, and copiers of Rembrandt’s work.

Since no such trove exists, AI hasn't been much help in solving the attribution problem for Rembrandt or other famous artists.

Frank wondered: Could entropy solve both problems at once, handling images that are too high-res for a CNN as well as data sets that are too small to train one?

For any given image, say Rembrandt’s 1654 Portrait of Jan Six, a simple computer program (no AI needed yet) can calculate its entropy rating. Say the Jan Six image file has an entropy score of...oh, why not...6. Now, Frank and his collaborator (and wife) Andrea Frank realized, they could run a simple script on the Jan Six image that picks out all 100- by 100-pixel tiles of that larger image that also have an entropy rating of 6. Then, it could pick out all 200- by 200-pixel tiles that rate a 6. And do the same for 400- by 400-pixel tiles.
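The tile-picking script described above might look something like the sketch below. It is only an illustration under stated assumptions, not the Franks' program: the image is taken to be a grid of 8-bit gray levels, entropy is computed from the histogram of those levels, tiles are sampled on a fixed `stride`, and a tile "matches" when its entropy falls within a `tolerance` of the whole image's score (an exact match would almost never occur). All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(pixels):
    """Shannon entropy, in bits per pixel, of an iterable of gray levels."""
    counts = Counter(pixels)
    total = sum(counts.values())
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def tile_pixels(image, row, col, size):
    """Yield the gray levels in the size-by-size tile whose corner is (row, col)."""
    for r in range(row, row + size):
        yield from image[r][col:col + size]


def matching_tiles(image, size, stride, tolerance=0.25):
    """Return corners of tiles whose entropy is within `tolerance` bits of the image's."""
    target = entropy(p for line in image for p in line)
    height, width = len(image), len(image[0])
    return [
        (row, col)
        for row in range(0, height - size + 1, stride)
        for col in range(0, width - size + 1, stride)
        if abs(entropy(tile_pixels(image, row, col, size)) - target) <= tolerance
    ]


# Synthetic stand-in for a scanned painting (assumption): a flat left half and
# a noisy right half, so only tiles straddling the boundary resemble the whole.
rng = random.Random(0)
demo = [[128] * 200 + [rng.randrange(256) for _ in range(200)] for _ in range(400)]

for size in (100, 200, 400):
    corners = matching_tiles(demo, size, stride=size // 2)
    print(size, len(corners), corners[:3])
```

Overlapping tiles (a stride smaller than the tile size) are one way to stretch a single painting into many training samples; the stride and tolerance used here are arbitrary choices for the demonstration.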

Those 100-pixel, 200-pixel, and 400-pixel tiles then, in a mathematical sense at least, contain the same informational disorder as their parent image. So, the Franks wondered, might training a neural net on the tiles (not the full-size originals) “teach” it how to discern a Rembrandt from a fake?


Their method split images from Rembrandt’s commonly accepted oeuvre of portraits into some 13,000 tiles. Then, they trained their CNN on those tiles instead of the entire original paintings. To test their neural net, they ran a series of known fakes and known canonical works (none of which were in the original training set) to see if the Rembrandt CNN could tell the difference. They report a success rate of 90.4 percent.

One surprising result from the analysis came in figuring out what parts of an image proved to be most important to identifying a painting as either by Rembrandt or not.

“The tiles that are limited to a brushstroke level are actually not that effective for discriminating between a Rembrandt and another portrait in the style of Rembrandt,” says Frank. “On the other hand, at larger resolutions, say at the size of a head, the CNN is quite good at being able to pick a real Rembrandt from his imitators or an artist of a similar style.”

“That suggests to us that Rembrandt’s contemporaries were probably imitating him pretty well at a brushstroke level,” he adds. “If you really want to see what differentiates Rembrandt, you have to look at a bigger piece of the canvas and at a larger compositional level.”

The same AI technique may also prove useful for some kinds of medical image-classification problems, the researchers say. Radiology images, for example, are notoriously large—hundreds of megabytes or more in size. CNNs can’t handle files that big, so radiologists must scale down the image to low-res snapshots (losing information in the process) or hand-pick representative samples to compare and contrast—which creates more busywork for radiologists.

“Our system can enable much higher performance and more accurate classification, because instead of looking at this enormous image and dumbing it down to a low-resolution image, it can look at the high-resolution pieces that are medically relevant,” Frank says.

He adds, however, that unlike a Rembrandt painting, there could be medically significant details in other portions of an image that the entropy method doesn’t concentrate on. So, a radiologist would need to recognize the limitations of the algorithm as well.

As one tool among many in an image analysis program, the entropy method could provide useful information to a radiologist who knows its strengths and weaknesses. Theoretically, it could someday help AI find rare conditions for which not enough training images are available.

In the meantime, the Franks have no plans to patent or commercialize their algorithm. And for now, it provides a clever proof-of-concept for applying entropy to the stubborn problem of picking out genuine works of art from a sea of imitations.
