AI Researchers Propose a Machine Vision Turing Test

Computers are getting better each year at AI-style tasks, especially those involving vision—identifying a face, say, or telling if a picture contains a certain object. In fact, their progress has been so significant that some researchers now believe the standardized tests used to evaluate these programs have become too easy to pass, and therefore need to be made more demanding.

At issue are the “public data sets” commonly used by vision researchers to benchmark their progress, such as LabelMe at MIT or Labeled Faces in the Wild at the University of Massachusetts, Amherst. The former, for example, contains photographs that have been labeled via crowdsourcing, so that a photo of a street scene might have a “car” and a “tree” and a “pedestrian” highlighted and tagged. Success rates have been climbing for computer vision programs that can find these objects, with most of the credit for that improvement going to machine learning techniques such as convolutional networks, a form of what is often called Deep Learning.

But a group of vision researchers says that calling out objects in a photograph, in addition to having become too easy, is simply not very useful; what computers really need to be able to do is “understand” what is “happening” in the picture. And so with support from DARPA, Stuart Geman, a professor of applied mathematics at Brown University, and three others have developed a framework for a standardized test that could evaluate the accuracy of a new generation of more ambitious computer vision programs.

An example of an image and corresponding questions proposed by a group of researchers for a Visual Turing Test. The questions would focus on specific regions of the picture and grow in complexity. Image: PNAS

The research was published this week in the Proceedings of the National Academy of Sciences; Geman’s co-authors are all from Johns Hopkins University, in Baltimore, Md., and include his brother, Donald Geman, along with Neil Hallonquist and Laurent Younes.

Their proposed method calls for human test-designers to develop a list of certain attributes that a picture might have, like whether a street scene has people in it, or whether the people are carrying anything or talking with each other. Photographs would first be hand-scored by humans on these criteria; a computer vision system would then be shown the same picture, without the “answers,” to determine if it was able to pick out what the humans had spotted.

Initially, the questions would be rudimentary, asking if there is a person in a designated region of the picture, for example. But the questions would grow in complexity as programs became more sophisticated; a more complicated question might involve the nature of an interaction between different people in the picture.
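The scoring idea described above can be illustrated with a minimal sketch. It assumes yes/no questions, each tied to a rectangular region and a complexity level, and it leaves out the image itself. The names `Question`, `score_by_level`, and `always_yes` are hypothetical and are not taken from the researchers' framework.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Region = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) box


@dataclass(frozen=True)
class Question:
    text: str           # e.g. "Is there a person in this region?"
    region: Region      # part of the picture the question refers to
    level: int          # 1 = rudimentary, higher = more complex
    human_answer: bool  # answer recorded by the human scorers


def score_by_level(
    questions: List[Question],
    system: Callable[[str, Region], bool],
) -> Dict[int, float]:
    """Return, per level, the fraction of answers that match the humans'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q.level] += 1
        # The system sees only the question and region, never human_answer.
        if system(q.text, q.region) == q.human_answer:
            correct[q.level] += 1
    return {level: correct[level] / total[level] for level in sorted(total)}


# Toy questions for one street scene; the answers are made up for illustration.
questions = [
    Question("Is there a person in this region?", (0, 0, 100, 200), 1, True),
    Question("Is there a car in this region?", (100, 0, 100, 200), 1, False),
    Question("Is the person carrying something?", (0, 0, 100, 200), 2, True),
    Question("Are these two people talking?", (0, 0, 300, 200), 3, False),
]


def always_yes(text: str, region: Region) -> bool:
    """A deliberately naive stand-in for a vision system."""
    return True


scores = score_by_level(questions, always_yes)
print(scores)
```

Reporting accuracy separately for each level is one way to show where a system stops keeping up as the questions move from detecting objects to describing interactions.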

Eventually, test-developers could end up asking machines for the common-sense, real-world knowledge that has always been the goal of AI researchers. For example, a future question might be, “What will happen to the man in front of the building on whom the piano is about to fall?”

One advantage of the proposed approach, says Geman, is that it would allow a hierarchy of information to be developed for a picture, starting simple and growing more complex. It would also provide the basis for straightforward automated tests that can score how much of that context was being gleaned by the software.

Geman says that because of limitations of today’s data sets, computer vision researchers have been “teaching to the test,” for example by creating systems that simply try to detect whether or not a photograph contains a cat. “It’s time to raise the bar,” he says.

Geman concedes that none of the Deep Learning systems currently in use would be able to pass even rudimentary versions of his proposed test. Asked whether the Deep Learning methodology is robust enough to one day be able to handle the more complicated contexts and relationships that he is interested in, Geman says, “I think the jury is still out on that.”

The proposal by Geman and his colleagues comes at a time when the AI community seems interested in developing better ways of measuring its progress. For example, researchers at a recent conference in Austin, Texas, attempted to come up with a replacement for the well-known Turing Test. They plan to continue their discussions in July in Buenos Aires, at the International Joint Conference on Artificial Intelligence.

Gary Marcus, an NYU researcher who is coordinating those conference sessions, says that while no single test can determine everything about intelligence, the approach being taken by Geman and others is “a very nice, tractable step in the right direction” in a field that “desperately needs a strong set of new challenges that lead to more sophisticated systems.”
