Illustration of steampunk animals with gears and intelligent appearances.
Illustration: Squidoodle

Are today’s best artificial intelligence (AI) systems as smart as a mouse? A crow? A chimp? A new contest aims to find out. 

The Animal-AI Olympics, which will begin this June, aims to “benchmark the current level of various AIs against different animal species using a range of established animal cognition tasks.” At stake are bragging rights and US $10,000 in prizes.

The project, a partnership between the University of Cambridge’s Leverhulme Centre for the Future of Intelligence and GoodAI, a research institution based in Prague, is a new way to evaluate the progress of AI systems toward what researchers call artificial general intelligence.

While AI systems have recently bested humans in a host of challenging competitions, including the board game Go, the poker game Texas hold’em, and the video game StarCraft, these matchups only proved that AIs were astoundingly good at these particular tasks. AIs have yet to demonstrate the kind of flexible intelligence that enables humans to reason, plan, and act in many different domains.

To learn more about the Animal-AI Olympics, IEEE Spectrum spoke with Matthew Crosby, one of the contest’s organizers and a postdoctoral researcher at the Leverhulme Centre and at Imperial College London.

IEEE Spectrum: What was the genesis of this project?

Matthew Crosby: This idea came out of conversations with animal-intelligence researchers. You can take an animal, put it in an environment it’s never seen before, and give it a problem to solve, like getting through some contraption. Often the animal does solve the problem. Whereas if you train an AI to be great at a specific task, it doesn’t even make sense to put it in a new environment. It won’t even try to solve the problem. It just fails to behave.

Spectrum: What makes animal-cognition tests useful and interesting to AI researchers?

Crosby: A lot of animal cognition tests involve training the animal to take food from an apparatus. The researchers want to figure out if it’s succeeding because it’s clever and worked out how the apparatus works, or if it’s just repeating the pattern that it learned through trial and error. Is it succeeding through understanding or rote memorization?

We want to translate this to the AI arena and use these experiments to test for actual understanding of, say, the physics of an environment. Will the AI understand that if the food moves out of sight, it still exists?

Spectrum: Can you give me some examples of specific tests and tasks? 

Crosby: Well, the whole point of the competition is to test the AI on tasks it hasn’t seen before, so we can’t give away too much information. But we’ve picked out some examples to share, which are famous in the animal-intelligence literature. 

Intelligent chimp illustration
Illustration: Squidoodle

In one classic experiment, you put an animal in front of some upside-down, opaque cups. Under one cup, you put some food, and the animal’s job is to retrieve the food. At first you put food under the same cup every time, call it the A cup—that’s equivalent to the training phase in an AI environment. Then in the testing phase, you put the food under the A cup, then take it out, very visibly, and put it under the B cup. Some animals, like chimpanzees, will go straight to the B cup. But a lot of animals will still go to the A cup, because they’ve learned the task through memorization. 

We’re also drawing from Aesop’s Fables. This experiment was taken from a fable where a crow did exactly this. An apparatus with food is floating on water inside a test tube. The crow can’t reach down to get the food; it’s too far down. But the crow can learn to pick up rocks and put them in: The rocks displace the water, the water level rises, and eventually the food is up high enough so that the crow can get it out. In the experiment, you can have an environment with both rocks and pieces of cork, which float and don’t increase the water level at all. The crows learn to put the rocks in, not the cork.

Spectrum: It seems like these tests get at some pretty sophisticated aspects of intelligence, like generalizing knowledge and synthesizing new information. Maybe even creative problem solving.

Crosby: This project captures a lot of elements in AI research that are considered really hard. So far, we haven’t had benchmarks for them, because our benchmarks come from existing games that humans have played in the past. Here, we’re making tasks specifically to test things like generalization and transfer learning. Even if no one does incredibly well in the competition, it will still be useful. 

Spectrum: Is this competition intended partially to puncture the hype around AI? 

Crosby: There has been a lot of hype about AI. The successes are real, like AlphaGo beating the best human Go player in the world. But what that means for general intelligence is a lot harder for people to understand. A lot of the media reports on general intelligence are a bit overblown. 

It’s important to encourage skepticism. AI has made huge progress recently: There are problems that we can solve today that we couldn’t solve only three or four years ago. We just have to be careful about explaining what that means. An AI can be great at one task, but can it solve similar tasks that it hasn’t seen before? This competition is testing for exactly that kind of thing. Maybe we’ll be surprised by how well the AI agents do. But we think the problems we’re putting forward are very hard. 

Spectrum: What’s the procedure and schedule for this competition?

Crosby: We have about 50 tasks from the animal-intelligence literature now. In April, we’ll put out the full packets of information about the competition. In June, the competition goes live: We’ll release everything, and people can start working on it. We’ll release lots of training environments with lots of objects, so the agents will know all the environments and objects; all the details will be there for them to learn from. But it’s a generalization challenge, so in the tests they’ll have to use the objects in different ways. In December, we’ll have the final results.

Spectrum: Do you think successful AI agents will have to display common sense?

Crosby: There are research groups working on teaching AI an intuitive understanding of physics: What are the rules of the world it’s living in? I hope the people who are working in that area will enter this competition. They’re doing things that are really interesting, but may not have a test-bed that enables them to say, “We’re making good progress here.” I’m hoping they’ll hear about the Animal-AI Olympics and think, “This is our time to shine.”


Will AI Steal Submarines’ Stealth?

Better detection will make the oceans transparent—and perhaps doom mutually assured destruction

A photo of a submarine in the water under a partly cloudy sky.

The Virginia-class fast attack submarine USS Virginia cruises through the Mediterranean in 2010. Back then, it could effectively disappear just by diving.

U.S. Navy

Submarines are valued primarily for their ability to hide. The assurance that submarines would likely survive the first missile strike in a nuclear war and thus be able to respond by launching missiles in a second strike is key to the strategy of deterrence known as mutually assured destruction. Any new technology that might render the oceans effectively transparent, making it trivial to spot lurking submarines, could thus undermine the peace of the world. For nearly a century, naval engineers have striven to develop ever-faster, ever-quieter submarines. But they have worked just as hard at advancing a wide array of radar, sonar, and other technologies designed to detect, target, and eliminate enemy submarines.

The balance seemed to turn with the emergence of nuclear-powered submarines in the early 1960s. In a 2015 study for the Center for Strategic and Budgetary Assessments, Bryan Clark, a naval specialist now at the Hudson Institute, noted that the ability of these boats to remain submerged for long periods of time made them “nearly impossible to find with radar and active sonar.” But even these stealthy submarines produce subtle, very-low-frequency noises that can be picked up from far away by networks of acoustic hydrophone arrays mounted to the seafloor.
