
Understanding Causality Is the Next Challenge for Machine Learning

Teaching robots to understand "why" could help them transfer their knowledge to other environments

The researchers' work uses the open-source TriFinger robotic platform in a simulated robotics manipulation environment.
Photo: Max Planck Institute For Intelligent Systems/MILA/ETH Zurich

“Causality is very important for the next steps of progress of machine learning,” said Yoshua Bengio, a Turing Award-winning scientist known for his work in deep learning, in an interview with IEEE Spectrum in 2019. So far, deep learning has comprised learning from static datasets, which makes AI really good at tasks related to correlations and associations. However, neural nets do not interpret cause and effect, or why these associations and correlations exist. Nor are they particularly good at tasks that involve imagination, reasoning, and planning. This, in turn, keeps AI systems from generalizing what they learn and transferring their skills to another, related environment.

The lack of generalization is a big problem, says Ossama Ahmed, a master’s student at ETH Zurich who has worked with Bengio’s team to develop a robotic benchmarking tool for causality and transfer learning. “Robots are [often] trained in simulation, and then when you try to deploy [them] in the real world…they usually fail to transfer their learned skills. One of the reasons is that the physical properties of the simulation are quite different from the real world,” says Ahmed. The group’s tool, called CausalWorld, demonstrates that with some of the methods currently available, the generalization capabilities of robots aren’t good enough—at least not to the extent that “we can deploy [them] safely in any arbitrary situation in the real world,” says Ahmed.

The paper on CausalWorld, available as a preprint, describes benchmarks in a simulated robotics manipulation environment using the open-source TriFinger robotics platform. The main purpose of CausalWorld is to accelerate research in causal structure and transfer learning using this simulated environment, where learned skills could potentially be transferred to the real world. Robotic agents can be given tasks that comprise pushing, stacking, placing, and so on, informed by how children have been observed to play with blocks and learn to build complex structures. There is a large set of parameters, such as weight, shape, and appearance of the blocks and the robot itself, on which the user can intervene at any point to evaluate the robot’s generalization capabilities.

In their study, the researchers gave the robots a number of tasks ranging from simple to extremely challenging, based on three different curricula. The first involved no environment changes; the second had changes to a single variable; and the third allowed full randomization of all variables in the environment. They observed that as the curricula got more complex, the agents showed less ability to transfer their skills to the new conditions.
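The three curricula can be pictured with a small Python sketch. This is an illustration only, not CausalWorld's actual API: the variable names, default values, ranges, and the `sample_environment` helper are all assumptions made up for this example.

```python
import random

# Hypothetical environment variables and ranges, chosen for illustration.
# They are NOT taken from CausalWorld itself.
DEFAULTS = {"block_mass": 0.1, "block_size": 0.05, "floor_friction": 0.5}
VARIABLE_RANGES = {
    "block_mass": (0.05, 0.5),
    "block_size": (0.03, 0.10),
    "floor_friction": (0.3, 1.0),
}


def sample_environment(curriculum, rng, variable="block_mass"):
    """Return the environment settings for one episode.

    curriculum: "fixed"  -> no environment changes
                "single" -> intervene on one variable only
                "full"   -> randomize every variable
    """
    env = dict(DEFAULTS)
    if curriculum == "fixed":
        return env
    if curriculum == "single":
        low, high = VARIABLE_RANGES[variable]
        env[variable] = rng.uniform(low, high)
        return env
    if curriculum == "full":
        for name, (low, high) in VARIABLE_RANGES.items():
            env[name] = rng.uniform(low, high)
        return env
    raise ValueError(f"unknown curriculum: {curriculum!r}")


if __name__ == "__main__":
    rng = random.Random(0)
    for curriculum in ("fixed", "single", "full"):
        print(curriculum, sample_environment(curriculum, rng))
```

An agent trained only on the "fixed" settings and then evaluated on episodes drawn from "single" or "full" would face the kind of shift the researchers describe.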

“If we continue scaling up training and network architectures beyond the experiments we report, current methods could potentially solve more of the block stacking environments we propose with CausalWorld,” points out Frederik Träuble, one of the contributors to the study. Träuble adds, “What’s actually interesting is that we humans can generalize much, much quicker [and] we don’t need such a vast amount of experience… We can learn from the underlying shared rules of [certain] environments…[and] use this to generalize better to yet other environments that we haven’t seen.”

A standard neural network, on the other hand, would require insane amounts of experience with myriad environments in order to do the same. “Having a model architecture or method that can learn these underlying rules or causal mechanisms, and utilize them could [help] overcome these challenges,” Träuble says.

CausalWorld’s evaluation protocols, say Ahmed and Träuble, are more versatile than those in previous studies because of the possibility of “disentangling” generalization abilities. In other words, users are free to intervene on a large number of variables in the environment, and thus draw systematic conclusions about what the agent generalizes to, or doesn’t. The next challenge, they say, is to actually use the tools available in CausalWorld to build more generalizable systems.

Despite how dazzled we are by AI’s ability to perform certain tasks, Yoshua Bengio, in 2019, estimated that present-day deep learning is less intelligent than a two-year-old child. Though the ability of neural networks to parallel-process on a large scale has given us breakthroughs in computer vision, translation, and memory, research is now shifting to developing novel deep architectures and training frameworks for addressing tasks like reasoning, planning, capturing causality, and obtaining systematic generalization. “I believe it’s just the beginning of a different style of brain-inspired computation,” Bengio said, adding, “I think we have a lot of the tools to get started.”

