UC Berkeley's AI-Powered Robot Teaches Itself to Drive Off-Road

This article was originally published on UC Berkeley’s BAIR Blog.

Look at the images above. If I asked you to bring me a picnic blanket to the grassy field, would you be able to? Of course. If I asked you to bring over a cart full of food for a party, would you push the cart along the paved path or on the grass? Obviously the paved path.

While the answers to these questions may seem obvious, today’s mobile robots would likely fail at these tasks: They would think the tall grass is the same as a concrete wall, and wouldn’t know the difference between a smooth path and bumpy grass. This is because most mobile robots think purely in terms of geometry: They detect where obstacles are, and plan paths around these perceived obstacles in order to reach the goal. This purely geometric view of the world is insufficient for many navigation problems. Geometry is simply not enough.

BADGR consists of a Clearpath Jackal mobile platform equipped with an NVIDIA Jetson TX2 computer, IMU, GPS, and wheel encoders. Forward-facing cameras, a 2D lidar, and a compass were added to the standard configuration. Photo: UC Berkeley

Can we enable robots to reason about navigational affordances directly from images? To explore that question, we developed a robot that can autonomously learn about physical attributes of the environment through its own experiences in the real world, without any simulation or human supervision. We call our robot learning system BADGR: the Berkeley Autonomous Driving Ground Robot.

BADGR works by:

Autonomously collecting data
Automatically labeling the data with self-supervision
Training an image-based neural network predictive model
Using the predictive model to plan into the future and execute actions that will lead the robot to accomplish the desired navigational task

Data Collection

BADGR autonomously collecting data in off-road (left) and urban (right) environments. Image: UC Berkeley

BADGR needs a large amount of diverse data in order to successfully learn how to navigate. The robot collects data using a simple time-correlated random walk controller. As the robot collects data, if it experiences a collision or gets stuck, it executes a simple reset controller and then continues collecting data.
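To make the idea of a time-correlated random walk concrete, here is a minimal sketch in Python. It is not BADGR's actual controller: the function names, the correlation coefficient, the noise scale, the speed range, and the reset maneuver are all assumptions chosen for illustration. The point it shows is that each steering command is a blend of the previous command and fresh noise, so the robot wanders smoothly instead of jittering.

```python
import math
import random


def correlated_random_walk(num_steps, correlation=0.9, noise_std=0.3, seed=0):
    """Yield (speed, steering) commands whose steering drifts smoothly over time.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    steering = 0.0
    for _ in range(num_steps):
        # First-order autoregressive update: keep most of the previous command
        # and mix in Gaussian noise, so consecutive commands are correlated.
        noise = rng.gauss(0.0, noise_std)
        steering = correlation * steering + math.sqrt(1.0 - correlation ** 2) * noise
        steering = max(-1.0, min(1.0, steering))  # clamp to a normalized range
        speed = rng.uniform(0.5, 1.0)
        yield speed, steering


def collect(num_steps, is_stuck, seed=0):
    """Log commands, switching to a hypothetical reset maneuver when stuck."""
    log = []
    for t, (speed, steering) in enumerate(correlated_random_walk(num_steps, seed=seed)):
        if is_stuck(t):
            # Assumed reset behavior: back up while turning, then keep exploring.
            log.append((t, -0.5, 1.0, "reset"))
        else:
            log.append((t, speed, steering, "explore"))
    return log
```

On a real robot, `is_stuck` would be driven by sensor readings (for example, a collision signal or wheel encoders reporting no motion); here it is just a callable so the sketch stays self-contained.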

Self-Supervised Data Labeling

Next, BADGR goes through the data and calculates labels for specific navigational events, such as the robot’s position and whether the robot collided or is driving over bumpy terrain, and adds these event labels back into the dataset. These events are labeled by having a person write a short snippet of code that maps the raw sensor data to the corresponding label. As an example, the code snippet for determining if the robot is on bumpy terrain looks at the IMU sensor and labels the terrain as bumpy if the angular velocity magnitudes are large.

We describe this labeling mechanism as self-supervised because although a person has to manually write this code snippet, the code snippet can be used to label all existing and future data without any additional human effort.
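A labeling snippet of the kind described above might look like the following sketch. The threshold value, the units, and the function name are assumptions; the article only states that terrain is labeled bumpy when the IMU's angular velocity magnitudes are large.

```python
import math

# Assumed threshold in rad/s; the article does not give a value.
BUMPY_THRESHOLD_RAD_S = 1.0


def label_bumpy(angular_velocities, threshold=BUMPY_THRESHOLD_RAD_S):
    """Return one boolean per IMU sample: True when the rotation rate is large.

    `angular_velocities` is a sequence of (wx, wy, wz) gyroscope readings.
    """
    labels = []
    for wx, wy, wz in angular_velocities:
        magnitude = math.sqrt(wx * wx + wy * wy + wz * wz)
        labels.append(magnitude > threshold)
    return labels
```

Because the function only reads logged sensor values, the same code can be rerun over every old and new trajectory, which is what makes the labeling free of further human effort.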

Neural Network Predictive Model

The neural network predictive model at the core of BADGR. Image: UC Berkeley

BADGR then uses the data to train a deep neural network predictive model. The neural network takes as input the current camera image and a future sequence of planned actions, and outputs predictions of the future relevant events (such as whether the robot will collide or drive over bumpy terrain). The neural network predictive model is trained to predict these future events as accurately as possible.
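The input/output contract described above implies a particular shape for each training example: one image, the actions that followed it, and the events those actions produced. The sketch below shows one way to slice a logged trajectory into such examples. The function name, the dictionary keys, and the fixed `horizon` are assumptions for illustration, and the network itself is omitted.

```python
def make_training_examples(images, actions, events, horizon):
    """Slice one logged trajectory into supervised examples.

    images[t], actions[t], and events[t] are the camera frame, the command
    executed at step t, and the event labels recorded at step t.
    """
    examples = []
    for t in range(len(images) - horizon):
        examples.append({
            "image": images[t],                          # current observation
            "actions": actions[t:t + horizon],           # the next `horizon` commands
            "targets": events[t + 1:t + 1 + horizon],    # events those commands led to
        })
    return examples
```

With a five-step trajectory and `horizon=2`, this yields three examples, the first pairing `images[0]` with `actions[0:2]` and `events[1:3]`.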

Planning and Navigating

BADGR predicting which actions lead to bumpy terrain (left) or collisions (right). Image: UC Berkeley

When deploying BADGR, the user first defines a reward function that encodes the specific task they want the robot to accomplish. For example, the reward function could encourage driving towards a goal while discouraging collisions or driving over bumpy terrain. BADGR then uses the trained predictive model, current image observation, and reward function to plan a sequence of actions that maximize reward. The robot executes the first action in this plan, and BADGR continues to alternate between planning and executing until the task is complete.
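The plan-then-execute loop can be sketched with a simple random-shooting planner: sample many candidate action sequences, score each one with the reward function applied to the model's predictions, and keep the best. This is a stand-in technique chosen for brevity, not necessarily the optimizer BADGR uses. The reward weights, the `predict` interface, and the toy model below are all assumptions.

```python
import math
import random


def reward(position, goal, p_collision, p_bumpy):
    """Assumed reward: approach the goal, penalize predicted collisions and bumps."""
    distance = math.hypot(goal[0] - position[0], goal[1] - position[1])
    return -distance - 10.0 * p_collision - 2.0 * p_bumpy


def plan(predict, image, goal, horizon=5, num_candidates=64, seed=0):
    """Random shooting: sample action sequences and return the highest-scoring one.

    `predict(image, actions)` must return one (position, p_collision, p_bumpy)
    tuple per future step; this interface is hypothetical.
    """
    rng = random.Random(seed)
    best_actions, best_score = None, float("-inf")
    for _ in range(num_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        predictions = predict(image, actions)
        score = sum(reward(pos, goal, pc, pb) for pos, pc, pb in predictions)
        if score > best_score:
            best_actions, best_score = actions, score
    return best_actions


def toy_predict(image, actions):
    """Stand-in for the trained model: steering left (negative) is 'bumpy'."""
    x, y, out = 0.0, 0.0, []
    for steer in actions:
        x += 1.0
        y += steer
        out.append(((x, y), 0.0, 1.0 if steer < 0 else 0.0))
    return out
```

In a receding-horizon loop, only the first element of the returned sequence would be executed; the robot would then capture a new image and call `plan` again, repeating until the goal is reached.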

In our experiments, we studied how BADGR can learn about physical attributes of the environment at a large off-site facility near UC Berkeley. We compared our approach to a geometry-based policy that uses lidar to plan collision-free paths. (Note that BADGR only uses the onboard camera.)

BADGR successfully reaches the goal while avoiding collisions and bumpy terrain, while the geometry-based policy is unable to avoid bumpy terrain. Image: UC Berkeley

We first considered the task of reaching a goal GPS location while avoiding collisions and bumpy terrain in an urban environment. Although the geometry-based policy always succeeded in reaching the goal, it failed to avoid the bumpy grass. BADGR also always succeeded in reaching the goal, and succeeded in avoiding bumpy terrain by driving on the paved paths. Note that we never told the robot to drive on paths; BADGR automatically learned from the onboard camera images that driving on concrete paths is smoother than driving on the grass.

BADGR successfully reaches the goal while avoiding collisions, while the geometry-based policy is unable to make progress because it falsely believes the grass is an untraversable obstacle. Image: UC Berkeley

We also considered the task of reaching a goal GPS location while avoiding both collisions and getting stuck in an off-road environment. The geometry-based policy almost never crashed or became stuck on grass, but it sometimes refused to move because it was surrounded by grass, which it incorrectly labeled as an untraversable obstacle.

BADGR almost always succeeded in reaching the goal, avoiding collisions and not getting stuck, while not falsely predicting that all grass was an obstacle. This is because BADGR learned from experience that most grass is in fact traversable.

BADGR’s navigation capability improves as it gathers more data. Image: UC Berkeley

In addition to being able to learn about physical attributes of the environment, a key aspect of BADGR is its ability to continually self-supervise and improve the model as it gathers more and more data. To demonstrate this capability, we ran a controlled study in which BADGR gathers and trains on data from one area, moves to a new target area, fails at navigating in this area, but then eventually succeeds in the target area after gathering and training on additional data from that area.

BADGR navigating in novel environments. Images: UC Berkeley

This experiment not only demonstrates that BADGR can improve as it gathers more data, but also that previously gathered experience can actually accelerate learning when BADGR encounters a new environment. And as BADGR autonomously gathers data in more and more environments, it should take less and less time to successfully learn to navigate in each new environment.

We also evaluated how well BADGR navigates in novel environments—ranging from a forest to urban buildings—not seen in the training data. This result demonstrates that BADGR can generalize to novel environments if it gathers and trains on a sufficiently large and diverse dataset.

The key insight behind BADGR is that by autonomously learning from experience directly in the real world, BADGR can learn about navigational affordances, improve as it gathers more data, and generalize to unseen environments. Although we believe BADGR is a promising step towards a fully automated, self-improving navigation system, there are a number of open problems that remain: How can the robot safely gather data in new environments, or adapt online as new data streams in, or cope with non-static environments, such as humans walking around?

We believe that solving these and other challenges is crucial for enabling robot learning platforms to learn and act in the real world.

Gregory Kahn is a PhD candidate in the Berkeley AI Research (BAIR) Lab at UC Berkeley advised by Professor Sergey Levine and Professor Pieter Abbeel. His main research goal is to develop algorithms that enable robots to operate in the real world. His current research is on deep reinforcement learning for mobile robots.
