How JPL's Team CoSTAR Won the DARPA SubT Challenge: Urban Circuit Systems Track

Team CoSTAR’s journey from space robotics to a winning team of underground robots

Photo: Evan Ackerman/IEEE Spectrum

In 2017, a team at NASA’s Jet Propulsion Laboratory in Pasadena, Calif., was in the process of prototyping some small autonomous robots capable of exploring caves and subsurface voids on the Moon, Mars, and Titan, Saturn’s largest moon. Our goal was the development of new technologies to help us solve one of humanity’s most significant questions: is there or has there been life beyond Earth?

The more we study the surfaces of planetary bodies in our solar system, the more we are compelled to voyage underground to seek answers to this question. Planetary subsurface voids are not only one of the most likely places to find both signs of life, past and present, but thanks to the shelter they provide, are also one of the main candidates for future human habitation. While we were working on various technologies for cave exploration at JPL, DARPA launched the latest in its series of Grand Challenges, the Subterranean Challenge, or SubT. Compared to earlier events that focused on on-road driving and humanoid robots in pre-defined disaster relief scenarios, the focus of SubT is the exploration of unknown and extreme underground environments. Even though SubT is about exploring such environments on Earth, we can use the competition as an analog to help us learn how to explore unknown environments on other planetary bodies. 

From the beginning, the JPL team forged partnerships with four other institutions offering complementary capabilities to collectively address a daunting list of technical challenges across multiple domains in this competition. In addition to JPL’s experience in deploying robust and resilient autonomous systems in extreme and uncertain environments, the team also included Caltech, with its specialization in mobility, MIT, with its expertise in large-scale mapping, and KAIST (South Korea) and LTU (Sweden), experts in fast drones in underground environments. The more far-flung partnerships were the result of existing research collaborations, a typical pattern in robotics research. We also partnered with a range of companies who supported us with robot platforms and sensors. The shared philosophy of building Collaborative SubTerranean Autonomous Robots led to the birth of Team CoSTAR.

Our approach to the SubT Challenge

The SubT Challenge is designed to encourage progress in four distinct robotics domains: mobility (how to get around), perception (how to make sense of the world), networking (how to get the data back to the server by the end of the mission), and autonomy (how to make decisions). The competition rules and structure reflect meaningful real-world scenarios in underground environments including tunnels, urban areas, and caves. 

Success in the SubT Challenge requires a holistic solution that balances coverage of each domain and recognizes how each is intertwined with the others. For example, the robots need to be small enough to travel through narrow passages, but large enough to carry the sensors and computers necessary to make autonomous decisions while navigating in perceptually-degraded parts of the course, meaning those that are dark, dusty, or smoke-filled. There’s also the challenge of power and energy: The robots need to be quick and energy-efficient to meet the endurance requirements and traverse multiple kilometers per hour in extreme environments. At the same time, autonomous onboard decision making and large-scale mapping are the biggest power demands. Such challenges are amplified on flying vehicles and require more dramatic trade-offs between flying time, size, and autonomous capabilities.

Our answer to this call for versatility is to present a team of AI-powered robots, comprising multiple heterogeneous platforms, to handle the various challenges of each course. To enable modularity, all our robots are equipped with the same modular autonomy software, called NeBula (Networked Belief-aware Perceptual Autonomy). NeBula is specifically designed to address stochasticity and uncertainty in various elements of the mission, including sensing, environment, motion, system health, and communication, among others. With a mix of wheeled, legged, tracked, and flying vehicles, our team relies on a decision-making process that translates mission specifications, risk, and time into strategies that adaptively prescribe which robot should be dispatched to which part of the course and when.

A team of robots with heterogeneous capabilities handles various challenges of an unknown extreme environment. Image: Team CoSTAR

Let the exploration begin!

The hallmark of Team CoSTAR’s first year leading up to the SubT Tunnel Circuit was a series of fast iterations through potential robot configurations. Every few weeks, we would make a major adjustment to our overall solution architecture based on what we learned in the previous iteration. These changes could be as major as changing our overall concept of operations, e.g., how many robots in what formation should be part of the solution. This required high levels of adaptability and agility in our solution development process and team culture.

Testing in representative environments offered us a crucial advantage in the competition. Our “local” test site (a four-hour drive from home) was an abandoned gold mine open to tourists called Eagle Mine. Its narrow passageways and dusty interior compelled us to invest in techniques for precise motion planning, dust mitigation, and flying in perceptually-degraded environments. For smaller-scale integration testing, we used what resources we had on the JPL campus. That meant setting up a series of inflatable tunnels in the Mars Yard, a dusty, rocky field used for rehearsing mobility sequences for Mars rovers. By joining multiple tunnels together, we could make test courses of varying lengths and widths, allowing us to make rapid progress, especially on our drones’ performance in dusty environments.

Team CoSTAR created a series of inflatable tunnels in the JPL Mars Yard to test specific autonomy capabilities without needing to travel to mines or caves. Photos: Team CoSTAR

Hybrid aerial-ground vehicles, platforms that roll or fly depending on obstacles in the local vicinity, were a major focus in the lead-up to our first test-run, the Systems Test and Integration eXercise (STIX), held by DARPA in Idaho Springs, Colorado, in April 2019. The robot that we developed, called Rollocopter, offers the potential for greater coverage of a given area, as it only flies when it needs to, such as to hop over a rubble pile. On flatter terrain, Rollocopter can travel in an energy-efficient ground-rolling mode. Rollocopter made its debut alongside a wheeled Husky robot, from Clearpath Robotics, at the STIX event, flying and driving in a sensing-degraded environment with high levels of dust.

The Rollocopter and Husky on their debut outing at the DARPA competition dry-run event called STIX. Photos: Team CoSTAR

Three months before the first scored SubT event, the Tunnel Circuit, DARPA revealed that the competition was to be held at a research coal mine in Pittsburgh, Pa. This mine appeared to have less dust, fewer obstacles, and wider passages than our test environments, but it was also more demanding due to its wet and muddy terrain and large, complex layout. This was a big surprise for the team, and we had to shift all kinds of things around as fast as we could. Fortunately, the muscle memory from our rapid development cycles prepared us to make a dramatic adjustment to our approach. Given the level of mud in a typical coal mine and challenges it imposes on rolling, we decided (with heavy hearts) to shelve the Rollocopters and focus on wheeled platforms and traditional quadcopters. Even though our robots are just machines, we do build up a sort of relationship with them, coming to know their quirks as we coax them to life. In light of this, the emotional pain of shelving a project can be quite acute. Nevertheless, we recognize that decisions like this are in service of the team’s broader goals, and our hope is that we’ll be able to bring these hybrid aerial-ground vehicles back in a different environment.

We not only had to rework our robot fleet before the SubT Tunnel Circuit, but also had to reassess our test plan: With no coal mines on the West Coast, we instead began scouting for coal mines in West Virginia, which lies in the same geological tract as the competition site. On the advice of one of our interns studying at West Virginia University, we contacted a small tourist mine in Beckley, W.Va., called the Beckley Exhibition Coal Mine. We cold-called the mine, explaining (through mild disbelief) that we were from NASA and wanted to cross the country to test our robots in their mine. To our surprise, the town had a longstanding association with NASA. During our reconnaissance visit, the manager of the mine told us the story of local figure Homer Hickam, whose book about becoming a NASA engineer from this humble coal mining town went on to inspire the film October Sky. We were heartily welcomed.

In the month before the Tunnel event, we shipped all our robots to Beckley, where we kept a bruising cadence of day and night testing. By day, we went to locations such as Arch Mine, an active coal mine whose tunnels were 900 feet underground, and the Mine Safety and Health Administration (MSHA) facility, which had indoor mock-ups of mine environments complete with smoke simulators to train rescue personnel. By night, we ran tests in the Beckley tourist mine after the day’s tours were complete. We were working long hours, which demanded both mental and physical endurance: Every excursion involved a ritualistic loading and unloading of dozens of equipment boxes and robots, allowing us to set up shop anywhere with a power outlet. The discipline of practicing our pit crew roles in these settings paid off as the Tunnel Circuit began.

Team CoSTAR and our coal miner partners 900 feet underground at Arch Mine, an active coal mine in West Virginia. The team was testing robots in extreme environments representative of what we expected to find in the Tunnel Circuit event. Photos: Team CoSTAR

As the Tunnel Circuit event began, we noticed on the DARPA live stream that Team Explorer (a partnership between Carnegie Mellon and Oregon State University) was using some kind of device on a tripod at the mine entrance. Googling it, we learned that this was called a total station, a precision instrument normally used for surveying. Impressed at this team’s innovative application of such a tool to an unusual task, we decided to entertain even the most outlandish proposals for improving our performance and began trying to find a total station of our own before our next scored run, which was only two days away. This was a great way to maximize localization accuracy along the first ~80 meters of featureless mine entry tunnels, while the robot is still visible from the starting location. There were no total station units to be found online with a fast enough shipping time, so we worked the phones to see if there was one we could borrow. Within the next two days, we managed to borrow a device, watch lots of YouTube videos to teach ourselves how to use it, and write and test code to integrate it into our operations workflow and localization algorithms. This was one of the fastest, most fun, and most last-minute efforts by our team during the last two years. Our performance at the Tunnel Circuit led to a second-place finish among some of the best robotics teams in the world.

Preparing for the Urban Circuit

CoSTAR member and total station operator Ed Terry in a candid moment on the DARPA live stream at the first outing of the total station, following two intense days of on-the-fly integration of this system. Photo: Team CoSTAR

The Tunnel Circuit had shown us how important it was to test in realistic environments, and fortunately, finding test locations for Urban Circuit-like environments was much easier. With a fully-integrated system and team structure in place, we entered the second year of the SubT Challenge with momentum, which was essential with only five months to adapt to yet another type of environment. We framed our preparation around monthly capability milestone demonstrations, a gated process which allowed us to triage the technologies we should focus on. We took the opportunity to improve the rigor of our techniques for Simultaneous Localization and Mapping (SLAM) and planning under uncertainty, and to upgrade our computing power.

One of the major additions for the Urban Circuit was the introduction of multi-level courses, where the ability to traverse stairways was a prerequisite for accessing large portions of the course. To handle this, we added tracked robots to our fleet. Thanks to the modularity of the NeBula software framework and highly transferable hardware, we were able to go up and down stairs with our tracked robot in four months.

A mere eight weeks before the competition, we struck a partnership with Boston Dynamics to use their Spot legged robot, which arrived at our lab just before Christmas. It seemed too daunting a task to integrate Spot into our team in such a short time. However, for the team members who volunteered to work on it over the Christmas break, the chance to be given the keys to such an advanced robot was a sort of Christmas present! Before Spot could become part of the robot family, we needed proof that it could integrate with the rest of our concept of operations, the NeBula autonomy software, and the NeBula autonomy hardware payload. Having verified these in the first two weeks, we were convinced that it was fit for the task. The team systematically added NeBula’s autonomy, perception, and communications modules over a matter of weeks. Because Spot boasts a payload capacity of up to 12 kg, we were able to equip it with the high levels of autonomy and situational awareness that allowed us to fully add it to our robot fleet only two weeks prior to the competition.

Spots equipped with the NeBula autonomy and perception payload. Photos: Team CoSTAR

As we pushed Spot to traverse extreme terrains, we attached it to an elaborate rope system devised to save our precious robot when it fell. This was a precautionary measure to help us learn Spot’s limits with its unique payload configuration. After several weeks of refining our procedures for reliable stair climbing and building up confidence in the robot’s autonomy performance, we did away with the tether just one week before the competition.

A tethered Spot preparing for stair climbing trials. Photo: Team CoSTAR

Our robots go to school

Shortly after the Urban Circuit competition location was revealed to be in the small town of Elma, Washington, we emailed Elma High School asking if they were open to NASA testing its robots in their buildings. In a follow-up phone call, a teacher reported that they thought this original email was a scam! After we provided some more context for our request, they enthusiastically agreed to host us. In this way, we were able to not only test multi-level autonomy in complex building layouts but also to give the high-school students an inside look at a NASA JPL test campaign.

Each evening, after the students had left, we shifted our equipment and robots from our base in a hotel conference center to the school, and set up our command post in the cafeteria. The warm, clean, and well-lit school was a luxury compared to earlier field test settings in mines deep underground. Each night, we sought to cover more of the school’s complex layout: hallways, classrooms, and multiple sets of stairs. These mock runs taught us as much about the behavior of the robot team as they did about the human team, especially as everyone found ways of dealing with sustained fatigue. We typically kept practicing in the school until well after midnight, thanks to the flexibility and generosity of the staff. At one stage, we were concerned that tethering our legged robots to the stairs would chip the paintwork, but the staff said, “Don’t worry about it, we need to repaint it sometime anyway!” We would periodically have visitors from the school, our hotel, and even local restaurants, whose encouragement kept our spirits high despite the long hours.

A birthday celebration for one of the CoSTAR team members during competition week at our testing site. Photo: Team CoSTAR

Our first SubT Urban Circuit run was scheduled for the second day of the competition, which gave us a chance to watch the first day of the DARPA live stream. We noticed a down staircase right next to the starting gate of the Alpha course. One team member mentioned offhandedly that evening that we should try throwing a communications node into the staircase as a low-risk way of expanding our communications range. Minutes later, we started making phone calls to our hosts at Elma High School. The following morning at 7 a.m., one of the Elma school teachers arrived with a box full of basketballs and volleyballs. With these raw materials, we set about making a protective shield for the communications node to help it survive bouncing down several flights of stairs. One group started chipping away at the foam volleyballs while another set about taping together basketballs into a tetrahedron.

By 9 a.m., we had produced a hollowed-out foam volleyball with a communications node embedded in it, wrapped with a rope tether. For the first (and last) time in our team’s history, we assigned a job based on athletic ability. We chose well, and our node-in-a-ball thrower stood outside of the course and launched the node cleanly over the stairway bannister, allowing us to then gently lower it down on the tether. In the end, we didn’t need the extra range provided by the node-in-a-ball as our robots were able to come back into the communication range at the bottom of the staircase without any help. 

Our node-in-a-ball in action: To expand our robots’ communications range, we threw a communications node embedded in a hollowed-out foam volleyball down a staircase. Image: Team CoSTAR

Over a 60-minute scored run, only one human supervisor stationed outside the course can see information from within the course, and only if and when a communication link is established. In addition, a pit crew of up to nine people may assist in running checklists and deploying robots prior to the start of the mission. As soon as the robots enter the course itself, the team must trust that the hardware and autonomy software are sound while remaining ready to respond to inevitable anomalies. In this respect, the group starts to resemble an elite sports team, running a to-the-minute routine.

With holes in the floor, rubble piles, and water slicks, the Urban course put our robots through their paces. As the robots moved deeper into the course and out of communications range with the human supervisor, all we could do was rely on the robots’ autonomy. On the first day, the team was startled by repeated banging and crashing noises from within the course. With an unknown number of staircases, we feared the worst: That a wheeled rover had driven itself over the edge. To our relief, the sound was just from small wooden obstacles that the robot was casually driving over. 

Our days were structured around preparing for either test runs or scored runs, followed by a post-run debrief and then many hours poring over gigabytes of collected data and making bug fixes. We cycled through the pizza-subs-burgers trifecta multiple times, which spanned the culinary options available in Elma. Before beginning a run, we ran a “smoke test” of each robot in which we drove it 2 meters autonomously to verify that every part of the pipeline was still functional. We had checklists for everything, including a checklist item to pack the checklist itself and even to make sure the base station supervisor was in the car with us. These strict procedures helped guard against mistakes, which became more likely the longer we worked.

Every run revealed unexpected edge cases for mobility and autonomy that we had to rapidly address each night back at the hotel. We split the hotel conference center into a development zone and a testing zone. In the latter, we installed a test course configuration that would rotate on a daily basis, depending on what was the most pressing issue to solve. The terrain on the real course was extremely challenging, even for legged robots. In each of the first two scored runs, we lost one of our Spot robots to various negative obstacles such as holes in the ground. In a matter of hours after each run, the hardware team built reconfigurable barriers and a wooden stage with variable-size negative obstacles to test the resiliency of obstacle detection and avoidance strategies. After implementing these fixes, we transported the robots to the hotel to organize and prepare our fleet, which stoked the curiosity of fellow guests.

And the winner is…

Going into the final day of the competition, we were tied with Team Explorer. All of the parameter tuning, debugging, and exploration strategy refinements came together in time for the last round. Capping off a 1.5-year effort, we sent our robots into the SubT Urban Course for the final time. The wheeled Huskies led the way to build a communications backbone and explore the ground floor, with the legged Spots following behind to take the stairs to other levels. 

To score even a single point, a chain of events needs to happen flawlessly. Firstly, a robot needs to have covered enough space, traversing mobility-stressing and perceptually-degraded course elements, to reach an area that has an artifact. Multiple camera video streams as well as non-visual sensors are analyzed by the NeBula machine learning framework running on the robot to detect these artifacts. Once detected, an artifact’s location must be estimated to within 5 meters of the true location defined by DARPA with respect to a calibration target at the course entrance. Finally, the robot needs to bring itself back into communication range to report the artifact location within the 60-minute window of mission duration. A critical part of accomplishing the mission in this scenario is a decision-making module that can take into account the remaining mission time, predictive mission risk, as well as chances of losing an asset, re-establishing communication, and retrieving the data. It’s a delicate balance between spending time exploring to find as many artifacts as possible, and making sure that artifact locations can be returned to base before time runs out.
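
The last two links in that chain (a location estimate within 5 meters, delivered within the 60-minute window) can be written as a simple check. The following is a minimal sketch, not DARPA's scoring code or NeBula's implementation: the function name `report_scores`, the coordinate convention (meters in the frame of the calibration target), and the use of straight-line 3D distance are illustrative assumptions. Only the 5-meter threshold and the 60-minute window come from the description above.

```python
import math

MAX_LOCATION_ERROR_M = 5.0    # from the text: within 5 meters of the true location
MISSION_DURATION_S = 60 * 60  # from the text: 60-minute mission window


def report_scores(estimated_xyz, true_xyz, report_time_s):
    """Return True if a hypothetical artifact report would count.

    Positions are (x, y, z) tuples in meters, expressed relative to the
    calibration target at the course entrance (an assumed convention).
    report_time_s is the time, in seconds since mission start, at which
    the report reached the base station.
    """
    # Straight-line 3D error between estimate and ground truth (assumption).
    error_m = math.dist(estimated_xyz, true_xyz)
    # A report only counts if it arrives before the mission clock runs out.
    in_time = 0 <= report_time_s <= MISSION_DURATION_S
    return error_m <= MAX_LOCATION_ERROR_M and in_time
```

Under these assumptions, a robot that localizes an artifact perfectly but regains communication a minute after the deadline scores nothing, which is why the return-to-communication decision matters as much as detection accuracy.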

With only 40 report submissions allowed for 20 placed artifacts, our strategy was to collect as much information as possible before submitting artifact reports. This approach of maximizing the autonomous coverage of the space meant that a substantial amount of time could go by without hearing from a robot that may be out of communications range. This made for a tense dynamic as the clock ticked down. With only 15 minutes to go in the last run, we had scored just 2 points, which would have been our lowest score of the entire competition. It didn’t make sense: We had covered more ground than in all prior runs, but without the points to show for it. We were praying that the robots would prevail and come back into communication range before the clock ran out. Within the final 15 minutes, the robots started to show up one by one, delivering their locations of the artifacts they’d found. Submitting these incoming reports, our score increased rapidly to 9, turning the mood from despairing to jubilant, as we posted our best score yet.
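
To make the submission budget concrete, here is one simple way a limited number of reports could be spent: rank candidate detections by confidence and submit only as many as the remaining budget allows. This is an illustrative sketch, not the procedure our team used. The function `select_reports`, the `(artifact_id, confidence)` tuple format, and the confidence-ranking rule are all assumptions; only the limit of 40 submissions comes from the text.

```python
REPORT_LIMIT = 40  # from the text: report submissions allowed per run


def select_reports(candidates, submissions_used, limit=REPORT_LIMIT):
    """Pick which candidate reports to submit under a fixed budget.

    candidates is a list of (artifact_id, confidence) tuples (a
    hypothetical format). Returns the highest-confidence candidates,
    capped at the number of submissions still available.
    """
    remaining = max(0, limit - submissions_used)
    # Highest-confidence detections first (an assumed ranking rule).
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return ranked[:remaining]
```

With 38 submissions already used, for example, only the two most confident of any newly returned candidates would be submitted.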

With only 15 minutes left in the mission, our autonomous robots returned to communication range to deliver the scored artifacts, turning the mood from despairing to jubilant. Image: Team CoSTAR

As the pit crew emerged from the course to meet the above-ground team, there was a flurry of breathless communication, and the confusion allowed for one small prank. One of the pit crew members took our team lead aside and successfully convinced him and the above-ground team, for a minute or two prior to the formal announcement, that we had only scored two points! At the same time, we were being ushered over to a pop-up TV studio where we gathered before the camera for the final scores to be revealed. The scores flashed up on the screen showing us scoring 9 points and in first place. The surprised faces of our pranked team members were priceless! For the entire team, the exhaustion, frustration, and dedication that we had given to the task dissolved in a moment of elation.

The team reacts to the final scores being revealed. Image: Team CoSTAR

While there is a healthy spirit of competition among the teams, we recognize that this challenge remains an unsolved problem and that we as a robotics community are collectively redefining the state of the art. In addition to the scored runs, we appreciated the opportunity to learn from the extraordinary variety of solutions on display from other teams. Both the formal knowledge exchange and the common experience of taking on the SubT Urban course enhanced the feeling of shared advancement.

Left: CoSTAR T-Rex, played by our field test lead, John Mayo, meets the team right after the final scored run; right: DARPA award ceremony. Photos: Team CoSTAR

Post-competition and COVID-19

After the Urban competition, the COVID-19 pandemic set in and JPL shifted part of its focus and resources towards pandemic-related research, producing the VITAL ventilator in 37 days. Having served us faithfully during this competition, our robots earned some time to recuperate (with proper PPE), but they will soon be pressed into service once more. We are in the process of equipping them with UV lights to sterilize hospitals and the JPL campus, which reinforces the growing role robots are playing in applications where no human should venture.

CoSTAR robots recuperating with proper PPE. Photo: Team CoSTAR

While the DARPA Cave Circuit in-person competition is another victim of COVID-19 restrictions, the team is continuing to prepare for this new environment. Supported by NASA’s Science Mission Directorate (SMD), the team is focusing on searching for biological signs and resources in Martian-analog lava tubes in Northern California. On a parallel track, our team is leveraging these capabilities to form mission concepts and autonomy solutions for lunar exploration to support the vision of NASA’s Artemis program. This will in turn help refine our traversability, navigation, and autonomy solutions for the tough environments to be found in the final round of the DARPA Subterranean Challenge in late 2021.

A Spot robot in Martian-analog extreme terrain and lava tubes. Tests were conducted in Lava Beds National Monument, Tulelake, Calif. Image: Team CoSTAR

Edward Terry is a robotics engineer and CoSTAR team member. He studied aeronautical engineering at the University of Sydney and completed the Master of Science in Robotic Systems Development at Carnegie Mellon University. In Team CoSTAR, his focus is on object detection and localization under perceptually-degraded conditions.

Fadhil Ginting is a robotics visiting student researcher at NASA’s Jet Propulsion Laboratory. He completed his master’s in robotics, systems, and control at ETH Zurich. In Team CoSTAR, his focus is on learning and decision making for autonomous multi-robot systems.

Ali Agha is a principal investigator and research technologist at NASA’s Jet Propulsion Laboratory. His research centers on autonomy for robotic systems and spacecraft, with a dual focus on planetary exploration and terrestrial applications. At JPL, he leads Team CoSTAR. Previously, he was with Qualcomm Research, leading the perception efforts for autonomous drones and robots. Prior to that, Dr. Agha was a postdoctoral researcher at MIT. Dr. Agha was named a NASA NIAC fellow in 2018.
