Autonomous Security Bots Seek and Destroy Software Bugs in DARPA Cyber Grand Challenge

The mission: to detect and patch as many software flaws as possible. The competitors: seven dueling supercomputers, each about the size of a large vending machine, emblazoned with a name like Jima or Crspy, and programmed by expert hacker teams to autonomously find and fix exploitable bugs.

These seven “Cyber Reasoning Systems” took the stage on Thursday for DARPA’s Cyber Grand Challenge at the Paris Hotel and Conference Center in Las Vegas, Nev. They were competing for a $2 million grand prize in the world’s first fully autonomous “Capture the Flag” tournament. After eight hours of grueling bot-on-bot competition, DARPA declared a system named Mayhem, built by Pittsburgh, Pa.-based ForAllSecure, the unofficial winner. The Mayhem team was led by David Brumley. Xandra, produced by TECHx from GrammaTech and the University of Virginia, placed second to earn a $1 million prize, and Mechanical Phish by Shellphish, a student-led team from Santa Barbara, Calif., took third place, worth $750,000.

DARPA is verifying the results and will announce the official positions on Friday. The triumphant bot will then compete against human hackers in a “Capture the Flag” tournament at the annual DEF CON security conference. Though no one expects a reasoning system to win that challenge, the bot could find and fix some types of bugs more quickly than human teams.

DARPA hopes the competition will pay off by bringing researchers closer to developing software repair bots that could constantly scan systems for flaws or bugs and patch them much faster and more effectively than human teams can. The agency says quickly fixing such flaws across billions of lines of code is critically important. It could help to harden infrastructure such as power lines and water treatment plants against cyberattacks, and to protect privacy as more personal devices come online.

But no such system has ever been available on the market. Instead, teams of security specialists constantly scan code for potential problems. On average, it takes specialists 312 days to discover a software vulnerability and often months or years to actually fix it, according to DARPA CGC host Hakeem Oluseyi.

“A final goal of all this is scalability,” says Michael Stevenson, Mission Manager for the Deep Red team from Raytheon. “If [the bots] discover something in one part of the network, these are the technologies that can quickly reach out and patch that vulnerability throughout that network.” DARPA’s 2005 Grand Challenge jumpstarted corporate and academic interest in autonomous cars.

This visualization shows network traffic flows for the bot Rubeus as it receives verification of software bugs from competitors. Image: DARPA

The teams were not told what types of defects their systems would encounter in the finale, so their bots had to reverse engineer DARPA’s challenge software, identify potential bugs, run tests to verify those bugs, and then apply patches that wouldn’t cause the software to run slowly or shut down altogether.

To test the limits of these Cyber Reasoning Systems, DARPA planted software bugs that were simplified versions of famous security flaws, such as the ones exploited by the Morris worm and the Heartbleed bug. Scores were based on how quickly and effectively the bots deployed patches and verified competitors’ patches, and bots lost points if their patches slowed down the software. “If you fix the bug but it takes 10 hours to run something that should have taken 5 minutes, that’s not really useful,” explains Corbin Souffrant, a Raytheon cyber engineer.

Members of the Deep Red team described how their system accomplished this in five basic steps: First, their machine (named Rubeus) used a technique called fuzzing to overload the program with data and cause it to crash. Then, it scanned the crash results to identify potential flaws in the program’s code. Next, it verified these flaws and looked for potential patches in a database of known bugs and appropriate fixes. It chose a patch from this repository and applied it, and then analyzed the results to see if it helped. For each patch, the system used artificial intelligence to compare its solution with the results and determine how it should fix similar bugs in the future.
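The fuzz, verify, and patch loop described above can be illustrated with a toy sketch. This is not how Rubeus works internally; it is a minimal Python example written under stated assumptions. The target function `parse_record`, its planted flaw, the candidate fix `parse_record_patched`, and the helpers `mutate`, `fuzz`, and `patch_holds` are all hypothetical names invented for illustration.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target with a planted flaw: it trusts the length byte."""
    if not data:
        return b""
    length = data[0]
    payload = data[1:]
    # Bug: nothing checks that `length` fits inside `payload`.
    return bytes(payload[i] for i in range(length))


def parse_record_patched(data: bytes) -> bytes:
    """Hypothetical candidate fix: clamp the declared length to the bytes present."""
    if not data:
        return b""
    length = min(data[0], len(data) - 1)
    return data[1:1 + length]


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Either overwrite one random byte or truncate the input at a random point."""
    buf = bytearray(seed)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf) + 1):]
    return bytes(buf)


def fuzz(target, seed: bytes, rounds: int = 500, rng_seed: int = 0) -> list:
    """Return the mutated inputs that made `target` raise an exception."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(rounds):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes


def patch_holds(patched, crashes) -> bool:
    """Replay every crashing input against the patched function."""
    for data in crashes:
        try:
            patched(data)
        except Exception:
            return False
    return True


if __name__ == "__main__":
    seed = b"\x03abc"  # a well-formed record: length byte 3, payload "abc"
    crashes = fuzz(parse_record, seed)
    print("crashing inputs found:", len(crashes))
    print("patch survives replay:", patch_holds(parse_record_patched, crashes))
```

A real Cyber Reasoning System works on compiled binaries rather than Python functions, and, as the scoring rules required, it must also confirm that a patch preserves normal behavior and performance. The sketch only replays the crashing inputs against the candidate fix.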

During the live competition, some bugs proved more difficult for the machines to handle than others. Several machines found and patched an SQL Slammer-like vulnerability within 5 minutes, garnering applause. But only two teams managed to repair an imitation of the crackaddr bug in Sendmail. And one bot, Xandra by the TECHx team, found a bug that the organizers hadn’t even intended to create.

Whether the competitors are humans or machines, it’s always nice to see the vanquished exhibit good sportsmanship in the face of a loss. As the night wound down, Mechanical Phish politely congratulated Mayhem on its first-place finish via the bots’ Twitter accounts.

The team behind the victorious Cyber Reasoning System will receive a US $2 million prize