Custom Processor Speeds Up Robot Motion Planning by Factor of 1,000

A preprogrammed FPGA can take motion planning from frustrating to instantaneous

Jaco robot arm doing motion planning with custom FPGA processor
Image: Duke Robotics
Advertisement

If you’ve ever seen a live robot manipulation demo, you’ve almost certainly noticed that the robot probably spends a lot of time looking like it’s not doing anything. It’s tempting to say that the robot is “thinking” when this happens, and that might even be mostly correct: odds are that you’re watching some poor motion-planning algorithm try and figure out how to get the robot’s arm and gripper to do what it’s supposed to do without running into anything. This motion planning process is both one of the most important skills a robot can have (since it’s necessary for robots to “do stuff”), and also one of the most time and processor intensive. 

At the RSS 2016 conference this week, researchers from the Duke Robotics group at Duke University in Durham, N.C., are presenting a paper about “Robot Motion Planning on a Chip,” in which they describe how they can speed up motion planning by three orders of magnitude while using 20 times less power. How? Rather than using general purpose CPUs and GPUs, they instead developed a custom processor that can run collision checking across an entire 3D grid all at once.

As someone who’s often afflicted by robot demos, my first reaction on seeing something like this would be to assume that all of these trajectories were pre-planned. They’re not, of course: the motion planning is being performed in real time, in about a millisecond. Typical planning time for a task like this might be several hundred milliseconds (or occasionally up to a second), so the speed increase here is massive.

To understand what’s going on here, take a look at the figure below from the paper, which illustrates a probabilistic roadmap, or PRM. A PRM is a graph of connections between all the points in a 2D or 3D space that don’t involve going through an obstacle. To motion plan, all you have to do is put a start position and an end position into the PRM, and figure out the most efficient path to follow through the lines of the graph (the “edges” of the PRM) to get from one to the other:

In practice, however, you also have to consider the fact that you’re dealing with a physical robot arm, and that when the arm’s gripper (the bit you care about) moves from one place to another, the rest of the arm (which you don’t care about as much) also has to move, and may run into things that aren’t anywhere near the gripper. The area that a robot arm moves through is called the “swept volume,” and it looks like the image below.

This is what motion planning algorithms struggle with most: determining whether a swept volume will result in a collision with an obstacle. The researchers say one study showed that collision detection like this “consumed 99 percent of the compute time of a sample-based motion planner,” because it requires making “a geometric model of the swept volume of each robot movement and [testing] this model against a geometric model of each obstacle.” 

To streamline this process, researchers at Duke used a combination of “aggressive precomputation and massive parallelism.” Before we get to the parallelism (which is where the custom processor comes in), let’s go through the aggressive precomputation, which is arguably just as important. When you first set up your robot, you also generate a single, massive PRM that consists of something like 150,000 edges that represent possible robot motions, while avoiding self-collisions and collisions with permanent features like the table or shelf or whatever is permanently in its workspace. This PRM should contain all of the information that you need to plan any path you want, assuming an empty workspace.

Unfortunately, 150,000 edges is way too many to work with all at once. A more reasonable number is closer to 1,000, so the researchers had to find a way to prune the PRM down:

We also use a scenario distribution: a distribution over problems (start state, end goal, and obstacles). The robot is expected to have to solve problems drawn from the scenario distribution, and the distribution itself expresses the structure inherent in the problem. A scenario distribution is often reasonably easy to describe for real-life applications; for example, any fixed obstacles are present with probability 1, and we expect the robot broadly to have goals that are in front of it and in a certain height range, rather than immediately behind its shoulder joint. We must therefore optimize the PRM to maximize the probability of solving problems drawn from the scenario distribution.

Basically, the researchers are looking at motion planning problems in general and saying, “okay, I’m not going to need to plan for any motion, because my robot is designed to perform tasks within a specific kind of scenario.” So they simulate a bunch of those scenarios (10,000 of them)  with different numbers of randomly located obstacles of variable size, and then check to see which edges in the PRM get used for planning. Edges that are infrequently used for planning get dropped from the PRM, and after a couple of iterations of pruning and retesting, the PRM in the video above was reduced to fewer than 1,000 edges without affecting its ability to solve sample motion planning problems. 

Getting the number of edges in the PRM down to something manageable is important because of this custom processor that takes care of the planning process itself. The processor is what’s called an FPGA, or Field Programmable Gate Array, which is a special kind of computer chip that with a digital logic system inside that you can reconfigure yourself. To do motion planning, the Duke researchers configured an FPGA such that it contained a bunch of collision detection circuits, each one of which corresponds to one of the edges in the PRM. Each circuit accepts as an input the 3D location of a pixel in a depth image, and then outputs a single bit that reflects whether the edge that the circuit represents is in collision with the pixel’s location. If there’s a collision, you remove that edge from your PRM, and once you’ve run all of the pixels in your depth image through the FPGA, you’re left with a PRM consisting only of collision-free paths, and you just pick the shortest one for your robot to take, and that’s it. Motion planned.

The advantage of this system (and it’s an enormous advantage) is that all of those circuits in the FPGA can operate simultaneously, in parallel, on each pixel in your depth image. Throwing more depth pixels at the processor will result in a linear increase in the amount of time that it takes to process the data, but your PRM can have as many edges as you want, and the FPGA will still take just 50 nanoseconds per pixel to resolve all of those potential collisions. The motion-planning task in the video above took a little over 600 microseconds using FPGA-accelerated planning, and a little over 800,000 microseconds using a software-based PRM planner running on a quad-core Intel Xeon processor clocked at 3.5 GHz with 16 GB of RAM. A more complex problem that involved avoiding vertical obstacles took the FPGA-accelerated planner almost exactly the same amount of time, while the software planner took 2,738,000 microseconds. That’s the difference between watching a robot move almost instantaneously and staring at it while it stands motionless for nearly 3 seconds “thinking” before every move.

These are very impressive numbers, but as always, there are a few things to keep in mind. First, the  FPGA-accelerated planning depends on the existence of that first PRM (the one with all the precomputed simulations) that gets programmed into the FPGA, and creating that takes a bunch of time and effort, because it’s specific for each robot configuration. In fact, once you have that PRM, you can’t adjust the physical setup of your robot, or you’ll have to generate a new one from scratch. This also presents an issue if the robot picks up and tries to move an object that’s significantly larger than its gripper, because as far as the PRM is concerned, your robot’s gripper just increased in size, which messes up all the collision stuff. Also, the size of the PRM is constrained by the capacity of the FPGA itself: you can only have as many edges in your PRM as can be programmed into the FPGA, which works out to be a few thousand edges. You can easily solve this by switching to an application-specific integrated circuit (ASIC) instead, but then you lose the flexibility offered by the FPGA.

While the researchers acknowledge that “designing specialized hardware is an unusual step,” they point out that it’s likely worthwhile because motion planning “it is basic to robot movement, and therefore critical to do quickly and well.” These demos are just simple examples of what this hardware is capable of, and the researchers even suggest that motion planning “could be done in real-time in an environment with moving obstacles and/or goals.”

“Robot Motion Planning on a Chip,” by Sean Murray, Will Floyd-Jones, Ying Qi, Daniel Sorin, and George Konidaris from Duke Robotics, was presented this week at RSS 2016 in Ann Arbor, Mich.

Robotics News

Biweekly newsletter on advances and news in robotics, automation, control systems, interviews with leading roboticists, and more.

About the Automaton blog

IEEE Spectrum’s award-winning robotics blog, featuring news, articles, and videos on robots, humanoids, automation, artificial intelligence, and more.