Almost two years ago, a startup called Skydio posted some video of a weird-looking drone autonomously following people as they jogged and biked along paths and around trees. Even without much in the way of detail, this was exciting for three reasons. First, the drone was moving at a useful speed and not crashing into stuff, using only onboard sensing and computing. Second, the folks behind Skydio included Adam Bry and Abe Bachrach, who worked on high-speed autonomous flight at MIT before cofounding Project Wing at Google[x] (now just called X).
The third reason we were excited about Skydio’s drone was that, as much as it looked like a research project, it was actually designed to be commercialized, and today, Skydio is (finally!) announcing their first product: the R1, a fully autonomous flying camera. And before you think that you’ve seen flying cameras before, we promise you’ve never seen anything like the R1: as Bry told us two years ago, Skydio’s goal was “to provide a trustworthy and magical experience.” They’ve delivered.
Initially, Skydio sent us a couple different videos to show off the new R1. There’s a video of the drone autonomously taking video of someone playing tennis, along with a video of the drone autonomously following someone running along a track jumping over hurdles. I’ll be honest—those videos got me a little worried about what Skydio had come up with, because they looked like the kinds of videos that other drone companies like to use to show off basic autonomy in situations that are free of aerial complexity. If you’ve seen other autonomous drone demos shot out on lakes or on ski slopes, you know exactly what I’m talking about—these are ideal environments, without trees or clutter, where drones can perform at their best without being significantly challenged. I was kind of hoping for more magic from Skydio.
The third video from Skydio brings the magic, and more. Hold on to your socks, because this is amazing:
Look at that tree dodging! The bit at 2:32 was particularly incredible, with the R1 deftly maneuvering itself around a small clump of branches. To be clear, this is dynamic vision-based motion planning, without any pre-existing maps or beacons or anything like that. It’s a level of autonomy that’s way beyond any other consumer drone, and even most of the cutting-edge research that we’ve seen.
For another look at the R1’s capabilities, here’s a start-to-finish video from the drone’s perspective as it follows a human jogging and biking down a mountain, weaving in and out of trees as it goes:
And if you want even more, here’s the official launch video.
The system that Skydio uses for autonomous navigation on the R1 is entirely vision-based. There are 12 navigation cameras spaced all around the drone, including cameras that look down and up, and managing this massive influx of visual data is a 256-core Nvidia TX1 GPU. The R1 detects and avoids obstacles while tracking a specific person. It predicts where that person is going to go next, and combines that prediction with a safe trajectory around obstacles, all while somehow keeping its camera smoothly and consistently on its subject the entire time.
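To make that predict-then-plan loop concrete, here’s a deliberately minimal sketch of the idea (this is an illustration of the general technique, not Skydio’s actual algorithm, and every name and parameter here is hypothetical): extrapolate the subject’s next position from recent motion, then pick a waypoint that stays clear of obstacles while holding the desired follow distance.

```python
import math

def predict_subject(positions, dt=1.0):
    """Constant-velocity prediction: extrapolate the subject's last step."""
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    return (x1 + (x1 - x0) * dt, y1 + (y1 - y0) * dt)

def pick_waypoint(subject_pred, obstacles, candidates,
                  follow_dist=5.0, clearance=1.5):
    """Pick the candidate waypoint that best matches the desired follow
    distance while keeping a safety margin from every obstacle."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    safe = [c for c in candidates
            if all(dist(c, o) > clearance for o in obstacles)]
    return min(safe, key=lambda c: abs(dist(c, subject_pred) - follow_dist))

# Toy scenario: subject jogging along +x, one "tree" near a candidate.
track = [(0.0, 0.0), (1.0, 0.0)]
pred = predict_subject(track)                      # → (2.0, 0.0)
candidates = [(-3.0, 0.0), (-3.0, 2.5), (7.0, 0.0)]
obstacles = [(-3.0, 0.5)]
wp = pick_waypoint(pred, obstacles, candidates)    # → (7.0, 0.0)
```

A real system would search a continuous space of trajectories and fold in camera framing, but even this toy version shows why prediction matters: without `pred`, the planner could only chase where the subject used to be.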
This level of autonomy means that the R1 isn’t just hands-off, it’s mind-off. You can control it manually if you really want to (and it’ll help keep you from smashing into things), but it’s designed to be launched and forgotten about—Skydio expects that you’ll learn to trust the drone’s autonomy enough that you can let it loose and then more or less ignore it for the next 16-ish minutes while it films you doing whatever it is you do. I’m not sure that’s something that can be said about any other consumer drone. And then the R1 lands itself, and you have amazing footage, since the drone can capture a variety of different kinds of cinematic video: It can follow you, orbit you, film you from one side or the other, track the action from high above, or even do its best to stay in front of you as you’re moving, which is a neat trick.
The Skydio R1 has 13 cameras in all—12 navigation cameras spaced around its body, including cameras that look down and up, plus its main 4K camera. To process all the visual data, it uses a 256-core Nvidia TX1 GPU. Image: Skydio
Of course, whenever we talk about robots with real-world autonomy, we do our best to find out what their constraints are—what kinds of situations might challenge them or lead to expensive, crash-y problems. We asked Bry about this, and he was very straightforward about the R1’s capabilities:
A decent rule of thumb to use is human visual sensing—if you’re flying around very thin branches, or thin telephone lines, the R1 may not be able to see those. Very large glass surfaces can also be challenging. Other difficult cases are when you have big crowds of people. We try to handle all of these situations as gracefully as we can; in the worst case, the drone stops and notifies you about what’s going on.
Crowds of people can be tricky because the drone can lose you among all the other people in view, but Bry says that if you could spot yourself from the drone’s perspective, it has a decent chance of keeping track of where you are. As for obstacles, the R1 generally runs some risk of not detecting things that are smaller than about an inch, but since the detection is visual, it depends heavily on variables like the color of the object, the color of the background, the ambient lighting, and the speed of the drone. It’s hard to give a lower limit to object detection and avoidance with certainty, but Bry tells us that it’s probably safe to assume that the R1 cannot detect power lines, and it also doesn’t know how to handle moving obstacles. “If you throw a ball at it, it’s almost certainly not going to get out of the way,” Bry says. “At some point, we will solve it all, and it will just never hit anything of any size.”
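Bry’s “human visual sensing” rule of thumb can be made concrete with a back-of-the-envelope angular-size check (the camera parameters below are hypothetical round numbers, not the R1’s actual specs): an obstacle is only plausibly detectable if it spans at least a pixel or two on the sensor.

```python
import math

def pixels_subtended(obj_size_m, distance_m, hfov_deg=120.0, h_res_px=640):
    """Approximate how many horizontal pixels an object of a given size
    spans at a given distance, for a camera with the given field of view."""
    angle_rad = 2 * math.atan(obj_size_m / (2 * distance_m))
    px_per_rad = h_res_px / math.radians(hfov_deg)
    return angle_rad * px_per_rad

# A roughly 1-inch (2.5 cm) branch at 5 meters:
branch_px = pixels_subtended(0.025, 5.0)   # ~1.5 pixels: marginal
# A ~1 cm power line at the same distance:
wire_px = pixels_subtended(0.01, 5.0)      # well under 1 pixel
```

Under these assumed numbers, an inch-wide branch barely registers and a power line is sub-pixel, which lines up with Bry’s caveat that thin wires are effectively invisible to a vision-based system.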
The first batch of Skydio R1 drones will be the Frontier Edition, hand-made from aluminum and carbon fiber (among other things) at Skydio world headquarters and volcano lair in Redwood City, Calif. You can order one today, and it’ll ship to you within the next few weeks. Each drone comes with two batteries good for about 16 minutes of flight time each, along with 64 gigabytes of on-board storage for the 4k gimbaled camera. The price to own arguably the first fully autonomous and intelligent consumer drone in existence is US $2,500. It’s expensive, to be sure, but it’s also the very first in a category that it’s creating, and Skydio tells us that you can expect the R1 to get better at just about everything it does over time.
The Skydio R1 can follow you, orbit you, film you from one side or the other, track the action from high above, or even do its best to stay in front of you as you’re moving. Photo: Skydio
For many, many more details about Skydio and the R1, we spoke with co-founder Adam Bry.
IEEE Spectrum: You started Skydio four years ago. What made you decide back then that it was the right time to develop a consumer drone with this level of autonomy?
Adam Bry: After MIT, we got the opportunity to work at Google for a couple years, on Project Wing. At that time, we were seeing that people were starting to get excited about drones—four or five years ago was when, at least in tech circles, people started thinking about drones as a new technology platform. We felt like there were a lot of exciting concepts and potential commercial applications, and all of them in some form were going to revolve around autonomy to really work. The paradigm with existing products was that you basically needed to fly, so there was a gap between what people wanted to do with drones, and what was possible with existing tech. We knew a lot about the algorithms and the technologies that could enable autonomy, so that was the basic motivation.
With our experience as researchers, when we started Skydio we had a decent understanding of what the technology landscape could be, and based on that we made a pretty big bet on computer vision as a super powerful way for drones to navigate, because the pace of progress in both computer vision and machine learning-based techniques was fast then, and it’s gotten even faster since. I think that’s one of the big things that’s made it possible to bring this system to life.
How does the R1 compare to other drones that offer some level of autonomy?
[Autonomy] is a rapidly emerging category. It’s only in the last few years that it’s been possible to build something that navigates out in the real world using onboard sensing and computing. One of the most successful categories so far is robot vacuum cleaners, although it’s still within a very structured environment, with limited mobility and limited motion. Another category that’s parallel to what we’re doing is self-driving cars, and I think there’s a lot of overlap with technology there. But everything that you can actually use today is sort of in the driver assist category, and it’s fully reliant on having a driver who can take over if need be.
[In the drone space], the first thing I would say is that I think there are some really good products out there. But the thing they’ve been really optimized for is the manually flown experience. DJI has been super successful at that; they’ve done an amazing job on it, with all the aspects of manual control and getting the live video stream and recording video. If you look at the people that are using those products, that’s like 99 percent of what they’re doing with it, and it’s awesome. We think of the R1 as a different kind of use case—it’s used in a different way by different kind of people for different things. Autonomy is an emerging theme in this space, and DJI and others are talking about adding on these kinds of features, but so far, it’s kind of a sideshow to the main event. And if you look at their messaging, they position it as a pilot assist kind of thing, and the expectation is that there’s a pilot flying it.
There are a number of tradeoffs that we’ve made throughout the technology stack—in the hardware and sensors that we’ve picked, and in the way the software is built—to go for a fully autonomous experience. I would say that the key threshold [for the R1] is that you don’t have to pay attention to it. You can trust it to fly itself and capture the thing you want to capture, and that enables a different kind of use and will create a different kind of content.
One of Skydio’s early prototypes had a triangle rig with six cameras and used a media center computer. It was in service for the second half of 2015 for software development. Photo: Skydio
Do you think that the level of autonomy you’re focusing on at Skydio represents a difference in philosophy relative to other drone companies like DJI, or a difference in technical capability?
I think it’s a combination of both. There’s certainly a market for manually flown drones; that’s what [DJI] has been successful at, and that’s what they’ve been iterating on. They, and others, understand that autonomy is likely going to be important for certain kinds of things, but maybe haven’t fully committed to it. I would also say that [Skydio] has solved some really hard technical challenges to make the R1 possible. We have world-class researchers from a lot of the top academic labs in the world who are deep, deep experts in all the different ingredients that you need to put together an autonomous system, and if we were a research lab, we’d have a bunch of publications advancing the state-of-the-art in a number of different areas. We’re pushing the state-of-the-art to make this thing possible, and we’ve innovated in ways that other people have tried to, but so far haven’t been able to make work.
The R1 can track people, and anticipate how they’ll move. How does it do that?
This turns out to be one of the keys to getting the R1 to behave intelligently. We use deep neural networks to recognize all of the people that it can see, and then for each person that it sees, also build up a unique visual identifier to tell them apart from other people. And then in order for the drone to figure out how it needs to move, it needs to have some prediction of what the person it’s tracking is going to do, otherwise it becomes purely reactive and makes very myopic decisions. I can’t get into the details of how this works, but we have a deep learning-based system based on all the people we’ve recorded during our testing, and we use that to make predictions about how somebody might move.
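Bry won’t share the details, but the re-identification idea he describes—a per-person visual signature compared across frames—is commonly done by comparing embedding vectors with cosine similarity. Here’s a minimal sketch under that assumption (the toy 4-D vectors stand in for a real neural network’s much larger embeddings, and all names are hypothetical):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def reidentify(target_emb, detections, threshold=0.8):
    """Return the index of the detection most similar to the tracked
    person's stored embedding, or None if nobody is similar enough."""
    scores = [cosine_sim(target_emb, d) for d in detections]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None

# Toy embeddings: detection 1 closely matches the stored target.
target = [0.9, 0.1, 0.3, 0.2]
frame = [
    [0.1, 0.8, 0.2, 0.5],      # some other person
    [0.88, 0.12, 0.31, 0.19],  # our subject, appearance slightly changed
    [0.4, 0.4, 0.4, 0.4],      # another bystander
]
match = reidentify(target, frame)  # → 1
```

The threshold is what lets the drone say “I’ve lost you” rather than latching onto a lookalike—which matches Bry’s description of the R1 stopping and notifying you in hard cases like crowds.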
What would you say is the biggest constraint on performance of the R1 right now, and how can we expect its performance to improve over time?
Compute and sensing are certainly a factor, but those things are getting better quickly. I think in a lot of ways, and this is what’s exciting to us, the biggest constraint is our ability to invent new algorithms that solve the problems we’re having and open up new capabilities. The R1 will continually be getting smarter over time; we’ll be shipping software updates frequently that improve performance and add new features. I can’t go into specifics, but the primary navigation functions will get better—smoother movement in more challenging environments, more reliable at dealing with obstacles that are harder to see, more reliable tracking of people and other things.
Another prototype, featuring a pumpkin frame, was in service at the beginning of 2016. It had eight cameras and carried an Nvidia computer on a development board. The lidar was used to test and improve some of the vision systems. Photo: Skydio
To what extent is Skydio focused on building consumer drones like the R1, as opposed to developing autonomous navigation systems for drones in general?
Most of our core technology is in the software and the flight algorithms, but the hardware team has done a phenomenal job, and the kind of thing we’re doing—it’s not like you could just slap a module onto an existing product. The sensors that we’re using, the computer that we’re using, the way that everything is configured and calibrated, the way that we build it, all these things matter a huge amount for getting to the product experience that we want to deliver. And we generally expect that trend to continue: We think there are some super exciting new product concepts to be done in the next few years, and most of them require doing hardware and software well together.
I think for robotics in general, the ability to do both hardware and software is going to be important to make the systems work well. The bio-inspired analogies aren’t always great, but biology has created some incredible autonomous systems, and it’s very much an integrated hardware-software story. We’ve evolved all these intricate mechanical systems which are tightly coupled to our neurological systems and our brains, and I think that’s not an accident—I think that really good robotic systems are going to be designed together, from a hardware and software perspective, to do what they need to do.
The kind of capabilities that the R1 has seem like they’d be necessary for an urban delivery drone, but we haven’t seen any companies doing delivery show autonomy that’s anywhere close. What’s your perspective on the near future of drone delivery?
We think there’s potential [in drone delivery]; it seems likely that a successful urban drone delivery system will grow out of something like what we’re building, where you can accumulate millions of hours of flight experience and validate that the thing is going to work perfectly. The delivery stuff that’s working today, they’re basically trying to avoid the problem that we’re solving, which is great, and makes a lot of sense: People are doing blood delivery in Africa where you can just parachute the payload down, which works in that environment but probably isn’t going to scale to urban or suburban environments. So I think something like what we have is a necessary component for that, we just think it’s a few years away.
[Sense and avoid in complex environments] is a very challenging thing. We’re hoping to prove that it’s possible, but it hasn’t been proven yet, and I think that as people see that it is possible it might change the perspective there a bit.
The first batch of Skydio R1 drones will be the Frontier Edition, hand-made from aluminum and carbon fiber. Photo: Skydio
This is an expensive drone for the consumer space, especially when there are other platforms that offer a relatively superficial degree of autonomy for much less. How will you convince consumers that the R1 is worth the premium?
It’s clearly not at a mainstream mass-market price point, but there are a unique set of capabilities in this product that you really can’t get anywhere else. You can think about it as analogous to a Tesla Model S, where our goal over time certainly is to make this technology more available for more people, and I think a lot of our first customers will be early adopters who are excited about what it can do, and have something particular that they’re excited to do with it that they couldn’t do with anything else.
The R1 is a new kind of thing; I think it’s pretty exciting—there’s this moment when people see it for the first time in real life, where it’s like getting to know another intelligent being, and seeing how it reacts. There haven’t been too many devices like that before.
Skydio is also announcing today the close of $42 million in Series B funding from folks like Playground and Nvidia, bringing their total funding to $70 million. They’re also hiring, and you should check out their promo video if for no other reason than it includes someone wearing one of their prototype drones as a hat while riding a bike.
[ Skydio ]
Evan Ackerman is a senior editor at IEEE Spectrum. Since 2007, he has written over 6,000 articles on robotics and technology. He has a degree in Martian geology and is excellent at playing bagpipes.