After wrapping up the DARPA Robotics Challenge in 2015, Gill Pratt helped to launch the Toyota Research Institute (TRI), which is investing over a billion dollars in robotics and artificial intelligence over the next five years. As you might expect, a major focus of TRI is automotive autonomy: Toyota is just as interested as any automotive manufacturer in using autonomous systems to make cars safer, more efficient, and more pleasant to drive.
At Toyota’s CES press conference earlier this month, Pratt took the stage to address some of the challenges facing anyone working on automotive autonomy. There are many of these, and frequently, the amount of progress that the industry is making towards full autonomy is misunderstood, or even occasionally misrepresented. With that in mind, he spent a solid 20 minutes giving the audience a much-needed reality check.
Gill Pratt on . . .
- No One Is Close to Achieving True Level 5 Autonomy
- The Over-Trusting Autonomy Paradox
- If Self-Driving Cars Need to Be Better Than Human Drivers, How Good Is Good Enough?
- Machine Learning and “The Car Can Explain” Project
- Simulating the Crazy Things Human Drivers Do
- “We Actually Need a Revolution in Computer Hardware Design”
- Human and Machine: Who Should Guard Whom?
IEEE Spectrum: At the Toyota press conference, you said that Level 5 autonomy—when a car can drive completely autonomously in any traffic or weather condition—is “a wonderful goal but none of us in the automobile or IT industries are close to achieving true Level 5 autonomy.” This definitely isn’t what it seems like when you walk around the automotive hall at CES, not to mention the host of self-driving demos we’ve seen recently.
Gill Pratt: The most important thing to understand is that not all miles are the same. Most miles that we drive are very easy, and we can drive them while daydreaming or thinking about something else or having a conversation. But some miles are really, really hard, and so it’s those difficult miles that we should be looking at: How often do those show up, and can you ensure on a given route that the car will actually be able to handle the whole route without any problem at all? Level 5 autonomy says all miles will be handled by the car in an autonomous mode without any need for human intervention at all, ever.
So if we’re talking to a company that says, “We can do full autonomy in this pre-mapped area and we’ve mapped almost every area,” that’s not Level 5.
That’s Level 4. And I wouldn’t even stop there: I would ask, “Is that at all times of the day, is it in all weather, is it in all traffic?” And then what you’ll usually find is a little bit of hedging on that too. The trouble with this Level 4 thing, or the “full autonomy” phrase, is that it covers a very wide spectrum of possible competencies. It covers “my car can run fully autonomously in a dedicated lane that has no other traffic,” which isn’t very different from a train on a set of rails, to “I can drive in Rome in the middle of the worst traffic they ever have there, while it’s raining,” which is quite hard.
Because the “full autonomy” phrase can mean such a wide range of things, you really have to ask the question, “What do you really mean, what are the actual circumstances?” And usually you’ll find that it’s geofenced for area, it may be restricted by how much traffic it can handle, for the weather, the time of day, things like that. So that’s the elaboration of why we’re not even close.
And all this creates an expectation problem for consumers because they’re hearing all of this stuff and they don’t know what it means.
Right. And that’s particularly true because, as consumers, we have a tendency to try to build a model in our brains to predict how good the autonomy is going to be. That model tends to be an anthropomorphic one: We focus on certain things, and we don’t see when the autonomy doesn’t handle certain other things, and we tend to either over-trust or under-trust how good the autonomy is.
[Toyota’s] view is that consumer education is absolutely important to truly understand the limitations of the technology. It’s very important that the industry as a whole strive to put themselves in the shoes of the customer and say, “How do we make sure they really understand how good this is, what it can do, but most importantly, what it can’t do,” so people don’t over-trust their vehicles.
That over-trusting is another big problem, because once you give the car to a consumer and the consumer experiences some of what it can do in a favorable environment...
It’s really difficult. As human beings, we tend to make predictions, that’s sort of how our brains are wired. Things like, “The weather was good today, the weather was good the day before, so therefore the weather is going to be good forever.” As we improve the technology for autonomous cars, they’re going to require human intervention less and less. And this will exacerbate the over-trust problem, because the tendency will be that people will say, “It hasn’t needed me before, it doesn’t need me now, so it will never need me.”
In some ways, the worst case is a car that will need driver intervention once every 200,000 miles. And so an ordinary person who has a [new] car every 100,000 miles would never see it. But every once in a while, maybe once for every two cars that I own, there would be that one time where it suddenly goes “beep beep beep, now it’s your turn!” And the person, typically having not seen this for years and years, would completely over-trust the car and not be prepared when that happened. So we worry about this paradox: The better we do, the worse the over-trust problem could become.
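The arithmetic behind that “once for every two cars” figure can be sketched in a few lines of Python. The two mileage figures come from the interview; treating handover requests as a Poisson process (independent, constant-rate events) is an assumption made here for illustration only, and the function names are hypothetical.

```python
import math

# Figures quoted in the interview.
MILES_PER_INTERVENTION = 200_000  # one handover request per this many miles
MILES_PER_CAR = 100_000           # miles an owner drives before replacing the car


def expected_interventions(miles, miles_per_intervention=MILES_PER_INTERVENTION):
    """Average number of handover requests over a given distance."""
    return miles / miles_per_intervention


def prob_at_least_one(miles, miles_per_intervention=MILES_PER_INTERVENTION):
    """Chance of seeing one or more handovers over a given distance.

    ASSUMPTION (not from the interview): handovers follow a Poisson
    process, so P(at least one) = 1 - exp(-expected count).
    """
    return 1.0 - math.exp(-expected_interventions(miles, miles_per_intervention))


if __name__ == "__main__":
    per_car = expected_interventions(MILES_PER_CAR)
    print(f"Expected handovers per car owned: {per_car:.2f}")
    print(f"Cars owned per expected handover: {1 / per_car:.0f}")
    print(f"Chance a given car ever asks for help: {prob_at_least_one(MILES_PER_CAR):.1%}")
```

Under that assumption, most owners of any single car would never hear the alarm, which is the condition under which over-trust builds.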
Do you think full Level 5 autonomy in that context is realistic, or even possible?
I think that it is possible, and our standard is some multiple of how good human beings are at the task of driving. I asked this question recently [at Toyota’s CES press conference], which I think is extremely important, and which society has not answered yet: What standards should we use, and how good does the system need to be?
It will be many years before autonomous cars are perfect, and so some crashes are likely to happen. Are we as a society ready to accept 10 percent better than human driving? One percent better than human driving? Ten times better than human driving? I don’t know, and to be honest, it’s not our role as technologists to tell society what the answer is. It’s the government and everybody else’s role who will be affected by this to weigh in and say, “Well, maybe it’s good enough to save one life.” Or should we actually say, “We’re not going to use it until it’s 10 times better.” I’m not sure, but I think until we have an answer to that question, we have to be very careful to not introduce technology that ends up out of line with what society’s expectations are.
In the context of autonomous cars and safety, I’ve tried to make the point at times—and others have as well—that if autonomous cars are even just 1 percent safer than humans, we should still be using them.
From a rational point of view, you’re exactly right, but it’s an emotional thing, right? People aren’t rational: If you look at plane crashes as an analogy, planes are without a doubt, by far, the safest way to travel, per mile. You should never worry that you’re going to have a crash in a plane. But plane crashes do happen, and when they happen, we have this sort of exaggerated response, where we become worried and afraid that our plane is going to crash. Rationally, it makes no sense at all: it’s much more likely that we’ll have a crash in a car than we will have in a plane, and yet it’s plane crashes that are in the news all the time, so we end up fearing the wrong things.
For an accident in a human-driven car, it’s a situation where we can put ourselves in the shoes of the driver of that car and say, “That could have happened to me. I could have made that mistake.” If it’s a machine, I worry that the empathy won’t be there, and that people will expect the machine to be perfect.
And yet, as we know, AI systems, especially ones that are based on machine learning, are not perfect. The space of possible inputs to the sensors is so large, the car will see things it’s never been trained on before, and we’re expecting it to come out with a reasonable perception of the world based on what it sees. Every once in a while, as we progress this technology, there will be cases where the perception system will get it wrong, and that may lead to an accident. And what are we going to say when that happens? Who are we going to blame? We don’t know the answer, but we know it’s a very important question.
James Kuffner [who helped develop the Google self-driving car and is now CTO at TRI] was talking about this on a CES robotics panel in the context of cloud robotics... The fact that autonomous cars are going to get into accidents is a certainty, but is it the case that every time one gets into an accident, that the manufacturer will be able to tell what caused it, and then update the software in all of their other cars so that we won’t have that kind of accident again?
I think it’s very likely, in fact, I would be surprised if it’s not the case, that we will have very precise logs of what happens in an autonomous car accident. Now, you asked a very interesting question: Will we be able to figure out why?
What’s interesting about this—because machine learning systems, and deep learning in particular, are very high performance but don’t actually explain how they figure out the answer—is that getting the car to explain what it does is actually hard. We have research that we’re doing at MIT, and other places, where we’re trying to make progress on that issue: the one that I can point you to is Professor Gerald Sussman at MIT, and the project that he’s working on with our funding is called “The Car Can Explain.”
So the logs will be there, but what’s responsible for the mistake? That’s a much harder question. Will we do something to make sure that that mistake doesn’t happen again? Of course we will, but the dimensionality of the input space is immense. And so saying, “Okay, I’ve plugged this leak in the dam, I’ve plugged that leak in the dam…” It’s a really big dam, and there are a lot of places that things could go wrong. This actually brings us to the biggest part of our work: It turns out that testing is the number one thing that we need to do, and this is that whole trillion mile challenge that I talked about last year. Globally, cars travel around—within an order of magnitude—10 trillion miles. So if you’re testing millions of miles, which car manufacturers, including us, are doing on physical cars, that doesn’t get you anywhere near close coverage of possible things that could happen to all cars in the world in the course of a year. You have to boost it in some way, and doing accelerated testing simulation is a key part to that. We’re not going to simulate the “Sunday drive,” when the weather’s nice and everything’s great and there’s hardly any cars on the road. We’re going to simulate when things are terrible.
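To put the scale of that gap in numbers, here is a minimal Python sketch. The 10 trillion mile figure is the order-of-magnitude estimate quoted above; the tested-mile values of 1 million and 10 million are illustrative stand-ins for “millions of miles,” not reported figures, and the function names are hypothetical.

```python
# Order-of-magnitude estimate quoted in the interview.
GLOBAL_MILES_PER_YEAR = 10_000_000_000_000  # 10 trillion miles


def coverage_fraction(tested_miles, global_miles=GLOBAL_MILES_PER_YEAR):
    """Share of one year's global driving that the test mileage represents."""
    return tested_miles / global_miles


def required_boost(tested_miles, global_miles=GLOBAL_MILES_PER_YEAR):
    """Factor by which testing would have to grow to match global mileage."""
    return global_miles / tested_miles


if __name__ == "__main__":
    # ASSUMPTION: illustrative values standing in for "millions of miles".
    for tested in (1_000_000, 10_000_000):
        share = coverage_fraction(tested)
        boost = required_boost(tested)
        print(f"{tested:>12,} test miles = {share:.7%} of global miles; "
              f"boost needed: {boost:,.0f}x")
```

Even ten million physical test miles would be about one millionth of a year of global driving under these numbers, which is why the answer turns to simulation focused on the hard cases rather than on more easy miles.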
Rod Brooks has this wonderful phrase: “Simulations are doomed to succeed.” We’re very well aware of that trap, and so we do a lot of work with our physical testing to validate that the simulator will be good. But inevitably, we’re pushing the simulator into regimes of operation that we don’t do with cars very often, which is where there are crashes, where there are near misses, where all kinds of high-speed things happen. For example, an autonomous car has to be prepared for the bad actor, the sort of byzantine case where there’s another car, perhaps a human-driven car, that’s driven by a person with road rage, and your car has to do the right thing, even though the other car was bad. And so, what’s the answer there? We’re not going to test that very often in the physical world, because half the time, they’re going to crash, so we’ll test it once in a while, but we need to boost it with simulation and that’s quite hard to do.
And that finally brings up yet another area, which is, can formal methods help? There’s this tremendous promise of formal methods in programming, to say that we can prove correct operation within this regime of the state space. Presently, it’s very hard to do, so we’re looking at hybrid approaches of simulation and formal methods, blended together to try to get at the dimensionality of the problem. But fundamentally a thing your readers should know is that this really comes down to testing. Deep learning is wonderful, but deep learning doesn’t guarantee that over the entire space of possible inputs, the behavior will be correct. Ensuring that that’s true is very, very hard to do.
Especially right now, with autonomous cars or any car with any aspect of autonomy, I would imagine that humans are the least predictable thing on the road. Going through on-road testing, a lot of weird human-related stuff has to come up, but how do you even try to simulate all of the crazy things humans might do?
We try to model human beings in the same way we model weather or we model traffic. It’s hard to model human beings, because we have 10 to the 10 neurons in our brains: Every person’s different, and there’s a whole wide range of behaviors, but we know that to some extent it’s possible. As a human driver, we have theory of mind, of how other human drivers act. We run a little simulation in our head, like when we’re at a four-way stop, saying if I were that person, I’d act like this, so I’m going to do this. Theory of mind is this amazing thing that means that making a simulation is possible, because we can build statistical models to predict what other human beings are going to do.
Sometimes, though, there’s a difference between how a human being is likely to act, and what the safe (or legal) way of acting is. How do you teach a car what to do in circumstances like that?
It’s an interesting question. What speed should you travel on the highway? If the speed limit is posted at 55 miles an hour, should you go 55, or should you go the median speed of what the other cars are doing? What’s the safest thing to do? I don’t want to give an official answer, but it’s a conundrum, and we have discussed this with the legal department at Toyota. They scratch their heads and say, “That’s really tough.” So what do you do? And who’s responsible, right? It may be the case that driving slower isn’t the safest. How do we resolve that issue? It’s very hard to figure out.
Right after the press conference, you mentioned something about how cooling and powering the computers in electric cars is actually a big problem. We’re very focused on the challenges involving decision making, but what other things do we still have to figure out?
What I love about this stuff is that I have this whole side of me that’s the hardware guy, but I also did a lot of my own research in the past in neurophysiology. This business of computational efficiency is really important. Our brains use around 50 watts of power, while the autonomy systems for most [autonomous and semi-autonomous cars] in the field, not all of them, but most of them, use thousands of watts of power, and that’s a tremendous difference. And it’s not even that our brain is using all 50 watts to do the driving task: we think about all kinds of other things at the same time, and so perhaps 10 watts is going into the driving task.
We also don’t know that the amount of computational power that’s in current prototype autonomous cars is the appropriate level. It may be that if we scaled the computation by a factor of 10, that the car would perhaps not get 10 times better, but would get significantly better than what it does now. Or what would happen if we had 100 times more computational power? It may be that it actually doesn’t plateau, but continues to get better. I suspect the answer is that it does keep on growing, although I don’t know what the curve is. And so in addition to all the software stuff and all the testing we need to do, we actually need a revolution in computer hardware design: We need to somehow get more towards the power efficiency that the brain has.
To answer your question of other things [that remain to be solved], I think sensors still have a long way to go. Lidar is an amazing thing, but it’s still not that great. The point density is still low, and for looking at cars in the distance, it doesn’t compare with human vision. Particularly, for instance, with Level 3 autonomy, the amount of time you need in order to wake up the driver to say, “Hey, you’re going to need to deal with this thing that’s happening 1,500 feet in front of you,” that’s not going to happen unless your sensor can see and understand what’s happening that far away.
The sensor also needs to be cheap. It needs to be vibration resistant. It needs to last for 10 years or maybe 20 years and not break. Automotive quality, which most people think of as being low, is actually very, very high. I’ve worked a lot, in my DARPA work, with mil-spec and things like that, but automotive quality is ridiculously hard to do. And so a cellphone camera is not going to work—you’re going to need special stuff that really can withstand the desert and Alaska and going back and forth. How do you deal with salt and rust, with the screen getting full of dust from the road… Making sure that’s going to work, all of the time, that’s really hard.
So to sum this thing up, I think there’s a general desire from the technical people in this field to have both the press and particularly the public better educated about what’s really going on. It’s very easy to get misunderstandings based on words or phrases like “full autonomy.” What does full actually mean? This actually matters a lot: The idea that only the chauffeur mode of autonomy, where the car drives for you, that that’s the only way to make the car safer and to save lives, that’s just false. And it’s important to not say, “We want to save lives therefore we have to have driverless cars.” In particular, there are tremendous numbers of ways to support a human driver and to give them a kind of blunder prevention device which sits there, inactive most of the time, and every once in a while, will first warn and then, if necessary, intervene and take control. The system doesn’t need to be competent at everything all of the time. It needs to only handle the worst cases.
And often the worst cases for human drivers are things that robots are particularly good at right now, like paying attention all the time to the car in front of you, or watching where the lines are, right?
Correct. People and machines have complementary skills: machines are very good at staying vigilant. People are very good at handling difficult driving situations with theory of mind and all that kind of stuff, and machines are particularly bad at that. So how can we put the two together in a way that they complement each other, rather than sort of fight each other? Another way of thinking about this is who guards whom, right? Should the human being guard the autonomy? Should the autonomy guard the human being? It’s a higher threshold to make this system where the human being has to guard the machine and eventually doesn’t have to guard anything at all to get the AI to work, so that’s why at Toyota we’re taking a dual approach to things.
Most companies at CES are here to sell. At Toyota, we don’t have any trouble selling cars, so I took the opportunity to use this to try to educate. To say, “Look, this is all great, but let’s make sure we’re all talking about the same thing.” If we could define the capabilities, take the SAE scale [of driving automation] to heart. But even within the SAE scale, there’s so much confusion.
The NHTSA had levels 1 to 4, right?
Yes, so NHTSA had 1 to 4, and SAE was very smart, they said we’ll take Level 4 and we’ll split it into two things, 4 and 5, and the difference is that Level 5 is everywhere at any time, and 4 is only some places at some times. But otherwise they’re basically the same. The SAE levels are fine, but people keep making mistakes. And so Level 2 systems are often called Level 3, which is wrong. A key thing of Level 3 is that the person does not need to supervise the autonomy. So no guarding of the machine by the human being.
And so when the driver can’t take over, what’s the Plan B? I remember when Freightliner introduced its autonomous truck, its Plan B was that the truck would just stop in the middle of the highway because it doesn’t know what else to do.
Buried inside of SAE stuff, it does talk about some Plan B things where the car will pull off the road if it has to stop. I love that, if only it could be true all the time. I grew up in New Jersey. There are a lot of highways in New Jersey where there’s no place to pull over, and so what are you going to do? What if you’re in the tunnel and there’s no side lane, no pull-off lane? It has to handle every case, and that’s what’s so hard about this thing.
There have been proposals for things like a remote driver, where if the car gets into difficulty, it’ll flip over to a remote driver in a call center somewhere. First of all, those things assume the Internet is working, there are no hackers, there are no natural disasters. But most of all, it’s the same issue we discussed earlier, that you assume a person at a call center can suddenly wake up and say, “Okay, trouble,” and handle the situation. People are very bad at that. We’re very bad at suddenly being surprised and needing to orient ourselves and coming up with a correct response. So we’re not saying that that’s impossible, but we’re saying it may be very difficult, for the reasons that we’ve outlined.