Last summer, 580 cybersecurity researchers spent 13,000 hours trying to break into a new kind of processor. They all failed.
The hack attack was the first big test in a U.S. Defense Advanced Research Projects Agency (DARPA) program called System Security Integration Through Hardware and Firmware (SSITH). It's aimed at developing processors that are inherently immune to whole classes of hardware vulnerabilities that can be exploited by malware. (Spectre and Meltdown are among those.)
A total of 10 vulnerabilities were uncovered among the five processors developed for SSITH, but none of those weak points were found in the University of Michigan processor, called Morpheus. Michigan professor of electrical engineering and computer science Todd Austin explained what makes Morpheus so puzzling for hackers to penetrate.
How Morpheus Works
IEEE Spectrum: What is Morpheus, essentially?
Todd Austin: Morpheus is a secure CPU that was designed at the University of Michigan by a group of graduate students and some faculty. It makes the computer into a puzzle that happens to compute. Our idea was that if we could make it really hard to make any exploit work on it, then we wouldn't have to worry about individual exploits. We just would essentially make it so mind bogglingly terrible to understand that the attackers would be discouraged from attacking this particular target. The challenge is, how do you make it mind bogglingly difficult to understand for an attacker, but not affect the normal programmer?
Spectrum: That does sound like a challenge. That sounds impossible, in a way.
Todd Austin: This is where the notion of undefined semantics comes in.
Spectrum: If you could define "undefined semantics" for us, that would be great.
Todd Austin: Think about driving a car: The defined semantics of your car are that it has a steering wheel; it has a left/right blinker; it may have a stick shift depending on the kind of car; it has as an on-off button. Once you know those basic features, you can drive your car. The undefined semantics are: Is it four cylinders or six cylinders? Does it run on diesel or electric? Does it have ABS braking or non-ABS braking? Attackers need to know all that underlying stuff, because they need to use that knowledge to step around the defenses. It is the telltale sign of an attack that it is dipping into the implementation details of a system.
Spectrum: So you can detect an attack just by looking at what's being looked at?
Todd Austin: Yeah. Let me give you a classic example—the return stack. [When a program "calls" a subroutine into action, it stores the address the program should return to when the subroutine is over in the return stack.] This is a real popular place for attackers to manipulate to do what's called a buffer overflow or a code injection. There's only one part of the program that writes and reads from the return address, and that's the beginning of a function and the end of a function. That's really the semantics of the return address. But when you see a program attacking the return address to do code injection, for example, you'll see writes from many different places in the program. So, these sort of unexpected, undefined accesses, normal programs never do. That's a tell-tale sign that there is an attack going on. Now, some systems might try to detect that. And when they detect that, then they can kill the program. But that's problematic, too, because things like operating systems and device drivers do some of this stuff, too.
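The defined-versus-undefined distinction for the return slot can be sketched in a few lines. This is a toy illustration, not the Morpheus implementation: it simply flags any write to the return-address slot that comes from somewhere other than a function's prologue or epilogue, the kind of out-of-place access Austin describes as a tell-tale sign. The writer labels and addresses are made up for illustration.

```python
# Defined semantics of the return-address slot: only a function's
# prologue and epilogue ever touch it. Anything else is "undefined."
ALLOWED_WRITERS = {"prologue", "epilogue"}

def record_write(log, writer, value):
    """Record a write to the return-address slot and which code site did it."""
    log.append((writer, value))

def detect_attack(log):
    """Writes from outside the prologue/epilogue suggest an attack in progress."""
    return [(w, v) for (w, v) in log if w not in ALLOWED_WRITERS]

log = []
record_write(log, "prologue", 0x4005A0)           # normal: entry saves return address
record_write(log, "strcpy_overflow", 0xDEADBEEF)  # attack: buffer overflow rewrites it
print(detect_attack(log))  # only the overflow write is flagged
```

As Austin notes, simply killing the program on such an access is too blunt, since operating systems and device drivers legitimately perform some of these accesses too.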
So what we do instead is to make the underlying implementation of the machine—the undefined semantics—change every few hundred milliseconds. The underlying implementation will be so unique that you will never see the one that you're on now again, ever, on any other machine in the future. It is completely unique in time and space.
Spectrum: There must be a lot of knobs to turn to be able to do that.
Todd Austin: In the paper we wrote about [the Morpheus concept], we had 504 bits of knobs. In the design that we put into this attack, we had almost 200. And 2^200 is a big space to search.
The way we do it is actually very simple: We just encrypt stuff. We take pointers—references to locations in memory—and we encrypt them. That puts 128 bits of randomness in our pointers. Now, if you want to figure out pointers, you've got to solve that problem.
Spectrum: Encrypting the pointer hides the undefined semantics?
Todd Austin: Yeah. When you encrypt a pointer, you change how pointers are represented; you change what the layout of the address space is from the perspective of the attacker; you change what it means to add a value to a pointer.
The key mechanism that's under the hood here is making this machine change and change and change and never be the same ever again. It's cryptography, just simple cryptography.
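The encrypt-then-periodically-re-key idea can be sketched in a few lines. This is a minimal sketch, not the Morpheus hardware: a simple XOR mask stands in for the Simon block cipher Morpheus actually uses, and the `churn` function mimics the re-encryption of live pointers under a fresh key that the real system performs in hardware, transparently to the program.

```python
import secrets

KEY_BITS = 128  # matches the "128 bits of randomness" in Morpheus pointers

def new_key():
    return secrets.randbits(KEY_BITS)

def encrypt_ptr(ptr, key):
    return ptr ^ key  # the attacker only ever sees this ciphertext

def decrypt_ptr(ct, key):
    return ct ^ key

def churn(ct, old_key, fresh_key):
    """Re-encrypt a live pointer under a fresh key, as churn does periodically."""
    return encrypt_ptr(decrypt_ptr(ct, old_key), fresh_key)

ptr = 0x7FFD00401000
k1 = new_key()
ct1 = encrypt_ptr(ptr, k1)
k2 = new_key()
ct2 = churn(ct1, k1, k2)  # same pointer, entirely new representation
assert decrypt_ptr(ct2, k2) == ptr
```

After each churn, the ciphertext an attacker painstakingly reverse engineered no longer corresponds to anything in the machine.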
Spectrum: What is doing the cryptography?
Todd Austin: We use a cipher called Simon. It's not a popular cipher, but it's really fast in hardware. That's why we chose it.
Spectrum: This encryption happens every 100 milliseconds. Why that rate in particular?
Todd Austin: That has to do with a kind of attack called side channels. Side channels are basically the detective work of an attacker. If you try to hide a piece of information—like where a function is kept in memory or what is the value of a particular variable—attackers will basically manipulate the program to see if there's any kind of residue, in terms of the timing of the program or the responsiveness of the program. That reveals information about the secrets that are inside the program. So, if we just randomized—used crypto on pointers and code and stuff—and did it one time, any really skilled attacker would be able to reverse engineer all the information within hours.
So what the re-encryption—we call that churn—does is it places a time limit on that side channeling. Making that time limit faster does two things: It makes the attacker work faster, and it makes the attacker need to be [physically] closer.
Let's say we had a 100-millisecond churn rate, then the attacker would have to be able to basically solve the problem of where things are within 100 milliseconds, minus the communication delay to and from the machine. One of our ultimate goals is to get it under 10 milliseconds, because if you can get under 10 milliseconds, that's the time it takes to get out of the building through the routers. That would require the attacker be inside the building to actually use the information they acquired. Because if they were outside the building, then the information would expire before they could actually use it.
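The timing argument above reduces to simple arithmetic. This back-of-the-envelope sketch (the numbers are illustrative, not measurements) shows why a sub-10-millisecond churn rate would force the attacker inside the building: with churn period T and one-way network latency d, a remote attacker has at most T minus 2d to observe a secret and act on it.

```python
def attack_window_ms(churn_period_ms, one_way_latency_ms):
    """Time a remote attacker has to use a secret before churn expires it."""
    return churn_period_ms - 2 * one_way_latency_ms

print(attack_window_ms(100, 20))  # 60 ms left for a remote attacker
print(attack_window_ms(10, 20))   # negative: the secret expires in transit
```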
The Sneaky Attacks Morpheus Defends Against
Spectrum: Is this kind of strategy proof against the most infamous side channel attacks, Spectre and Meltdown? Is that something you've actually tried out?
Todd Austin: Yeah, this stops Spectre and Meltdown very easily. This actually stops any attack that relies on pointers and code. And that's a big bunch. I would say it's probably the vast majority of attacks that we talk about.
It only stops low-level attacks. It doesn't stop things like SQL injection and really high-level stuff that happens on your Web server. The Morpheus machine we built for the DARPA challenge is really built to stop what's called a remote code execution, or RCE, attack. These are the crown jewels of vulnerabilities.
There's a website you can go to called Zerodium that offers bounties for vulnerabilities and then gives the details to their clients. (Whoever they are. I don't know the details.) They have a payout chart, and if you look at anything US $200,000 and above, it's an RCE attack.
What RCE means is that I can get code onto your machine without you knowing about it. And I don't have to phish you. I don't have to convince you to run a program. I don't have to trick you into running my program. I just inject it into your machine.
Spectrum: So that was what the DARPA-sponsored attacks were?
Todd Austin: Yes. We presented a medical database, and then the attackers had to inject code into the machine.
Spectrum: And none of the attacks got through?
Todd Austin: That's correct. And there were known vulnerabilities in our code, so had our code not been protected with Morpheus, it would have been quite trivial to get in. DARPA encouraged all of us to have known vulnerabilities in our system, because the point of the program was to build hardware that could protect vulnerable software.
RISC-V, Homomorphic Computing, and the Cost of Security
Spectrum: You're doing extra things that an ordinary processor would not have to do. What is the overhead for Morpheus?
Todd Austin: It's about 10 percent slower on average. We could reduce those overheads if we had more time to optimize the system. You know, we're just a couple of grad students and some faculty. If a company like Intel, AMD, or ARM built this, I would expect that you'd get the overheads down to a few percent.
Spectrum: Morpheus was based on a RISC-V processor architecture. How did that influence the project?
Todd Austin: I have to say, we could never have done this without RISC-V and the RISC-V ecosystem. There would have just been no chance. There were mountains and mountains of infrastructure that we could get access to from day one because of RISC-V. I mean, you go to a website, you download a working processor, and then you start adding your stuff to the processor, and you've got compilers and operating systems. Five years ago [before RISC-V] this would have been a much bigger, more expensive, and longer project than it is today.
We used what's called the RISC-V Rocket Chip, which is an open-source CPU, and we added our extensions to that design. The LLVM Compiler Infrastructure automatically supports RISC-V, so it was really easy work to extend this environment. It made it all quite approachable.
Spectrum: Is a few percent overhead likely to be worth it?
Todd Austin: That's a billion-dollar question now that I'm doing a startup [Agita Labs]. How much will people pay for security? I'm of the opinion people won't pay much for security. They'll probably litigate for what happened in the past, but they probably won't pay in the future. So I want to make it as cheap as possible.
But I also think that you can create value with security as well. If you have very high security, you can do things that you normally couldn't do.
For example, homomorphic encryption allows you to compute with encrypted data without decrypting it. [Intel, Microsoft, and DARPA recently began a major project to produce a processor capable of efficient homomorphic computing.] Using homomorphic computing, you can give me your genome encrypted and I can compute what disease profile you have and give you back the results. But I can't see your disease profile because the results are encrypted. If security gets to that level, then now there's value added.
The idea that you could have someone process your data without the ability to see it is very interesting. And this is, in fact, where we're taking the Morpheus technology in the startup. We're going to build more of this encrypted data processing, following in the same lane that homomorphic encryption is. It's just a slight change over Morpheus today. It's saying that in Morpheus, the attacker is someone from the outside. In this new version, the attacker is the programmer. So, you just don't trust the programmers.
I think that's very exciting. From there, you go from security to privacy, where I think of security as insurance and I think of privacy as new capabilities.
Spectrum: But are you planning to commercialize the original Morpheus concept? The one with just the security side of the equation in that example?
Todd Austin: Yes, I mean, it's always a good thing to stop people from hacking into your software. We're working with DARPA to commercialize Morpheus in the cloud, but then we're also continuing down this line of: To what extent can we build computation that people cannot see?
Spectrum: I see. So the idea is that the value increases as you go towards the privacy end?
Todd Austin: Yeah. It's interesting how a lot of people say that Morpheus is unhackable. You see that in the press a lot. I don't generally say that myself, because I think it is hackable. But it's super hard to hack. But I think if we continue in the directions that the SSITH program went, for example, we can build things that software cannot hack. (It may be hackable, but not by software.) We can definitely get there in the future. And I think homomorphic encryption is one manifestation of that.
Unhackable Software and What Comes Next
Spectrum: When do we get to the point where there is nothing that software can hack?
Todd Austin: I don't know. I hope it's soon. We're going to build a product that is going to be incredibly hard to hack. But the difference between a technology and a successful company is whether you can take that technology and teach people, who probably don't even know they need it, that it will help them grow their customer base. I hope the answer's soon, but I don't know that for sure yet. We're still working on that.
Spectrum: Is there an immediate next step research wise or in the company?
Todd Austin: The thing we really discovered with Morpheus was this idea that always-encrypted pointers in code really throw a wrench in the attacker's ability to understand what's going on. And we use churn as a mechanism to try and make it even harder. So where we're going from here is we're going to continue to embrace the idea of always-encrypted information to protect it from attackers. In addition to pointers, we're also going to start looking at other data types: strings, integers, floating point values. [And the bottom line is] building computation frameworks around those always-encrypted values and then trying to figure out: "How do we stop people from understanding what is inside that computation?"
That's where we're going. We're expanding the range over which a machine like Morpheus can do encrypted computation on encrypted information. And we'll then try to understand: What is the impact on programming? What kind of challenge does it create for attackers? What new forms of computing can we create there?
I am a strong believer that technology can help ease the tension between privacy and industry's desire to share data for discovery, for monetization, and for understanding their customers better.
I think machine learning is one of the core areas where we could do this. For example, with federated learning, there are entities that have lots of information, and they want to share that information with each other only for the purpose of improving their machine learning models. But they don't want you to have that information. So how do we share information to enhance our machine learning models without losing control of the information we have?
Spectrum: Morpheus was built using a RISC-V CPU architecture as the basis, but lots of machine learning happens on GPUs and even more-specialized architectures. Is there a problem with extending the concept to these other architectures?
Todd Austin: It could go well beyond RISC V, well beyond a CPU. The concepts themselves are independent of the underlying implementation. For example, in the startup, we're not on RISC-V anymore, we're just simply a coprocessor which can be parallel [like a GPU] or it can be serial [like a CPU].
The way we deploy in the cloud is we deploy on a [field programmable gate array (FPGA)]. It was a very fortunate development in the cloud industry that, to service high-performance computing, machine-learning training, and other very computationally intensive tasks, [companies have deployed] a fairly sizable number of servers in the cloud that have FPGA components on them. Those FPGA components can be reprogrammed to introduce all kinds of hardware. So we simply repurpose those components to be data-security processing elements that can allow us to protect the software, do computation on unseen data, and so on. We're definitely benefiting from the way infrastructure in the cloud is changing due to machine learning.
More About Morpheus
Follow this link for a technical article explaining Morpheus.
Todd Austin gave a talk about Morpheus in February 2021 for the University of California at Davis.