Spectre and Meltdown

Spectre and Meltdown are two serious security vulnerabilities discovered in nearly every computer processor made since 1995. So what does it mean? Is this the end of computing? Should we throw our computers on the campfire and live among mother nature? I wanted to find out, so I talked to security expert Wu-chang Feng at Portland State University. The verdict? You might just want to break out the abacus, because we’re all in big trouble.

Transcript:

Dustin Driver: Detroit’s hottest DJs, Spectre and Meltdown, live at the Music Institute, Thursday, January 4th, 2018. Get your tickets today at Ticketmeister dot org.

Dustin Driver: Hello and welcome to Let’s Get Mental. I’m your host Dustin Driver. And yes, Spectre and Meltdown are the best dubstep DJ names ever, but unfortunately, they are not dubstep DJs. They are some serious security vulnerabilities that were discovered in every Intel processor made since 1995.

Dustin Driver: What are Spectre and Meltdown, and does this mean the end of computing? Should we all throw our laptops on the campfire and live amongst nature? I don’t know, and that’s why I called security expert Wu-chang Feng of Portland State University. We talked about Spectre and Meltdown, what it means for computing in general in the future, and just a little bit about whether or not you should throw your laptop on the campfire. Without further ado, this is what I found out about Spectre and Meltdown.

Wu-chang Feng: My name is Wu-chang Feng. I’m a professor of computer science at Portland State University, and I work in computer security and computer security education. My particular areas of interest, I do malware analysis and web security, and then I do a lot of outreach to high schools. We run summer camps for high school students. We do internships for high school students to try and get a larger pipeline of students interested in computer security.

Dustin Driver: Let’s start at the beginning. Can you give me a brief rundown, in layman’s terms, of what Meltdown and Spectre are?

Wu-chang Feng: We’ll start with Meltdown. The thing with high performance CPUs is that the latency of executing an instruction is quite long, and because of this really long pipeline, what the CPU will do is try and fetch things in the pipeline, and it’ll try and fetch as many instructions in advance as it can, and will actually partially execute as many as it can in parallel. We need this to get performance. It’s something that every single processor has to do if they want to be high performing, and as part of this, if you’ve written a program, and your program will branch based on one condition or the other, when you’re trying to execute ahead of time and you come to one of these things, you basically have to guess. You have to guess which branch might be the one that’s actually going to be taken. So you pick that branch, and you fetch all those instructions on that branch ahead of time. One of the ways that you can guess accurately is, on all these CPUs, there’s this branch prediction table that they use, and as the program executes, this branch prediction table learns which branch is typically taken at every point in time, and then it’ll use that to guess the execution path of the thing.
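To make the branch prediction table described above concrete, here is a minimal Python sketch. It is a toy model, not how any particular CPU works: the two-bit saturating counter scheme, the table size, and the names `TwoBitPredictor` and `run_loop_branch` are all assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Toy branch prediction table: one 2-bit saturating counter per slot."""

    def __init__(self, size=16):
        self.size = size
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        self.table = [1] * size

    def predict(self, branch_addr):
        return self.table[branch_addr % self.size] >= 2

    def update(self, branch_addr, taken):
        # Learn from the real outcome, saturating at 0 and 3.
        i = branch_addr % self.size
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def run_loop_branch(predictor, branch_addr, iterations):
    """Simulate a loop branch that is taken every time except on loop exit."""
    correct = 0
    for i in range(iterations):
        taken = i < iterations - 1
        if predictor.predict(branch_addr) == taken:
            correct += 1
        predictor.update(branch_addr, taken)
    return correct


predictor = TwoBitPredictor()
hits = run_loop_branch(predictor, branch_addr=0x40, iterations=100)
print(f"correct predictions: {hits} / 100")  # prints "correct predictions: 98 / 100"
```

In this toy run the predictor guesses wrong only on the first iteration, before it has learned anything, and on the last one, when the loop exits. That is the sense in which the table "learns which branch is typically taken."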

Wu-chang Feng: For Meltdown, what happens is that when you are speculatively fetching and executing instructions ahead of time, the memory that you are accessing on the speculative branch is fetched without paying attention to whether or not you have the privileges to actually access that memory. Eventually, when it comes to fully executing that instruction, the processor will be like, hey, you’re not allowed access to this thing, I’m going to create a fault and deny you access, eventually. But the problem is, when you are speculatively executing the instruction that causes the fault, you’ll execute beyond that and then execute additional instructions that give you a way of determining what that fetch actually did. So, if you have a memory access, you can actually infer what that memory was by the side effect of all the speculative executions, and that’s Meltdown, effectively.
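The sequence described above (a speculative fetch, a late fault, and a measurable side effect) can be sketched as a small Python simulation. This is a toy model and not a working exploit: `ToyCPU`, `leak_byte`, the addresses, and the secret string are hypothetical, and a Python set stands in for the cache timing that a real attack would have to measure.

```python
class ToyCPU:
    """Simplified model: the permission check arrives too late to stop a
    dependent, speculatively executed load from touching the cache."""

    def __init__(self, kernel_memory):
        self.kernel_memory = kernel_memory  # address -> secret byte (privileged)
        self.cached_lines = set()           # probe lines currently "in the cache"

    def flush_cache(self):
        self.cached_lines.clear()

    def user_read(self, addr):
        # Speculative part: the value is fetched and used before the check.
        secret = self.kernel_memory[addr]
        self.cached_lines.add(secret)  # models a load of probe_array[secret]
        # Architectural part: the access faults and no value is returned.
        raise PermissionError(f"fault: user access to kernel address {addr:#x}")

    def probe_is_fast(self, line):
        # Stand-in for timing a load: a cached line would come back quickly.
        return line in self.cached_lines


def leak_byte(cpu, addr):
    cpu.flush_cache()
    try:
        cpu.user_read(addr)
    except PermissionError:
        pass  # the attacker expects the fault and carries on
    fast = [line for line in range(256) if cpu.probe_is_fast(line)]
    return fast[0] if len(fast) == 1 else None


BASE = 0xFFFF0000  # hypothetical kernel address
cpu = ToyCPU({BASE + i: b for i, b in enumerate(b"hunter2")})
leaked = bytes(leak_byte(cpu, BASE + i) for i in range(7))
print(leaked)  # prints b'hunter2'
```

The program never receives the secret as a return value; every read faults. It recovers each byte only from which probe line ended up cached, which is the "side effect of all the speculative executions."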

Dustin Driver: So, what you have is the processor making educated guesses as to what’s going to happen next in the program that’s running, and it’s guessing ahead in order to get a performance advantage that’s pretty significant, and when it does this it’s actually using some memory, when it’s making these guesses, and that memory that it’s using during the guessing phase isn’t protected, or isn’t locked down. So there’s an opportunity there for some code to gain access to basically everything within the system.

Wu-chang Feng: Yeah, within the operating system. Any memory location in the kernel. But with Spectre, what you’re doing is…between processes, the branch prediction table in the CPU doesn’t get cleared, so when you swap from one process to another process, or one VM to another VM, that state in the CPU is actually kept, that branch prediction table. What an adversary can do is craft an attack where they poison the branch prediction table and then force a swap into the victim process, and the victim process, based on the values that the adversary has put in the branch prediction table, will do certain things that can be measured by the adversary. Again, it’ll be similar to the Meltdown attack, where you can trick the victim process into speculatively executing a crazy branch, a branch that you injected into the branch prediction table, and that’s the side effect that allows you to figure out a memory location on either a different process or a different virtual machine, or any other protection boundary. That’s definitely a hardware problem, and that’s pretty much in every CPU that does branch prediction and speculative execution.
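The poisoning idea above can be illustrated with another toy Python model. Everything here is an assumption made for the sketch: `SharedCore`, the dictionary used as a branch target table, `leak_gadget`, and the branch address are hypothetical, and real predictors are indexed and trained in far more complicated ways.

```python
class SharedCore:
    """Toy CPU core whose branch target table survives context switches."""

    def __init__(self):
        self.btb = {}              # branch address -> predicted target function
        self.cached_lines = set()  # stand-in for cache state seen through timing

    def indirect_branch(self, branch_addr, real_target, context):
        predicted = self.btb.get(branch_addr)
        if predicted is not None and predicted is not real_target:
            # Misprediction: the predicted target runs speculatively. Its
            # results are discarded, but its cache footprint is not.
            predicted(self, context)
        self.btb[branch_addr] = real_target
        return real_target(self, context)


def harmless_handler(core, context):
    return "ok"


def leak_gadget(core, context):
    # Code that touches a cache line selected by a secret value.
    core.cached_lines.add(context["secret"])


BRANCH_ADDR = 0x1234  # hypothetical address of the victim's indirect branch

core = SharedCore()

# Attacker's time slice: train the table so this address points at the gadget.
core.indirect_branch(BRANCH_ADDR, leak_gadget, {"secret": 0})
core.cached_lines.clear()  # attacker flushes the probe lines

# Context switch: nothing clears core.btb. The victim runs next.
result = core.indirect_branch(BRANCH_ADDR, harmless_handler, {"secret": 42})

# Back in the attacker: check which probe line became "fast".
print(result, sorted(core.cached_lines))  # prints "ok [42]"
```

The victim's visible behaviour is unchanged (it still returns "ok"), yet the stale prediction left by the attacker made it briefly run code that exposed its secret through the cache.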

Wu-chang Feng: That’s the one that’s general. The one that’s Meltdown, apparently, is mostly Intel, just because of the way it checks at the access. Apparently, AMD is less susceptible to this. Slower-performing CPUs are less susceptible also. I don’t know if you remember the Atom processor, the stripped-down x86 processor that they put in these Netbooks? They ripped out a bunch of this stuff — a lot of the speculative execution got ripped out, the branch prediction stuff got ripped out. For the low-end stuff, I think the Atom processor is immune to both of these things, because it’s not doing this kind of stuff.

Wu-chang Feng: But definitely on the higher end, where performance really matters, where you’re doing all sorts of crazy out of order executions, speculative executions, branch predictions, it’s definitely in that end.

Dustin Driver: Is this sort of execution taking place when people are using day-to-day programs, or is this when you’re doing something really intensive like video editing or playing a game? When is this taking place most often?

Wu-chang Feng: All the time. It’s happening all the time. It’s baked into the hardware, and it’s how we get the performance we do out of our systems. Even web browsing, even your email client will be doing this, everything will be doing this.

Dustin Driver: There has been a patch released, right? I know that Apple did release a patch to the kernel that went out, and I know Microsoft did as well with Windows. So, for at least one of these, and I believe it was Meltdown, there’s a patch in order to safeguard against these kinds of attacks.

Wu-chang Feng: So, the way operating systems and programs run is that when you have a program like a web browser running on a system, at least in Linux, what happens is that the kernel memory is mapped into your memory space for every single process that’s running. And they share the same memory space so that when you are fetching in the same address space, even though it’s privileged, it’s still in your memory space, but hopefully the operating system is going to shut down the unauthorized access to its own memory.

Wu-chang Feng: So the fix, and the fix that they pushed out — one of the fixes that I’ve seen is that, when you go from the kernel memory, if you’re executing a system call in the kernel address space, when you go from the system call back to the user, the fix is to unmap all of the kernel memory out of the address space, so that when a user program tries to run Meltdown, that memory is completely unavailable to the user program. But this gives you a performance penalty, like 5 – 15%. At least with the Intel processors, they’re taking a 5 – 15% performance penalty in order for, I believe, this particular fix to be deployed.
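The unmapping fix described above can be sketched with dictionaries standing in for page tables. This is a simplified illustration, not kernel code: `build_page_tables`, `speculative_reach`, and the addresses are hypothetical, and a real implementation still keeps a small part of the kernel mapped so that system calls can enter it.

```python
KERNEL_PAGES = {0xFFFF0000: "kernel data"}  # hypothetical addresses and contents
USER_PAGES = {0x00400000: "browser data"}


def build_page_tables(isolate_kernel):
    """Return (user_mode_table, kernel_mode_table) for one process."""
    kernel_mode = {**USER_PAGES, **KERNEL_PAGES}
    if isolate_kernel:
        # Patched behaviour: kernel pages are unmapped while user code runs.
        user_mode = dict(USER_PAGES)
    else:
        # Original behaviour: one shared table, kernel pages mapped but privileged.
        user_mode = kernel_mode
    return user_mode, kernel_mode


def speculative_reach(page_table, addr):
    """Can a speculative load find anything at addr? Privilege bits are ignored
    on purpose, because that is the check Meltdown sidesteps."""
    return addr in page_table


for isolate in (False, True):
    user_mode, kernel_mode = build_page_tables(isolate)
    print(
        f"isolation={isolate}: "
        f"user mode reaches kernel page: {speculative_reach(user_mode, 0xFFFF0000)}, "
        f"kernel mode reaches it: {speculative_reach(kernel_mode, 0xFFFF0000)}"
    )
```

With isolation on, a user program has nothing to speculate on because the kernel page is simply not in its table. The price is that every system call now has to switch between two tables, which is where the performance penalty mentioned above comes from.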

Wu-chang Feng: The threat level on both of these is quite low right now, because right now I don’t think anything has been weaponized, at least I haven’t heard of anything being weaponized, and these are really difficult attacks to pull off.

Dustin Driver: From what I understand, it’s not that anyone has been able to take advantage of these, it’s just that they were identified. The patch that was released is preemptive, it’s proactive, and it’s going to protect against any possible attack, or anything like that.

Wu-chang Feng: Yeah, and I think it got a lot of notoriety, because these are one-of-a-kind bug classes. Nobody’s seen this kind of hardware bug causing a bunch of security problems. That’s sort of rare. I think Intel actually is going to build in a hardware fix like what I mentioned earlier with the software fix, which is what most people are deploying for Meltdown. From what I’ve heard, Intel is going to apply a fix that will clear out — so, when you speculatively fetch a memory location and all of a sudden you realize, generate a fault, a protection fault — Intel is going to now clear out that register rather than allow that register up into the CPU, which is, I think, where the bug lies. Because once you know that there’s a memory fault, you should clear out all the state tied to that memory fault, but they hadn’t. I think for the next generation, they’ll put that in there, and then they can undo the software fix that’s causing the 5 – 15% performance penalty.

Dustin Driver: So, right now there’s a patch for Meltdown, and you said that they’re working on a hardware fix for that, for the next generation of CPUs that come out. But Spectre, there is no software patch, so again, is that something that’s going to be fixed with the next generation of CPUs?

Wu-chang Feng: Microsoft has a mitigation. I think there are a couple mitigations to Spectre. The mitigation I’ve heard is, on Windows, you have these notions of security groups, where processes belong in certain security groups based on the level of access they need. High security processes are all grouped together, and then user processes are grouped together. The idea is that — I think this is something they’re trying to push out — when you go from a process in one security group to a process in another security group, the operating system will poison the branch prediction table before passing over to the next process. And I think this will also be a performance penalty until the hardware itself clears the branch prediction table, which probably again will be in the next generation, I hope in the next generation of CPUs. That is a software mitigation, which I think — I don’t know what’s involved to reset the branch prediction table in software. It’s a lot of state that has to get cleared out, so you’d have to do a lot of poisoning of that table, but that’s their idea.

Dustin Driver: Okay. So, for now, we don’t want to frighten anybody. These are vulnerabilities that have been identified. There haven’t been any attacks or even anything engineered, any attacks that have been released. The bottom line is that people’s machines right now are safe with the new patches that have been pushed out, although you’re saying that there could be a performance penalty associated with the patches. That’s a big deal.

Wu-chang Feng: Yeah. Especially if you’re in the Cloud, and you’re paying lots of money for CPUs, you’re potentially going to be paying 5 – 15% more. That’s going to happen. I think all the Cloud providers are going to try and deploy the fixes on all their VM images. This is going to have a performance hit, and you have to budget that out if you’re heavy into the Cloud, which is one of the things that they’re talking about.

Dustin Driver: Right. That’s the main risk, is in a cloud computing situation where you have multiple machines, virtual computers, sharing a single processor or a single hard drive. There’s an opportunity there for one virtual machine to break into another one, so that’s the biggest vulnerability right there.

Wu-chang Feng: Yeah, and I think the Spectre one would be the one that that would be involved with. The Meltdown, as far as I can tell, it depends on the type of hypervisor you’re using, but for most of the cloud hypervisors, I’m under the impression that Meltdown won’t allow a VM escape, but that the actual VM itself, if an adversary is on that VM, they can get basically root access on that VM. But I think with Spectre, the speculation is that they can do a VM escape and go between VMs, and that would be pretty devastating in the Cloud. The whole point of the Cloud is to be co-resident while still maintaining some form of security, and if they can’t guarantee that — that’s almost their entire business model. Their value proposition is going to be gone.

Dustin Driver: Right. And you’re saying now there’s a penalty for one of the patches that will create, basically, an increase in pricing, because you’re going to have to pay more to get the same performance that you did before out of your cloud computing service. But for Spectre, there might not be a patch right now, so that could undermine cloud computing in general? Or is there a way to prevent that?

Wu-chang Feng: In the short term, I think cloud computing does take a little bit of a hit. I would imagine, by deploying this patch, Amazon, Microsoft, and Google are going out and buying a whole bunch more capacity, just because all the people who need computes will need a little bit more because of the slowdown of the virtual machines and the containers that they’re running. And that’s mandatory, because it’s much worse to have a marketing problem where they can’t trust you anymore. I guarantee that they’re buying 15% more capacity right about now and looking to install that capacity as soon as possible, because they almost immediately had to have taken the performance hit on all of this stuff.
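As a back-of-the-envelope check on the capacity figures above, here is a short Python sketch. It assumes the "performance penalty" means each machine delivers that much less throughput; under that assumption the extra capacity needed is slightly more than the penalty itself. The function name `extra_capacity_needed` is hypothetical.

```python
def extra_capacity_needed(slowdown):
    """Fraction of additional machines needed to keep total throughput constant
    when each machine loses `slowdown` (between 0 and 1) of its throughput."""
    if not 0 <= slowdown < 1:
        raise ValueError("slowdown must be in [0, 1)")
    return 1 / (1 - slowdown) - 1


for slowdown in (0.05, 0.10, 0.15):
    extra = extra_capacity_needed(slowdown)
    print(f"{slowdown:.0%} slower -> {extra:.1%} more capacity")
# prints:
# 5% slower -> 5.3% more capacity
# 10% slower -> 11.1% more capacity
# 15% slower -> 17.6% more capacity
```

If the penalty is instead read as each task taking 15% longer, the extra capacity needed is exactly 15%, so the figure quoted in the conversation is in the right range either way.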

Dustin Driver: What does this mean for the speed of processors moving forward? If they’re going to have to be re-engineered, is this a major setback in Moore’s law? Is this significantly going to slow down progress of making faster processors?

Wu-chang Feng: No, this is actually not going to impact — I mean, we will always be speculatively executing and doing branch prediction. For me I think it is a little bit of a bug fix, where I believe micro-architecturally you can easily clear the branch prediction table, the state in the branch prediction table. Fixing the hardware so that it clears out registers that have been loaded up due to a memory fault, I think that one is probably a low-overhead thing. It won’t be the end of speculative execution and branch prediction, it’s just that it’s forced the CPU vendors to take a second look at it. Rumors of this kind of stuff have been around since 2010, but this is the first time it’s actually been shown, it’s been exploited. Now that it’s on the radar of both Intel and AMD and the ARM folks, I think it’ll eventually go away. That’s my contention, but if it opens up a whole new class of memory-based security attacks, I’d be wrong then.

Dustin Driver: Right. You’re saying they’ve known this, theoretically, that this could happen, since 2010, but it wasn’t proven until very recently.

Wu-chang Feng: Yeah. I think Joanna Rutkowska mentioned it, and she tried to get stuff to work and wasn’t able to get a proof of vulnerability. I think in 2015 it came out again, and then it wasn’t until last year that someone was able to pull off these attacks and actually have a proof of concept for the vulnerability.

Dustin Driver: Just to be clear, though, this exploit was done in a lab by a computer scientist, who was intentionally trying to work around this vulnerability?

Wu-chang Feng: Yep.

Dustin Driver: It wasn’t a rogue hacker in his mom’s basement who figured this out. This took years of dedicated research in order to figure out a way in, so to speak.

Wu-chang Feng: Yeah. And it took a really deep knowledge about the underlying processor architecture. Now that it’s out, I would expect people are going to try and weaponize it. Now that the technique is out there, it’s probably the case that someone in the basement is going to try and weaponize it. Stick it in JavaScript and then deliver it to your browser to see if it works. That’s the fear. Even when I mentioned, yeah, you do need to patch your personal devices. Because you’re executing JavaScript from just about everywhere, if they can deliver you JavaScript that does this attack, they can pull stuff out of your, for example, password manager or kernel memory, these sorts of things. Yeah, when the update comes — I think it already has been delivered on most Microsoft — some of them. They had to actually withdraw it, because there were some bugs in the patches. But eventually, you’d want to apply these patches on your own personal computers.

Dustin Driver: Does this mean that we should all be waiting eagerly for the next release of the Intel or AMD processors, and have our checkbooks ready to buy a new machine? Does this render all of the existing hardware useless, or is it just use at your own risk?

Wu-chang Feng: No, I wouldn’t do that! I’m not chomping at the bit to buy a new CPU, but I will pay attention to when it’s weaponized, if it’s been weaponized, and I will be installing the updates, because there are software mitigations that will make this harder to pull off. I would pay attention to that. I think right now, if the Microsoft fix and the Linux fixes have been pushed out for both of these vulnerabilities, I feel confident in keeping with that. As a habit, people should just be updating all the time, and I think we’re ahead of the curve in terms of both of these vulnerabilities, in terms of the exploits versus the patch level. As far as I can tell, I think we’re okay right now. There’s no need to sound a bunch of alarm bells. The cloud providers are doing what they need to do. It’s going to cost them money. As long as we patch these things — definitely for the Meltdown one, a mandatory patch that everyone should do, and that’ll fix that issue. The Spectre one, I think they’re still working on some mitigations, but they have some initial ones. I would do those. It wouldn’t be urgent, just because that one seems much harder to do. In fact, the Project Zero folks released one of their exploits, and it took them a long time to engineer that exploit. With that level of effort, I feel a little more confident that this is not just something some person in a basement can execute.

Dustin Driver: And Project Zero, they’re security experts, they’re computer scientists. They’ve been working on this for years, and their goal is to make computers safer, essentially. They’re trying to break machines, break software, in order to engineer ways to make them safer.

Wu-chang Feng: And they are the elite. I consider them to be the elite. I would say they’re the top of the hacking food chain right now. But you never know. Some of these governments probably have a bunch of talent in there trying to break things. Maybe nation-states have the capacity to develop and weaponize these things quickly, but it seems like it takes a lot of time and engineering effort to actually pull this thing off.

Dustin Driver: Scary stuff. This is just one in a long line of vulnerabilities — the question is what makes this different? There have been multiple software vulnerabilities, and systems are hackable. People are actually hackable. I think that in popular culture we’ve all learned how easy it can be to hack someone, just from shows like Mr. Robot. What makes this so different, and why has it captured so much attention?

Wu-chang Feng: Just because it targets a mechanism that nobody has targeted before. But it’s a hard exploit. It’s a hard thing to target, and it’s super, super involved. Most hackers probably wouldn’t even go through the trouble of doing this, because it’s much easier and much cheaper to hack the person. Just phish the person, get their password, and then you get everything you need to out of the lower-hanging fruit to get access. That’s why I’m not too concerned in terms of this thing taking off and compromising this, that, or the other. I think you’re right, we have much more serious vulnerabilities that we need to take care of, like default usernames and passwords, people reusing passwords, people falling for the phishing attacks — these are sort of things that are happening all the time, everywhere. It’s much more beneficial for us to spend our time working on mitigating those right now than to be worrying about Spectre and Meltdown, which I think might affect a tiny, tiny percent of people. Basically zero percent right now, because I don’t think anyone has actually done an exploit. I agree with your thinking.

Dustin Driver: Use a strong password. I think that’s a better way to stay safe. Don’t buy a new computer, just use a strong, alphanumeric password.

Wu-chang Feng: We run high school summer camps, and the things that we talk about are continually updating your software, we teach them how to use a password manager and get unique, strong passwords on everything that they have, and we talk about phishing attacks, and training them to identify phishing. These are things that I think are going to be much more effective than having them focus on the vulnerability of the day. It makes for good news articles, but in terms of shifting the bits in security, I don’t think it does as much as actually training people on these other vectors that people are continually getting compromised with.

Dustin Driver: Right. Another reason this has been such a surprise is it does require chip-level re-engineering, and that could be a first. I don’t remember that ever being the case before in any security flaws.

Wu-chang Feng: Yeah! I can’t think of a time. Oh, the other bug that’s out there is the Intel AMT chip bug, which was really serious. They did things like, if you hit username and a null password, it would let you in. So there are some crazy things that got baked into hardware, but that’s not the core CPU. This was sort of like a different management engine on the CPU. But yeah, these things are really rare, and it’s really rare because it affects every single CPU. I guess that might be why it’s taking off. Well, the other thing is the New York Times really sensationalized it. That probably wasn’t helpful either. Because the impact of this thing — yeah, 5 – 15% of extra capacity, that’s quite an impact, but the scaremongering was a little bit unnecessary.

Dustin Driver: So it’s more of an economic impact than it is a security impact right now. Actually, it’s all economic impact. That 10 – 15% extra capacity, which is a big deal when you’re running a lot of computing power. That’s a big deal. We’ll have to see how that plays out. Well, that’s really great.

Dustin Driver: So, Spectre and Meltdown. It actually turns out that we won’t need to put our laptops in the trash compactor. Well, maybe I will. It’s been acting up lately. But anyway, for now, it seems that we’re safe. There are software patches, there are mitigations, and there’s a whole new generation of chips coming out that are going to be engineered to get around the Spectre and Meltdown attacks. So that’s good news. Of course that doesn’t guard against any other type of attack, so remember, stay safe, use a password manager, use very long, very complicated, very nonsensical, and almost completely impossible to remember passwords, and whatever you do, do not trust a Nigerian prince. Well, maybe if you meet him in person, but that’s different.

Dustin Driver: Alright, that’s it. Thanks a lot. You can find out more about me at dustindriver.com. You can learn more about Spectre and Meltdown at the awesome site meltdownattack.com, by Graz University of Technology in Austria. The site has a very simple breakdown of both exploits and also some technical white papers, if you are so inclined.

Dustin Driver: Music is from epidemicsound.com. Go check them out. They have a huge variety of really excellent, royalty-free music. Thanks again, and tune in next time.
