New side-channel attack against Apple CPUs

jbailey

Power User
Posts
170
Reaction score
187

Memory prefetching is the problem.
It will be interesting to see whether someone can get the attack running in a browser with JavaScript, but otherwise it doesn't seem very troubling. I think I would notice if something were running in the background and tying up one of my clusters for a minimum of 26 minutes.
 

mr_roboto

Site Champ
Posts
291
Reaction score
469
This has the feel of yet another academic security research paper where yes, they found a side channel attack, but it probably isn't practical to exploit in the real world. There have been a lot of papers like that which resulted in no mitigations.
 

jbailey

Power User
Posts
170
Reaction score
187
This has the feel of yet another academic security research paper where yes, they found a side channel attack, but it probably isn't practical to exploit in the real world. There have been a lot of papers like that which resulted in no mitigations.
It’s important because we don’t know whether there are simpler attacks. I just don’t think this version is going to be a risk for Mac users, but the research is important.
 

tomO2013

Power User
Posts
101
Reaction score
182
They only mention that the affected prefetching behaviour can be disabled on M3 by a special bit, but that still suggests M3 is fundamentally affected too, just possibly to a lesser extent and with a smaller performance penalty.

@Cmaier : Appreciating that you are not working at Apple, but really just interested in getting your experienced take on this one...
At this stage in M4's development schedule (assuming readiness for a November release time frame), or even the iPhone's A18 in a September/October time frame, would it be too late in the day to make small microarchitectural security tweaks to the next generation of Apple silicon chips (assuming the fix is relatively small)?
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,349
Reaction score
8,551
They only mention that the affected prefetching behaviour can be disabled on M3 by a special bit, but that still suggests M3 is fundamentally affected too, just possibly to a lesser extent and with a smaller performance penalty.

@Cmaier : Appreciating that you are not working at Apple, but really just interested in getting your experienced take on this one...
At this stage in M4's development schedule (assuming readiness for a November release time frame), or even the iPhone's A18 in a September/October time frame, would it be too late in the day to make small microarchitectural security tweaks to the next generation of Apple silicon chips (assuming the fix is relatively small)?

It’s hard to know because I don’t know what their tape-out lead time is and how long it takes them to turn a chip iteration. I would assume that they’d need working early lots in September or so. Tapeout until rocket lots used to be about 3 months. You wouldn’t want to make anything other than critical bug fixes 6 months before you expected to ship die.

At AMD we were cowboys. On the day of tapeout, probably for the original Opteron (it’s been so long I can’t remember), one of the units needed a bug fix and we figured out how to implement it by moving some metal around. In order to accomplish that, I opened the mask in vim and hand-edited the file to move rectangles of metal around. I imagine Apple wouldn’t take that sort of chance, and would make the changes in the original source files, run the source files through the tools, etc. I would imagine at least a week to figure out how to fix the problem optimally, and 2 weeks to implement and verify the change.
 

dada_dave

Elite Member
Posts
2,174
Reaction score
2,171
Hector Martin believes that the same bit that formally exists in the M3 to turn the DMP off almost certainly exists as a “chicken bit” in the M1/M2. That doesn’t mean a patch is trivial, but it should be doable, at least to the extent that the M3 may not be vulnerable.

 

Nycturne

Elite Member
Posts
1,141
Reaction score
1,492
It'll be interesting to see if they can find the chicken bit for this. Now I'm curious to see how this plays out. It sounds like the researchers may even be able to give Hector Martin a repro to try on Asahi to see if they can track down the chicken bit.
 

tomO2013

Power User
Posts
101
Reaction score
182
I'm going to put my tinfoil hat on here for a second, but when I learnt of Spectre on x86 (and how long it had been present), and possibly to a lesser extent this GoFetch vulnerability on Apple Silicon, I honestly wondered if this was an intentional backdoor for security agencies...
Now before anybody says that I'm off my rocker, let me just point out that there is a history of this with the CIA/NSA in the US and MI5/MI6 in the UK.

To be honest, at least with Spectre I wouldn't be at all surprised, given that Windows is the dominant desktop operating system.
 

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,349
Reaction score
8,551
I'm going to put my tinfoil hat on here for a second, but when I learnt of Spectre on x86 (and how long it had been present), and possibly to a lesser extent this GoFetch vulnerability on Apple Silicon, I honestly wondered if this was an intentional backdoor for security agencies...
Now before anybody says that I'm off my rocker, let me just point out that there is a history of this with the CIA/NSA in the US and MI5/MI6 in the UK.

To be honest, at least with Spectre I wouldn't be at all surprised, given that Windows is the dominant desktop operating system.
I’m 100% sure it wasn’t intentional. The entire time I was designing CPUs, it never occurred to us to even look for side-channel attack vectors. I never even heard of side-channel attacks until my last year at AMD, and that was only because I sat in on a hearing involving patents on preventing side channel attacks. Unless you were designing an embedded crypto processor, you just weren’t even considering the issue, let alone making your CPU less efficient in order to avoid the problem. Unless you did so by accident, like if you were designing in full-differential logic (and that wouldn’t have stopped Spectre-style attacks).

It wasn’t until the early 2000s that CPUs even became complicated enough to let some of these sorts of attacks work, so it just wasn’t on anyone’s radar.
 

Nycturne

Elite Member
Posts
1,141
Reaction score
1,492
I'm going to put my tinfoil hat on here for a second, but when I learnt of Spectre on x86 (and how long it had been present), and possibly to a lesser extent this GoFetch vulnerability on Apple Silicon, I honestly wondered if this was an intentional backdoor for security agencies...

It's a pretty poor back door. It looks like strong tinfoil territory to me.

But those things you linked? Absolutely. That's generally the approach I would take as a nation state actor: Be the MitM. And it's a clear pattern. If you "own" the encryption, then you get to use all the existing wiretapping experience on whomever you want. It seems more the style to use 0-days if going after specific actors (Stuxnet for example). If the CIA could use this exploit against a valuable target, they would. And I have zero doubt they are collecting exploits to use. I just don't think they are asking for these to be introduced.
 

Yoused

up
Posts
5,636
Reaction score
8,969
Location
knee deep in the road apples of the 4 horsemen
They are being deliberately vague about how it works. I assume it is another timing attack like Spectre or Meltdown. My question is, can noise or reduced resolution be introduced into the timing register to make this data harder/impractical to discover? If they were to reduce the timing resolution to half a millisecond, it seems like most programs would be able to work with that and the malware would become ineffective. Just mask off the bottom of the timing register at EL0.
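A minimal sketch of the masking idea, in Python purely for illustration (a real change would live in hardware or the kernel, not in user code). The 24 MHz counter frequency and all function names here are assumptions, not something stated in this thread. The sketch only shows how many low bits would have to be cleared to get roughly half-millisecond granularity.

```python
# Toy model of coarsening a free-running counter by masking its low bits.
# ASSUMPTION: a 24 MHz counter; the real frequency depends on the chip.
COUNTER_HZ = 24_000_000


def coarsen(ticks: int, drop_bits: int) -> int:
    """Clear the lowest `drop_bits` bits of a raw counter value."""
    return ticks & ~((1 << drop_bits) - 1)


def resolution_seconds(drop_bits: int, hz: int = COUNTER_HZ) -> float:
    """Smallest time step still visible after masking."""
    return (1 << drop_bits) / hz


def bits_for_resolution(target_seconds: float, hz: int = COUNTER_HZ) -> int:
    """Fewest masked bits that give a step of at least `target_seconds`."""
    bits = 0
    while resolution_seconds(bits, hz) < target_seconds:
        bits += 1
    return bits
```

Under that 24 MHz assumption, `bits_for_resolution(0.0005)` comes out at 14 bits, a step of about 0.68 ms, so two reads taken within the same step return the same value.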
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
If I understood the paper correctly the basic idea is:
- Apple's memory prefetching thingy prefetches everything that looks like a pointer (under certain conditions)
- There are ways to tell if the target address has been prefetched, even from a different process that doesn't have access to that memory
- If they can get the internal state of a cryptography algorithm to resemble a pointer at a specific stage that may or may not be reached based on something that depends on the secret key, they can know if that stage has been reached, which provides information about the secret key
- It's possible to crack some algorithms with the knowledge above alone

If so, it's worrying, but it seems like it would require devising a complex plan for every algorithm one is trying to crack. That plan may not even be possible: it depends on whether the algorithm's inputs can be made to resemble pointers at extremely specific steps of its execution, and on whether those steps leak enough information to reconstruct the key just from knowing whether each step was reached.
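The idea in the list above can be sketched as a toy model. Everything here is hypothetical: the "looks like a pointer" rule, the mapped region, and the victim step are made-up stand-ins, since the prefetcher's real conditions are not spelled out in this thread. The sketch only shows how "was this address prefetched?" can stand in for one secret bit.

```python
# Toy model only. ASSUMPTION: a 64-bit word "looks like a pointer" when it
# falls inside a mapped region; the real prefetcher's rules may differ.
MAPPED_REGIONS = [(0x1_0000_0000, 0x1_0010_0000)]  # hypothetical [start, end)


def looks_like_pointer(word: int) -> bool:
    return any(start <= word < end for start, end in MAPPED_REGIONS)


def prefetched_addresses(buffer: list) -> list:
    """Addresses a pointer-chasing prefetcher would touch while scanning `buffer`."""
    return [w for w in buffer if looks_like_pointer(w)]


def secret_dependent_state(secret_bit: int, attacker_input: int) -> int:
    """Hypothetical victim step: the intermediate value keeps the attacker's
    pointer-like input only when the secret bit is 1."""
    if secret_bit:
        return attacker_input
    return attacker_input ^ 0xFFFF_0000_0000_0000  # no longer pointer-like


def leak_bit(secret_bit: int) -> int:
    probe_target = 0x1_0000_4000  # attacker-chosen value inside the mapped region
    state = secret_dependent_state(secret_bit, probe_target)
    # The attacker never reads `state`; it only learns whether `probe_target`
    # was prefetched, which a cache side channel can reveal.
    return int(probe_target in prefetched_addresses([state]))
```

In this toy, `leak_bit(1)` returns 1 and `leak_bit(0)` returns 0. A real attack would need many such observations, and an algorithm whose intermediate state can actually be steered this way.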
 

mr_roboto

Site Champ
Posts
291
Reaction score
469
If I understood the paper correctly the basic idea is:
- There are ways to tell if the target address has been prefetched, even from a different process that doesn't have access to that memory
As I understand it, this part's a classic cache timing side channel attack. Set up an array of memory locations, get it all in cache making sure to completely fill L1 with your array, ask the victim to do something for you, test array locations to see which ones are slower to read and therefore got evicted. From this, you can retrieve at least some bits of the addresses of memory references (or, in this case, prefetches) done by the victim process.
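A rough simulation of that prime-and-probe pattern, as a sketch rather than a working attack. The cache geometry (8 sets, 2 ways, 64-byte lines), the LRU policy, and the names `ToyCache` and `prime_probe` are assumptions chosen for illustration, and a simulated miss stands in for a "slow" read that a real attacker would have to detect by timing.

```python
from collections import OrderedDict

LINE = 64  # bytes per cache line (assumed)
SETS = 8   # number of sets in the toy cache
WAYS = 2   # lines per set


class ToyCache:
    def __init__(self):
        self.sets = [OrderedDict() for _ in range(SETS)]

    def access(self, addr: int) -> bool:
        """Touch `addr`; return True on a hit, False on a miss (a slow read)."""
        line = addr // LINE
        s = self.sets[line % SETS]
        hit = line in s
        if hit:
            s.move_to_end(line)        # mark as most recently used
        else:
            if len(s) >= WAYS:
                s.popitem(last=False)  # evict the least recently used line
            s[line] = True
        return hit


def prime_probe(cache: ToyCache, victim_addr: int) -> list:
    # Attacker array: WAYS lines per set, enough to fill the whole cache.
    attacker = [(way * SETS + s) * LINE for way in range(WAYS) for s in range(SETS)]
    for a in attacker:         # prime: fill the cache with attacker lines
        cache.access(a)
    cache.access(victim_addr)  # victim access (or prefetch) evicts one attacker line
    slow_sets = set()
    for a in attacker:         # probe: misses reveal which set was disturbed
        if not cache.access(a):
            slow_sets.add((a // LINE) % SETS)
    return sorted(slow_sets)
```

For example, `prime_probe(ToyCache(), 0x10000 + 3 * LINE)` returns `[3]`: the attacker learns the set index, i.e. a few bits of the victim's address, without ever reading the victim's memory.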
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
As I understand it, this part's a classic cache timing side channel attack. Set up an array of memory locations, get it all in cache making sure to completely fill L1 with your array, ask the victim to do something for you, test array locations to see which ones are slower to read and therefore got evicted. From this, you can retrieve at least some bits of the addresses of memory references (or, in this case, prefetches) done by the victim process.
Yes, that was my impression as well. Do you know how limiting that approach is in practice? I take it that any other process perturbing the cache state can make the timing results meaningless, so I'm not sure how that's avoided. The paper, as often happens, is very eloquent on the strengths of the method, not so much on its weaknesses 😅

In any case, it looks like while the general case "can't be patched in software", any given implementation can be made immune to this side-channel attack. And they mention that Intel's 13th Gen has a similar prefetcher that could cause the same kind of vulnerabilities. So maybe cryptography algorithms will simply evolve to defend against this kind of attack (the paper mentions a couple of possible approaches).
 

mr_roboto

Site Champ
Posts
291
Reaction score
469
Yes, that was my impression as well. Do you know how limiting that approach is in practice? I take it that any other process perturbing the cache state can make the timing results meaningless, so I'm not sure how that's avoided. The paper, as often happens, is very eloquent on the strengths of the method, not so much on its weaknesses 😅
Yeah, that's the problem with this class of attack. They built attacker and victim processes for the demo which have to run for somewhere on the order of hours of CPU time to extract one key. I have to assume that time gets far worse when there's something else injecting noise into the side channel!
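To put a rough shape on how noise inflates the work, here is a toy calculation, not a model of the actual attack. It assumes each timing measurement is independently wrong with some probability and that the attacker takes a majority vote over repeated measurements. Both assumptions and the function names are mine.

```python
from math import comb


def majority_error(p_wrong: float, n: int) -> float:
    """Probability that a majority vote over n independent samples is wrong,
    when each sample is wrong with probability p_wrong (n should be odd)."""
    need = n // 2 + 1
    return sum(
        comb(n, k) * p_wrong**k * (1 - p_wrong) ** (n - k)
        for k in range(need, n + 1)
    )


def samples_needed(p_wrong: float, target_error: float, limit: int = 10_001) -> int:
    """Smallest odd n whose majority vote errs at most `target_error` of the time."""
    for n in range(1, limit + 1, 2):
        if majority_error(p_wrong, n) <= target_error:
            return n
    raise ValueError("noise too high for this sample limit")
```

With a 10% per-sample error rate, `samples_needed(0.1, 0.01)` is 5. Raising the per-sample error rate toward 50% makes the required count grow quickly, which fits the intuition that a noisy machine stretches the attack time.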

Also, when it comes to developing this technique into an exploit against Macs, many of Apple's secure system services are built around keeping secrets inside the Secure Enclave rather than ever letting them exist in main system memory. The SEP doesn't participate in Arm cache coherency and should be impossible to attack this way.

P.S. Just realized that I misspoke re: L1. If this is same-cluster only, that implies they're looking at timing effects in L2, the cluster-level shared cache.
 

leman

Site Champ
Posts
643
Reaction score
1,197
To me the public discussion around this sounds like a storm in a teacup. I just don't see how this exploit is practical. In the real world attackers don't have the ability to call the cryptographic function hundreds of times per second, they don't always know the precise algorithm of that function, and they cannot prevent other threads from modifying the L2 cache.

Frankly, I am much more worried about things like RowHammer where an attacker can actually influence the execution state without having the privilege.
 

Andropov

Site Champ
Posts
620
Reaction score
780
Location
Spain
Interesting but unconfirmed theory from someone I believe works or worked at Apple.
Meh, I think the paper is reasonably clear that not every algorithm/implementation is vulnerable. At no point did they try to imply that every cryptographic library was vulnerable, just that it's highly likely that there are more than the four they identified (which is definitely plausible). It's mostly the media that has blown the issue out of proportion.
 