M2 Pro and M2 Max

AMD isn't using advanced packaging tech like Apple, just conventional organic substrates. The IFOP (Infinity Fabric On-Package) links connecting their Core Complex Dies (CCDs) to the IO die are narrow and run at a high clock rate per wire, which means they require SERDES.

It's a bit harder to be certain about Apple's setup as they give out even less detail than AMD, but the numbers we do have say their interface is very wide and slow. On the one hand, power per bit lane should be way lower with Apple's approach - AMD's packaging means transmission lines are longer and should require a lot more energy to yank around, and needing a SERDES for each wire costs both silicon and power. On the other hand, Apple has a lot more wires in their die-to-die interconnect. I have no great feel for who wins here.

Apple arguably had a tougher problem to solve: they had to make their solution support splitting the GPU in two. In AMD's Zen desktop products, they don't have as large a GPU, and it's always completely contained within the IO die (which is where the memory controllers live), so no need to pollute the IFOP links with GPU traffic. While Apple's GPU cores probably don't need to talk to each other much, they should need to talk to memory attached to the other die quite a bit.

Yet another difference between the two systems is that there is no local DRAM memory controller on a CCD, so AMD's CPU cores always have to go off-die to service last-level cache misses, and the IFOP links are almost certainly higher latency than Apple's interconnect. The tradeoffs here are just so very different.

As an aside, in this realm, people usually measure in units of picojoules per bit transported from one chip to another. Kind of a neat unit.
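To put rough numbers on why that unit is handy, here's a back-of-envelope sketch. The pJ/bit value is just an assumed ballpark, not a measured figure for either vendor; the 2.5 TB/s is Apple's advertised UltraFusion bandwidth for the M1 Ultra:

```swift
// Rough interconnect power math: power = energy per bit × bits per second.
// The pJ/bit value is an assumed ballpark, not a measured figure for either
// vendor; 2.5 TB/s is Apple's advertised UltraFusion bandwidth.
let energyPerBitJoules = 1.0e-12      // 1 pJ/bit (assumed)
let bandwidthBytesPerSec = 2.5e12     // 2.5 TB/s
let bandwidthBitsPerSec = bandwidthBytesPerSec * 8

let watts = energyPerBitJoules * bandwidthBitsPerSec
print("≈ \(watts) W to run the die-to-die link flat out")   // ≈ 20 W at 1 pJ/bit
```

At 1 pJ/bit that's about 20 W at full tilt; at 0.2 pJ/bit it would be about 4 W, which is why this one number dominates the comparison.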


My intuition, for what it's worth, is that it probably isn't UltraFusion. UltraFusion seems intentionally overengineered, as if Apple wanted to make certain that if there were problems scaling their GPU up to be huge, the interconnect wouldn't be one of them.

Informative, thanks @mr_roboto!
I don't think there's much pure forum posting can do to provide definitive answers. Best approach I can think of is acquiring three M1 systems (pro, max, and ultra), getting real familiar with Metal, and going to town writing microbenchmarks. Wherever you find things which don't scale the same from Pro to Max to Ultra, you can investigate with Apple's performance monitor / counter features, which I've heard are pretty good. Apple provides these tools to enable developers to figure out performance bugs in their code, but they should also be able to provide some amount of insight into why some particular thing doesn't scale as well from Max to Ultra as you'd expect.
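In case anyone wants to try this, the skeleton of such a microbenchmark is pretty small. Here's a sketch that just times GPU blit copies between two private buffers; the buffer size and iteration count are arbitrary choices, and a real test would sweep sizes and also exercise compute kernels, not just the blit engine:

```swift
import Metal
import Foundation

// Minimal GPU copy-bandwidth microbenchmark sketch.
// Buffer size and iteration count are arbitrary; a real test would sweep them.
guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else { fatalError("No Metal device") }

let length = 512 * 1024 * 1024  // 512 MB per buffer
guard let src = device.makeBuffer(length: length, options: .storageModePrivate),
      let dst = device.makeBuffer(length: length, options: .storageModePrivate) else {
    fatalError("Buffer allocation failed")
}

let iterations = 20
let start = Date()
for _ in 0..<iterations {
    guard let cmd = queue.makeCommandBuffer(),
          let blit = cmd.makeBlitCommandEncoder() else { continue }
    blit.copy(from: src, sourceOffset: 0, to: dst, destinationOffset: 0, size: length)
    blit.endEncoding()
    cmd.commit()
    cmd.waitUntilCompleted()
}
let seconds = Date().timeIntervalSince(start)
// Each iteration reads `length` bytes and writes `length` bytes.
let gbPerSec = Double(iterations * length * 2) / seconds / 1e9
print(String(format: "Blit copy bandwidth: %.1f GB/s", gbPerSec))
```

Anything that scales oddly from Pro to Max to Ultra in a test like this is then a candidate for digging into with the performance counters.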
Yeah, I've been hoping for more tests from reviewers, but of course that isn't financially plausible: it means running a lot of tests on now-old products, and even interesting results (which aren't guaranteed) won't draw as many eyeballs. Well, maybe now that the M2s are out, more than just MaxTech will revisit.
 
The microbenchmarks might not even be needed. Even a single frame capture of something like Apple’s sample code Modern Rendering with Metal would go a long way towards understanding the issues with M1 Ultra scaling (the profiling tools are indeed quite awesome). Although I suspect that sample code would run close to the ideal scaling.

The fact that the M2 Pro/Max seem to scale better is a second data point but, as you say, not enough to draw any conclusions, other than that memory bandwidth wasn't the issue (as that hasn't changed). But it does seem like whatever it is would show up nicely in the frame capture, as there are several cues that something is hard-limiting the scaling (like those figures showing that M1 Ultra got nowhere near using 2x the power of an M1 Max).
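For anyone who wants to poke at this themselves: a capture doesn't have to be taken through the Xcode UI; it can be triggered programmatically around just the work you care about. A minimal sketch (the output path is an arbitrary example, and capture has to be enabled, e.g. by launching from Xcode or with METAL_CAPTURE_ENABLED=1 in the environment):

```swift
import Metal
import Foundation

// Sketch: programmatically capture GPU work to a .gputrace file that can be
// opened in Xcode's Metal debugger. The output path is an arbitrary example.
guard let device = MTLCreateSystemDefaultDevice() else { fatalError("No Metal device") }

let captureManager = MTLCaptureManager.shared()
let descriptor = MTLCaptureDescriptor()
descriptor.captureObject = device
descriptor.destination = .gpuTraceDocument
descriptor.outputURL = URL(fileURLWithPath: "/tmp/scaling-test.gputrace")

do {
    try captureManager.startCapture(with: descriptor)
    // ... encode and commit the command buffers you want to inspect here ...
    captureManager.stopCapture()
} catch {
    print("Capture failed: \(error)")
}
```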
 
From https://www.ifixit.com/News/71442/tearing-down-the-14-macbook-pro-with-apples-help :

"The [16 GB] M1 Pro has an 8GB Samsung LPDDR5 RAM module on either side of the core while the [16 GB] M2 Pro has two SK Hynix 4GB LPDDR5 RAM modules on either side of the core—a total of four. These are the very same RAM modules we found in the M2 MacBook Air....But why is Apple using four RAM modules this time around instead of two larger ones? I posed this question to Dylan Patel, Chief Analyst at SemiAnalysis, who said 'ABF substrates were in very short supply when Apple made the design choice. By using four smaller modules rather than two larger ones, they can decrease routing complexity within the substrate from the memory to the SoC, leading to fewer layers on the substrate. This allows them to stretch the limited substrate supply further.' "

[To allow easier comparison, I edited iFixit's pics so both boards are oriented in the same direction.]



[Attached image: 1675145590036.png — iFixit's M1 Pro and M2 Pro board photos, reoriented to match]


Aside from the obvious RAM difference, the M2 has lots of extra components at top center and bottom center (inside the regions with the curved boundaries). Anyone know what these are?:


[Attached image: 1675146119565.png]



[Attached image: 1675146071240.png]
 
The circuitry inside those regions is a set of point-of-load switchmode power supplies (SMPS), or what the PC world often describes as VRMs. They convert battery or external DC power down to the ~1V DC (or less) power rails used by the big high performance chips.

The extra components in the M2 version of these power supplies look like electrolytic capacitors, which are used in SMPS essentially to stabilize output voltage against rapid changes in current demand from the load. Why'd they use more this time around? I'm not a real power supply designer, but I know just enough to say that there's probably lots of possible reasons and most of them are quite boring.

One thing I can tell hasn't changed: there are three main SMPS supplies for both M1 and M2. These don't look like multi-phase supplies, and each single-phase SMPS output stage uses one inductor (the big two-terminal components with R22 printed on top). Both M1 and M2 have a total of three such inductors inside the partially curved boundaries, so that's three distinct outputs.

You might be wondering what those boundaries are. They're just metal shields soldered to the board. Ordinarily, there would be a piece of sheet metal (or perhaps a metal foil sticker) on top; you can see the glue residue from iFixit pulling the covers off. Apple seems to like putting a full Faraday cage around high-power SMPS circuits to contain electromagnetic interference (EMI). There are two main reasons I can think of for going to this much trouble: regulatory compliance, and improving BT/WiFi radio performance.
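As a footnote to the capacitor and inductor discussion above, here's the kind of back-of-envelope arithmetic involved. Every number is an assumption picked for illustration (input voltage, switching frequency, load step, allowed droop), except the 0.22 µH suggested by the R22 marking:

```swift
import Foundation

// Back-of-envelope buck-converter numbers. Every value here is an illustrative
// assumption (not a measurement of Apple's design), except the 0.22 µH implied
// by the "R22" marking on the inductors.
let vIn = 5.0            // input rail, V (assumed)
let vOut = 1.0           // SoC core rail, V (assumed)
let fSw = 1.0e6          // switching frequency, Hz (assumed)
let inductance = 0.22e-6 // H, from the R22 marking

// Inductor ripple current in a buck converter: ΔI = (Vin − Vout) · D / (L · f_sw),
// where the duty cycle D = Vout / Vin.
let duty = vOut / vIn
let rippleAmps = (vIn - vOut) * duty / (inductance * fSw)

// Output capacitance needed so a sudden load step doesn't sag the rail too far
// before the control loop reacts: C ≈ ΔI_load · Δt / ΔV_allowed.
let loadStepAmps = 10.0   // extra current suddenly demanded by the SoC (assumed)
let responseTime = 2e-6   // seconds before the converter catches up (assumed)
let allowedDroop = 0.03   // volts of sag tolerated (assumed)
let capFarads = loadStepAmps * responseTime / allowedDroop

print(String(format: "ripple ≈ %.1f A pk-pk, bulk capacitance needed ≈ %.0f µF",
             rippleAmps, capFarads * 1e6))
// → ripple ≈ 3.6 A, C ≈ 667 µF with these assumptions; extra caps buy margin.
```

With those made-up numbers you already need hundreds of µF of bulk capacitance per rail, which is one boring-but-plausible reason for sprinkling more caps around in the M2 version.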
 
PassMark testing also reports a frequency difference between the M2 Pro (3.5 GHz) and M2 Max (3.7 GHz).

I'm surprised no one has devoted an article or YouTube video to this, since it does seem interesting.

[Attached image: 1675669693302.png — PassMark results]

 

I've seen this mentioned in a few places, though always as a side note. It's rather interesting. Yet another thing making me curious about the M2 Ultra.
 
Also, I looked through the first three pages of recent GB results for both the M2 Max and M2 Pro, picked the three highest SC scores for each (which should be those run under optimal conditions), and averaged them. That gives a difference of {2076, 2075, 2075} vs. {1971, 1970, 1970}, about 5.3%, which is reasonably close to the 3680/3480 ≈ 5.7% expected based on the frequency difference.

[Note: The higher clock is confined to the 16" M2 Max specifically (14,6).]
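For the record, the comparison above boils down to this bit of arithmetic (scores and clocks as quoted above; the little averaging helper is just illustrative):

```swift
import Foundation

// Compare the Geekbench single-core uplift to the clock-frequency uplift.
// Scores are the top-3 results quoted above; clocks as in the post above.
let maxScores = [2076.0, 2075.0, 2075.0]   // 16" M2 Max
let proScores = [1971.0, 1970.0, 1970.0]   // M2 Pro

func mean(_ xs: [Double]) -> Double { xs.reduce(0, +) / Double(xs.count) }

let scoreRatio = mean(maxScores) / mean(proScores)   // ≈ 1.053
let clockRatio = 3.680 / 3.480                       // ≈ 1.057
print(String(format: "score uplift %.1f%%, clock uplift %.1f%%",
             (scoreRatio - 1) * 100, (clockRatio - 1) * 100))
```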
 
I dunno where I saw this, so I could be wrong, but Apple is doing a single-core boost now, at least on the 16" with high power mode. Supposedly only the full M2 Max (on the 16") runs at the higher clock, though. I've not really seen it talked about, so I'm not sure if this is true.
 
Yeah, based on the model nos. the higher clock appears to be confined to the 16" Max (14,6), so I'll add a note about that to my post. But I suspect that it's independent of high power mode, since essentially all the 16" M2 Max results I saw showed 3.7 GHz clock, and I suspect at least some didn't bother to turn on that mode before running GB. From my earlier post on this:

 