M5 Pro and Max unveiled

The adjective übereinanderliegende definitely implies that they overlap vertically, though I’ll defer to our native speakers.

This could already be wrong in the original German, but "übereinanderliegend" means more directly stacked on each other than just overlapping (which would be "überlappend" in German; yeah sometimes English and German are very similar).
 
Yeah, in my mind I translate as “laid over each other” and then in patent law I argue about whether or not they have to touch or just be above one or another, because that’s what patent litigators do all day.
 
It means something like:

The M5 Pro and M5 Max use a new fusion architecture that connects two superimposed dies (the silicon carrier plates that hold the transistors).
That seems compatible with my suggested interpretation that they misunderstood or poorly described two chiplets each "stacked" on a substrate, but not each other. But it's not proof at all, either.
 

Of course patent law has to be a lot more precise. And in theory you could just have 100% overlap.
Since this is from an interview that seems to have already been translated from English to German, I wouldn't put too much weight on the wording.

Off-topic: reminds me of the story where someone automatically translated "The spirit is willing but the flesh is weak" from English to Russian and then back again, arriving at: "The whiskey is strong but the flesh is rotten."
 
Interview in German (I used Safari to translate) with Apple employees, including Anand! The topics are the new performance core, the new Fusion architecture, and a couple of other things.
Thank you for the interview. It's good to hear exactly why Apple decided to change the name of their top-tier core. I love hearing interviews about products. It makes sense to me.
 
What’s strange is that the Standard test shows the 14” taking a beating, but the Extended one is more equal. I don’t know what to make of it.

Edit: I see you answered this above!

Edit 2: Hmmm, still confused.
So he's published PugetBench results for the 16" Max:


It also shares the same issue on the Lightroom Classic Standard bench: it's better than the 14" Max, but still not close to the 16" Pro. Further, all three Extended scores are pretty much the same, within a couple of percent. To riff on your idea that he swapped the reported scores for the Pro/Max (I don't think that's the case), I'm wondering if he accidentally ran the Extended version twice on the Pro chip and never ran the Standard version. 12739 is a little higher than the other M5 Max/Pro Extended scores, and I don't know the variability of this benchmark, but it's only 2.5% above the next-highest Extended score while being 20% higher than the M5 Max's Standard score in the same 16" chassis. It looks to me like he may have run the Extended test twice on the Pro chip.
 
Guessing they're just re-named, because even Apple's efficiency cores are more performant than Intel's P cores.
On what planet? Because it's not on this one. Apple's E cores are extremely good, but they're very far from *that* good. And yes, they use a tiny fraction (<10%) of the energy (and power), but that's not the claim here.

It's been pretty well established that the M (aka new "P") cores are NOT just a rename, and averaging roughly 70% of the Apple P core's performance, they do indeed get fairly close to Intel's best P cores - within maybe 10-12%, give or take. But they too don't beat them on pure performance.
 
With TSMC's FinFlex (and GAAFlex on N2), it seems like you could have a core that is a little beefier than a traditional e-core that could be more efficient with smaller gates and more performant with strategically placed larger gates (which are faster but draw a bit more power). It would simplify the implementation, since you would not have to muck around with designing a third core (which probably would not be that difficult, but not having to do so would save a lot of work). I am not sure if this sort of thing would work, but at the level that e-cores have reached, it might be just enough.
 
I don't think that would work. The physical size of the gates is different. So either there wouldn't be room for your gates in the performance variation (impossible design), or there'd be wasted space in the efficiency variation (space-inefficient design). Neither outcome is desirable.
 
I don’t even understand the proposal. These are standard-cell designs, largely. For any given logic cell, there are multiple variations, each corresponding to a different drive strength. For a two-input NAND gate, you have nand2x1, nand2x2, nand2x4, etc. As the number after the x increases, so, too, does the physical size of the gate, the power consumed by the gate when it switches, and, all else equal, the speed of switching the gate.

This has always been the case - in the old days it was done by drawing transistors with bigger width-to-length ratio. Now it is done by using more and/or bigger fins.

It has always been the case that a core will use a mixture of all these different strengths, depending on the needs of a given logic path.
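The drive-strength tradeoff described above can be sketched with a toy RC model. All numbers here are invented for illustration, not taken from any real cell library: the idea is just that doubling a cell's drive (more/bigger fins, or wider transistors in the old days) roughly halves its output resistance, so delay into a fixed load drops, while the cell's own capacitance and switching energy roughly double.

```python
# Toy model of standard-cell drive strengths (nand2x1/x2/x4 style).
# Assumed: delay ~ R_drive * C_load, with R_drive inversely
# proportional to drive strength; energy ~ C_cell * Vdd^2, with
# C_cell proportional to drive strength. Units are arbitrary.

def cell_delay(drive, c_load, r_unit=1.0):
    """Gate delay: output resistance (r_unit / drive) times the load it drives."""
    return (r_unit / drive) * c_load

def switching_energy(drive, c_unit=1.0, vdd=1.0):
    """Energy per transition grows with the cell's own capacitance."""
    return (c_unit * drive) * vdd ** 2

c_load = 8.0  # some fixed downstream load
for drive in (1, 2, 4):  # nand2x1, nand2x2, nand2x4
    print(f"x{drive}: delay={cell_delay(drive, c_load):.2f}, "
          f"energy={switching_energy(drive):.2f}")
```

This is why a synthesized core mixes strengths along its paths: upsizing buys speed on critical paths at a power cost, and downsizing saves power where there is timing slack.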

When I started at Exponential Technology, they were taping out their first chip, and they had run their design through a tool they developed called “Hoover.” This tool went path by path and reduced the size of gates that it thought could be shrunk, in order to cut power consumption while still meeting timing constraints. I think the samples arrived shortly after I joined. Instead of running at 533 MHz, as designed, the chip ran at 420 MHz.

One of my first tasks was to figure out why, and I recall us printing out a giant schematic and me running Roth’s D-algorithm on it in pencil with a colleague, trying to find a set of test inputs that would reveal what went wrong.

Turns out Hoover didn’t account for cross-coupling: with so many weakly driving gates, wires were being heavily jerked around by signal changes on their neighbors, which cost 20 percent of the chip’s performance. This became the subject of an article I wrote about the chip for the IEEE Journal of Solid-State Circuits, where I derived the maximum ratio between gate sizes that you should allow on two neighboring wires.
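A rough sketch of the coupling effect described here, using a simple Elmore-style delay with an assumed Miller factor k on the coupling capacitance (k near 0 when the neighbor switches the same direction, near 2 when it switches the opposite direction). All values are invented; the point is only that a weakly driven wire next to an opposing aggressor sees its effective load, and thus its delay, blow up.

```python
# Toy model: delay of a "victim" wire = R_driver * (C_ground + k * C_couple),
# where R_driver shrinks as drive strength grows and k is the Miller
# factor from a switching neighbor. Numbers are arbitrary illustration.

def victim_delay(drive, c_ground, c_couple, k, r_unit=1.0):
    """Elmore-style delay with coupling folded in via the Miller factor k."""
    return (r_unit / drive) * (c_ground + k * c_couple)

c_ground, c_couple = 4.0, 4.0

# Strongly driven wire, neighbor switching the same direction (k ~ 1 average):
fast = victim_delay(drive=4, c_ground=c_ground, c_couple=c_couple, k=1)

# Driver downsized by a power-reduction tool, neighbor switching the
# opposite direction (k ~ 2): the same wire is now several times slower.
slow = victim_delay(drive=1, c_ground=c_ground, c_couple=c_couple, k=2)

print(fast, slow)
```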

Anyway, not really relevant, but a fun story about “gate sizing.”

The real point is that it is already the case and has always been the case that the effective size of transistors in a core always differs from place to place, and doing it by fin count or fin size doesn’t really make too much difference.
 