M3 core counts and performance

I’m sure that rumor is false. I’d also bet there are no schematics for the chips. (There may be schematics for specific circuits, but the chip interconnectivity is probably in netlist form, not schematics).
Just for clarity: when you say you’re pretty sure the rumor is false, are you referring only to the schematics being sold on the dark web, or also to the claim that the M3 Max lacks an interconnect?
 
Sorry - I meant the schematics being sold.

I think it’s quite possible that the M3 Max lacks the interconnect - I think I noted here earlier that Apple seems to be doing things differently this time around.
 
If we're getting 2x Ultra out of this move I'll be quite pleased. I hope the Ultra and 2x are exclusively for the Studio and MP respectively. I think it'd be weird to continue with the Ultra in the MP. I do wonder if they'll continue with 2x the core config of the Max with this move or if it'll be something like 1.5x CPU and 1.5x GPU.

Aye, could be anything. I mean, the usual caveats apply (we don’t know anything yet, etc.) … but I agree with you, and if Apple does do that it could turn the M3 Mac Pro into a really compelling product, especially for 3D graphics professionals, with the ray tracing and all that VRAM plus of course the CPU power. They could also then buff the PCIe lanes for more internal communication. Depending on how it’s priced, it could be exceptional value compared to competing PC workstations. If they could bring such a thing in under a $10k base price … I’m getting ahead of myself and I don’t want to set myself up for disappointment, but I think Apple has an opportunity in this space to do something dramatic.
 
Not sure how I feel about the Ultra possibly not being 2x Max. On the one hand, it means it could be even better; on the other hand, it could be worse! While the Pro is a good chip, I wouldn’t like the Ultra to have its CPU or GPU core count reduced compared to the M2 Ultra.
 
It depends, of course. But an M3 Ultra built as, say, 1.5x an M3 Max would have the same CPU core count as the M2 Ultra (with more P-cores!) and a lower GPU core count, but no interconnect and maybe better performance characteristics to make up for it. Depending on the price, that could still be fine, or even just as good a generational value. But who knows? If they have done this, then the characteristics of the M3 Ultra could be a different proportion altogether. The GPU will be the primary driver of how big the die would be, and thus how expensive a monolithic chip would be … and how expensive two glued together would be.
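
For anyone who wants to check the 1.5x arithmetic, here is a rough sketch. The M3 Max and M2 Ultra core counts are Apple's published top configurations (they are not stated in this thread), and keeping the E-core count at 4 is purely an assumption on my part, as is the `scaled_max` helper itself.

```python
# Top-configuration core counts from Apple's published specs (assumed, not from this thread).
M3_MAX = {"cpu": 16, "e_cores": 4, "gpu": 40}
M2_ULTRA = {"cpu": 24, "e_cores": 8, "gpu": 76}


def scaled_max(factor, e_cores=4):
    """Hypothetical chip: an M3 Max scaled by `factor`, E-cores held at `e_cores` (assumption)."""
    return {
        "cpu": round(M3_MAX["cpu"] * factor),
        "e_cores": e_cores,
        "gpu": round(M3_MAX["gpu"] * factor),
    }


def p_cores(chip):
    """P-cores are whatever CPU cores are not E-cores."""
    return chip["cpu"] - chip["e_cores"]


guess = scaled_max(1.5)

# Difference between the hypothetical 1.5x chip and the M2 Ultra.
delta = {
    "cpu": guess["cpu"] - M2_ULTRA["cpu"],
    "p_cores": p_cores(guess) - p_cores(M2_ULTRA),
    "gpu": guess["gpu"] - M2_ULTRA["gpu"],
}

print("1.5x M3 Max guess:", guess, "P-cores:", p_cores(guess))
print("vs M2 Ultra:", delta)
```

Under those assumptions the CPU total matches the M2 Ultra, the P-core count goes up, and the GPU count goes down, which is the trade-off described above.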
 
Sorry - I meant the schematics being sold.
What is the likelihood of them all being in one place? The layout for a P core might be in room 431, the GPU core in 433, other cores and subunits in adjacent locations, and the SoC layout, using subunit dummy masks in room 353, so that obtaining the exhaustive schematic would be very difficult. Is that how it probably works?
 
It’s all digital, stored on a server someplace. After tapeout the whole thing gets archived. Copies of things live on individual engineers’ machines as well. Stuff only gets printed out to work on it and then it’s tossed. We used to use CVS to version everything. And, again, we had very few schematics - only for things like RAMs and I/O drivers and clock circuits and such. The rest was in netlist format, RTL, etc.
 
Don’t help @Yoused plan his heist! 🙃

What is the likelihood of them all being in one place? The layout for a P core might be in room 431, the GPU core in 433, other cores and subunits in adjacent locations, and the SoC layout, using subunit dummy masks in room 353, so that obtaining the exhaustive schematic would be very difficult. Is that how it probably works?

Innocently asking what room(s) the plans might be in indeed!

Tim Cook would be proud:


 
If they have done this, then the characteristics of the M3 Ultra could be a different proportion altogether.

Let’s say they do this. If they follow the rough pattern set out by the M3 -> M3 Pro -> M3 Max, then the M3 Ultra might look something like this:

24 CPU cores (20 P-cores)
72-80 GPU cores

since each step up the lineup is roughly 1.5x the CPU cores (keeping 4 E-cores) and, on average, 2x the GPU cores (1.8x and 2.2x).
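
A quick sketch of that extrapolation, for anyone who wants to play with the factors. The M3 and M3 Pro core counts are Apple's published top configurations (not stated in this thread), and `extrapolate_ultra` is just a hypothetical helper. Note that the CPU step from Pro to Max is actually closer to 1.33x, so 1.5x is the rough figure used above rather than a measured pattern.

```python
# (name, CPU cores, GPU cores) for the top configuration of each die.
# Counts are from Apple's published specs, not from this thread.
LINEUP = [("M3", 8, 10), ("M3 Pro", 12, 18), ("M3 Max", 16, 40)]


def step_ratios(lineup, index):
    """Ratio between each chip and the one below it, for column `index`."""
    return [round(b[index] / a[index], 2) for a, b in zip(lineup, lineup[1:])]


cpu_steps = step_ratios(LINEUP, 1)
gpu_steps = step_ratios(LINEUP, 2)


def extrapolate_ultra(cpu_factor=1.5, gpu_factors=(1.8, 2.0), e_cores=4):
    """Hypothetical M3 Ultra: scale the M3 Max by the rough factors from the post."""
    _, max_cpu, max_gpu = LINEUP[-1]
    cpu = round(max_cpu * cpu_factor)
    gpu_low, gpu_high = (round(max_gpu * f) for f in gpu_factors)
    return {"cpu": cpu, "p_cores": cpu - e_cores, "gpu_range": (gpu_low, gpu_high)}


ultra_guess = extrapolate_ultra()

print("CPU step ratios:", cpu_steps)
print("GPU step ratios:", gpu_steps)
print("Extrapolated Ultra:", ultra_guess)
```

Swapping in the 2.2x GPU step instead of the 2x average would push the upper end of the GPU range higher, so treat 72-80 as a guess rather than a bound.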

This is similar to @Yoused’s predictions in @Altaic’s thread that something appears to be different about M3.
Given that Max is no longer an extended Pro, the Ultra (or whatever) might turn out to be something other than a linked pair of Maxes. It might serve them better to go with 24 (20P 4E) cores instead of 32.

However, @Cmaier’s caution that such a die would be very large still applies.
Not sure if they could make an M3 Ultra that matches M2 Ultra without using an interposer - the reticle size may be too small to allow a die big enough.

Even if the reticle size is big enough, that’s an expensive chip.

Edit: we don’t know the exact area in mm^2 because there are no 3rd party die shots yet, but for reference, an M3 Max manufactured on N3 has 92 billion transistors, more than a full Hopper GPU at 80 billion transistors on N4. Hopper has a die size of 814 mm^2. The reticle limit for EUV is roughly 858 mm^2. While N3 came with density improvements for logic, some things like cache don’t scale that well. So an M3 Max is already massive. I’m not saying a monolithic M3 Ultra is impossible, but it would be a very, very big boy … and oh so close to that limit, if it can be done at all.

Edit2: I tracked down an estimate (and it was only an estimate) that an M2 Max on N5P was 550 mm^2. It had 67 billion transistors, so an M3 Max has 37% more transistors. For a monolithic M3 Ultra to fit within the reticle limit, an M3 Max would probably have to be a smaller die than an M2 Max. The theoretical best density improvement for pure logic from N5P to N3B is 60-70%. SRAM and I/O, both of which Apple has a lot of, won’t shrink like that. Oof. Maybe doable, but tough and expensive. Someone estimated that it costs Nvidia $3300+ (not all of that is the die, but still) to make an H100, which they sell for $30,000+. Even if it’s possible to make, I’m not sure a monolithic M3 Ultra is practical, as much as I would like it to happen. I dunno, definitely not ruling it out, but don’t expect price cuts. If these estimates are right, it would cost Apple roughly as much to make a monolithic M3 Ultra SoC as they currently sell a Mac Studio Ultra for …
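
Here is the back-of-envelope math as a rough sketch. It uses only the figures above (550 mm^2 and 67 billion transistors for the M2 Max estimate, 92 billion for the M3 Max, 858 mm^2 for the reticle) and makes two loud assumptions: that "60-70%" means a 1.6x-1.7x density gain, and that a monolithic Ultra would be about two Maxes' worth of area. The function name is made up for illustration.

```python
RETICLE_LIMIT_MM2 = 858        # approximate EUV reticle limit
M2_MAX_AREA_MM2 = 550          # third-party estimate, not a measurement
M2_MAX_TRANSISTORS = 67e9
M3_MAX_TRANSISTORS = 92e9


def m3_max_area(density_gain):
    """Estimated M3 Max area if the WHOLE die gained `density_gain` over the M2 Max.

    Optimistic by construction: SRAM and I/O shrink far less than logic.
    """
    m2_density = M2_MAX_TRANSISTORS / M2_MAX_AREA_MM2
    return M3_MAX_TRANSISTORS / (m2_density * density_gain)


transistor_growth = M3_MAX_TRANSISTORS / M2_MAX_TRANSISTORS - 1

# Assumption: a monolithic Ultra needs roughly twice the area of a Max.
max_die_budget = RETICLE_LIMIT_MM2 / 2

# Whole-die density gain needed for two Maxes' worth of area to fit the reticle.
required_gain = m3_max_area(1.0) / max_die_budget

print(f"Transistor growth M2 Max -> M3 Max: {transistor_growth:.0%}")
print(f"Per-Max area budget for a monolithic Ultra: {max_die_budget:.0f} mm^2")
for gain in (1.6, 1.7):
    area = m3_max_area(gain)
    fits = 2 * area <= RETICLE_LIMIT_MM2
    print(f"gain {gain}x: Max ~{area:.0f} mm^2, doubled ~{2 * area:.0f} mm^2, fits: {fits}")
print(f"Whole-die gain needed to fit: {required_gain:.2f}x")
```

With these inputs, even the best-case logic gain applied to the entire die leaves a doubled Max slightly over the reticle limit, which is why I call it tough. A real monolithic design would not be a literal 2x Max, so this is only a sanity check.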
 
Since I seem to be missing the discourse: what are these rumors of no Ultra Bridge in M3 Max based on? Are high-quality die shots available? Or is this all about the leaked blueprints that nobody has seen?
 
As far as I can tell, no 3rd party die shots are available. It’s not clear why no one has done one. I thought die shots of previous M processors were available by now but maybe I’m misremembering. Hector also hasn’t released anything about the IRQ controller.

@thenewperson linked to a Twitter account claiming that the Max doesn’t have the bridge, but that account shared no details on why it “knows” that. I don’t know that account, nor could @thenewperson vouch for it. So I treat it as just a rumor.
 
M3 Max die shots (although the quality leaves something to be desired, the UltraFusion interconnect appears to be absent):
 
Okay, some actual data! … albeit a little hard to see anything. I can sort of make out the interconnect on the other two and its absence on the third. I’m not very good at “reading” these things anyway. I was not able to find these pictures and no one seems to reference them. Very annoying, so thank you for tracking this down!

However, from these die shots, we can also see that, assuming the sizes haven’t been adjusted, the M3 Max appears to be about as big as the M2 Max in mm^2, if not bigger, depending on the shot.

So the possibilities are as follows:

1) only some M3 Maxes have the interconnect (i.e. the ones destined for Ultras) or

2) despite the huge size of the Max, a monolithic M3 Ultra has somehow been designed to fit within reticle limits and is somehow still economical to produce and sell or

3) something crazy like Apple has a new way to connect the chips that isn’t apparent from die shots
 
I think you're missing

4) M3 Ultra will be built by fusion-ing two copies of a die designed from the ground up to be M3 Ultra and nothing else

Reasoning: we already know that unlike previous generations, you can't crop M3 Max artwork to make M3 Pro. This cost Apple an extra tapeout, but enabled them to tailor the mix of CPU and GPU and IO better for specific market segments. Perhaps Apple also believes that two times what Max has isn't the optimal way to build an Ultra. Previous Ultras arguably have too many copies of some things (I doubt many people need as many video codecs as M1/M2 Ultra provide), and arguably not enough of others (PCIe IO comes to mind). M2 Ultra is an okay fit in the Mac Studio, but clearly not ideal (in my mind anyways) for Mac Pro. This approach would fix that problem.
 
That’s a definite possibility. To be fair, I had thought of this version when writing possibility 1) but omitted it for some reason. Earlier, in another post, I wrote:

Other possibilities? Hmmm … maybe a bifurcation between laptop dies and desktop Max dies? After all, the laptop ones don’t need the interconnect, and maybe there is something about the desktop dies that makes having a difference here worth it from an economic perspective.

I was still thinking of it as a “Max” die, but with other differences to make it a “desktop” Max**. At the time I thought it unlikely, but now …

I wonder what the cost would be? Certainly lower than a monolithic die, of course, but the nice thing about fusing 2 regular Maxes is that you’re already producing them at volume. I would also guess that laptop Maxes outsell Studio Maxes. Still, my first possibility, where the die is identical to the laptop one except with an interconnect, is the cheapest of my original 3 and largely shares that volume issue, though without the additional engineering cost of designing a new die. However, unlike previous generations, Apple just released 3 SoCs in one go, all with unique floor plans. So maybe releasing a unique die (or even two unique dies*) for the Ultra is a cost they can afford now? They must be doing something, and I have to admit this sounds the most likely.

*I’m stuck awake at 4am so maybe this is sleep deprivation but perhaps an interesting wrinkle to this idea is fusing 2 new unique dies since a benefit of not using Maxes would be that the fused dies could be asymmetrical in capabilities. Don’t want to double video codecs or NPU? Put them on one die only! But that would probably raise costs further both design and manufacturing. Still could be worth it?

Edit: ** or if it is a “desktop” Max maybe it will be the version that is used in the desktop Studio? That would help the volume aspect of it - the dies are not just being produced to be fused into extremely low volume Ultras, they are themselves used in a Studio Max. Compared to the laptop you could add things like higher clocks … more I/O … but likely couldn’t reduce anything which was a selling point of your version … still could be interesting? So that even the Max Studio benefits?
 
I mean, I'm sure Apple has done the numbers here. I have no idea what the cost of putting the UltraFusion interconnect on all M3 Maxes might have been, but on the assumption that they sell a lot more Max dies than Ultra chips, it might just make more financial sense to leave it off the regular Maxes and instead push more of the cost onto the smaller subset of users buying Ultras. The Ultras would get more expensive (or lower margin), as they would now be semi-bespoke Maxes, but the regular Maxes would all get cheaper to produce (or higher margin).
 