WWDC 2023 Thread

  • Thread starter: Cmaier
Ternus seemed to be saying that PCIe cards allow higher-speed network connections than are available with TB4. I'm not familiar with that tech myself, but I did find some PCIe 4.0 x16 cards running at 16 GT/s per lane, roughly 256 Gb/s across 16 lanes, which is far above TB4's 40 Gb/s (ignoring overhead in both cases). Is that a legitimate distinguishing use case for the Mac Pro?

And what about the ability to add large amounts of very fast internal storage using PCIe RAID cards, like this one: https://www.newegg.com/highpoint-ssd7540-pci-express/p/N82E16816115312?Description=nvme raid&cm_re=nvme_raid-_-16-115-312-_-Product
Would this give significantly faster transfer speeds than are possible through the TB4 ports of the Mac Studio?
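
To put rough numbers on both questions, here's a back-of-the-envelope sketch. The per-drive speed and the eight-drive configuration are my own illustrative assumptions, not specs for that particular card:

```swift
import Foundation

// Back-of-the-envelope link rates (raw signaling; real-world throughput is lower).
let tb4Gbps = 40.0                              // Thunderbolt 4 link rate

// PCIe 4.0: 16 GT/s per lane with 128b/130b encoding.
let pcie4LaneGbps = 16.0 * (128.0 / 130.0)      // ~15.75 Gb/s usable per lane
let pcie4x16Gbps  = pcie4LaneGbps * 16.0        // ~252 Gb/s for an x16 slot

// Hypothetical RAID 0 of eight NVMe drives at ~7 GB/s each on an x16 card:
// the aggregate is capped by the x16 host link, not by the drives.
let perDriveGBps = 7.0                          // assumed per-drive sequential read
let raidGBps     = min(8.0 * perDriveGBps, pcie4x16Gbps / 8.0)

print(String(format: "Thunderbolt 4:   %.0f Gb/s (~%.0f GB/s)", tb4Gbps, tb4Gbps / 8.0))
print(String(format: "PCIe 4.0 x16:    %.0f Gb/s (~%.1f GB/s)", pcie4x16Gbps, pcie4x16Gbps / 8.0))
print(String(format: "8-drive RAID 0:  capped at ~%.1f GB/s by the x16 link", raidGBps))
```
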
Yes but once you use it for that purpose you’ve basically used up all your available PCIe bandwidth.

I'm sure there are some use cases. I'm sure some subset of those has people willing to pay an extra $3000. But, overall, seeing this thing gives me a case of Fremdschämen (secondhand embarrassment) for Apple, which is a great German word we should all be using.

Looked it up and indeed.

In any event, stay tuned for M3 Extreme, which is where this box will make a lot more sense.

Yup, I hope so!
 
I'm sure there are some use cases. I'm sure some subset of those has people willing to pay an extra $3000. But, overall, seeing this thing gives me a case of Fremdschämen (secondhand embarrassment) for Apple, which is a great German word we should all be using.

In any event, stay tuned for M3 Extreme, which is where this box will make a lot more sense.
Certainly that would, particularly if Apple also switches to LPDDR5x and leverages that to offer much higher max RAM. [Neil Parfitt, a Canadian music producer, said he's got 768 GB RAM in his 2019 MP, and his resting template, which includes many orchestral audio samples, is 300 GB, so the AS MP wouldn't work for him.]

But I am curious about the added development costs of the M3 Extreme, which would be for the MP only. If you wouldn't mind repeating yourself (I think you've discussed this before), what kind of development costs is Apple looking at in connecting two Max's? I don't think the MP needs to make money--a killer M3 Extreme would be a halo product, and Apple would probably benefit even with minor losses (think of it as part of their advertising budget). However, I doubt Apple wants it to be a big money sink either.
 
  • M3 Extreme SoC
  • 3nm process
  • A17 cores
  • Hardware ray-tracing
  • LPDDR5X RAM
All this should make for a much more performant ASi Mac Pro...?

But for those who do not have a need for PCIe slots, maybe we see a M3 Extreme Mac Pro Cube...? ;^p
 
Certainly that would, particularly if Apple also switches to LPDDR5x and leverages that to offer much higher max RAM. [Neil Parfitt, a Canadian music producer, said he's got 768 GB RAM in his 2019 MP, and his resting template, which includes many orchestral audio samples, is 300 GB, so the AS MP wouldn't work for him.]

But I am curious about the added development costs of the M3 Extreme, which would be for the MP only. If you wouldn't mind repeating yourself (I think you've discussed this before), what kind of development costs is Apple looking at in connecting two Max's? I don't think the MP needs to make money--a killer M3 Extreme would be a halo product, and Apple would probably benefit even with minor losses (think of it as part of their advertising budget). However, I doubt Apple wants it to be a big money sink either.
I’m curious why people think it would be a money sink. I’ve seen this stated a few times now as though it were accepted fact. But every other chip maker has a similar “premium” chip that they then charge premium prices for. If anything, those tend to be the biggest money makers on a per-unit basis, never mind the halo effect. Should Apple expand the use of its chiplet design, it would actually be better positioned than most in this regard, as they wouldn’t even be creating dies specifically for the Extreme.
 
I learned years ago to stop researching when I start buying or I'd end up with too much buyers remorse. I know nothing about any ongoing wars. I put together a good machine. It handles whatever I've thrown its way so far. I built it to be as future proof as possible while still staying in my budget. I'm quite happy with my purchase. I don't see why specs would make me want to give anything up. If people want to fight about various spec or hardware or whatever, so be it. I don't need to be involved.
Back when I built my hotrod gaming machines, as soon as I'd finish one build, I'd immediately start planning the next. In fact, I think I spent more time researching what I'd be cobbling together in the future than actually playing games in the present, which defeats the entire objective. Sure, part of that was a hobby, but it wasn't a fruitful one.

Constantly chasing the next best thing is a myopic hamster mentality: constantly running on the wheel, never reaching the other side of the cage. Eventually, you fall off from exhaustion. There's always something better around the corner; buy what you need for the job and don't look back. Worrying about the future is self-destructive to the present.

One good thing about my personal decision to stay on the Apple ranch is the time between product releases. If the M-series is roughly on an 18-month release schedule, then the "next best thing" is further out, and there's only a single supplier to be concerned about, not dozens. I'm still guilty of chasing the other side of the cage, so Apple's schedule puts a limit on that. That's why I typically make a purchase shortly after release; by the time the next M(x) chip is out, I'll be long set in my ways with the previous generation.
 
Certainly that would, particularly if Apple also switches to LPDDR5x and leverages that to offer much higher max RAM. [Neil Parfitt, a Canadian music producer, said he's got 768 GB RAM in his 2019 MP, and his resting template, which includes many orchestral audio samples, is 300 GB, so the AS MP wouldn't work for him.]

But I am curious about the added development costs of the M3 Extreme, which would be for the MP only. If you wouldn't mind repeating yourself (I think you've discussed this before), what kind of development costs is Apple looking at in connecting two Max's? I don't think the MP needs to make money--a killer M3 Extreme would be a halo product, and Apple would probably benefit even with minor losses (think of it as part of their advertising budget). However, I doubt Apple wants it to be a big money sink either.

The development costs wouldn’t be very high, and most of the changes to the SoC would benefit Ultras and Maxes too (things like increasing the off-chip bandwidth). There would also be some small portion of the Max that wouldn’t be used for anything (double the portion there is now) when those off-chip drivers aren’t used. So the cost to manufacture the Max is a little higher than it otherwise would be. Development-wise, since M3 is a new design, it has little impact; you are re-floorplanning anyway, so you just factor the necessary changes in from the start.

The main issue would be yield for the interposer and the fabric, so it’s more of a manufacturing cost (which will be reflected in the price).
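
To make the "manufacturing cost, not development cost" point concrete, here's a minimal sketch using the textbook Poisson yield model. Every area and defect density below is a made-up illustrative assumption, not a TSMC or Apple figure:

```swift
import Foundation

// Simple Poisson yield model: Y = exp(-area * defectDensity).
// All numbers are illustrative assumptions, chosen only to show why a large
// interposer plus multi-die assembly pushes cost into manufacturing.
func poissonYield(areaCm2: Double, defectsPerCm2: Double) -> Double {
    exp(-areaCm2 * defectsPerCm2)
}

let maxDieArea     = 5.0    // ~500 mm², roughly Max-class (assumed)
let interposerArea = 22.0   // ~2200 mm² interposer spanning four dies (assumed)

let dieYield        = poissonYield(areaCm2: maxDieArea, defectsPerCm2: 0.10)
let interposerYield = poissonYield(areaCm2: interposerArea, defectsPerCm2: 0.02)
let bondYieldPerDie = 0.99  // assumed per-die attach/bond success rate

// Even if all four compute dies are tested good before assembly,
// package-level yield is gated by the interposer and the bonding steps.
let packageYield = interposerYield * pow(bondYieldPerDie, 4)

print(String(format: "Max die yield:        %.0f%%", dieYield * 100))        // ~61%
print(String(format: "Interposer yield:     %.0f%%", interposerYield * 100)) // ~64%
print(String(format: "4-die package yield:  %.0f%%", packageYield * 100))    // ~62%
```
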
 
The main issue would be yield for the interposer and the fabric, so it’s more of a manufacturing cost (which will be reflected in the price).
From what I gather, you're talking about another big ass die (using a technical term) taking up a relatively large part of the wafer, correct? Essentially, four Max chips glued together with a next generation UltraFusion? Over at the other place, deconstruct60 keeps going on about how Apple is stubbornly using "chunky chiplets", and insists that they'll eventually have to go the AMD route. (He seems rather miffed about it.) I'm curious about your thoughts on that, both short and long-term.
 
From what I gather, you're talking about another big ass die (using a technical term) taking up a relatively large part of the wafer, correct? Essentially, four Max chips glued together with a next generation UltraFusion? Over at the other place, deconstruct60 keeps going on about how Apple is stubbornly using "chunky chiplets", and insists that they'll eventually have to go the AMD route. (He seems rather miffed about it.) I'm curious about your thoughts on that, both short and long-term.

Moving GPUs onto their own chiplet would have certain advantages, mostly in terms of flexibility of product offerings, but it’s more of a marketing trade off than a technical one. Apple doesn’t have to hit as many product categories as AMD. But if they want to sell a variation with a lot of GPU cores and not have to also have a ton of CPU cores, for example, partitioning the die differently would enable them to do that. Same deal with neural engines, etc.

The up-side of Apple’s way of doing things is that they can be more assured that they have balanced bandwidth vs. compute resources properly. In any event, there isn’t an inherently right answer, and it wouldn’t surprise me if, someday, Apple does have a GPU/neural-only die (perhaps tiled via ultra fusion with regular CPU/GPU die).
 
Moving GPUs onto their own chiplet would have certain advantages, mostly in terms of flexibility of product offerings, but it’s more of a marketing trade off than a technical one. Apple doesn’t have to hit as many product categories as AMD. But if they want to sell a variation with a lot of GPU cores and not have to also have a ton of CPU cores, for example, partitioning the die differently would enable them to do that. Same deal with neural engines, etc.

The up-side of Apple’s way of doing things is that they can be more assured that they have balanced bandwidth vs. compute resources properly. In any event, there isn’t an inherently right answer, and it wouldn’t surprise me if, someday, Apple does have a GPU/neural-only die (perhaps tiled via ultra fusion with regular CPU/GPU die).

The asymmetrical SoC configuration I have talked about previously, combined with GPU-specific SoCs, could also lead to ASi GPGPUs: send compute/render jobs to the ASi GPGPUs while still working at full speed on the asymmetrical M3 Extreme SoC...!
 
The asymmetrical SoC configuration I have talked about previously, combined with GPU-specific SoCs, could also lead to ASi GPGPUs: send compute/render jobs to the ASi GPGPUs while still working at full speed on the asymmetrical M3 Extreme SoC...!
I wish somebody loved me as much as @B01L loves GPGPUs...
 
The development costs wouldn’t be very high, and most of the changes to the SoC would benefit Ultras and Maxes too (things like increasing the off-chip bandwidth). There would also be some small portion of the Max that wouldn’t be used for anything (double the portion there is now) when those off-chip drivers aren’t used. So the cost to manufacture the Max is a little higher than it otherwise would be. Development-wise, since M3 is a new design, it has little impact; you are re-floorplanning anyway, so you just factor the necessary changes in from the start.

The main issue would be yield for the interposer and the fabric, so it’s more of a manufacturing cost (which will be reflected in the price).
What might the topology look like with >2 dies and UltraFusion?

I’m struggling to imagine it because I’m stuck in the “chiplet” mindset 😅
 
I wish somebody loved me as much as @B01L loves GPGPUs...

If we cannot get third-party GPUs in an ASi Mac Pro, nor ASi GPUs in an ASi Mac Pro (as part of the overall GPU core & RAM pool)...

Then it only makes sense that an ASi GPGPU (no display output, pure compute/render target) would be the best solution for more GPU horsepower in the ASi Mac Pro, unless you have a better solution...?
 
What might the topology look like with >2 dies and UltraFusion?

I’m struggling to imagine it because I’m stuck in the “chiplet” mindset 😅

Tough to say. Depends on whether Apple is willing to make two versions of the Max (probably not). If I had to do it, my first thought would be to arrange them in a 2x2 grid. Just like now, you can talk to one neighbor easily using the drivers on the end of the M1 Max that are not duplicated on the other end. So you have two pairs of die, just like two Ultras side-by-side. Then you need some way to talk from one pair to the other. For that I’d probably do something like this, where I’ve moved the two “pro” sections apart a bit and put in a highway to bring the buses out to the sides:

[Attached sketch of the proposed 2x2 die arrangement]


You would have some way to limit the load on the bus to cut off the half die-width of metal that you aren’t using. (Probably this simply involves disabling power to the buffers that drive the wires to the half of the chip you aren’t using). In theory this would allow very big arrays of die. The interposer (“ultrafusion”) becomes more complicated, both because it needs to spread the wires around, and because you may need big buffers on there.

Keep in mind, too, that while I’ve drawn it as spreading the two Pro sections apart, that may not be necessary. If you have enough metal layers and know what you’re doing re: shielding, the highway could just run over the top of everything (and you could do some of the spreading that way).
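
Restating that layout in another form, here's a tiny toy model of the 2x2 arrangement as I read it (my interpretation of the sketch, not an Apple design): dies in the same column form an Ultra-style pair over the existing UltraFusion edge, the two pairs talk over the wider "highway" bus, and diagonal neighbors route through an intermediate die.

```swift
// Toy model of a hypothetical 2x2 grid of Max dies (illustrative only).
struct Die: Hashable { let row: Int, col: Int }

enum Link { case ultraFusion, highway }

// Which kind of link (if any) directly connects two dies in this reading.
func link(_ a: Die, _ b: Die) -> Link? {
    if a.col == b.col && abs(a.row - b.row) == 1 { return .ultraFusion } // within a pair
    if a.row == b.row && abs(a.col - b.col) == 1 { return .highway }     // pair to pair
    return nil  // diagonal neighbors hop through an intermediate die
}

let dies = [Die(row: 0, col: 0), Die(row: 1, col: 0),
            Die(row: 0, col: 1), Die(row: 1, col: 1)]

for a in dies {
    for b in dies where a != b {
        if let l = link(a, b) {
            print("die(\(a.row),\(a.col)) <-> die(\(b.row),\(b.col)) via \(l)")
        }
    }
}
```
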
 
Just for giggles, I decided to compare my Intel Mac mini to the M2 Ultra.

Here are the Geekbench 6.1 scores for the M2 Ultra Mac Studio featuring a 24-core CPU, 76-core GPU, and 128GB of RAM.

First, CPU:

[Attached screenshot: M2 Ultra Geekbench 6.1 CPU score]


Metal:

[Attached screenshot: M2 Ultra Geekbench 6.1 Metal score]


Now, the specs for my 2018 Intel Mac mini: 4-core i3-8100B (no Hyper-Threading), 64GB RAM, Blackmagic RX 580 eGPU.

CPU:

[Attached screenshot: 2018 Mac mini Geekbench 6.1 CPU score]


Metal:

[Attached screenshot: 2018 Mac mini Geekbench 6.1 Metal score]


When I purchased it back in 2018, it was supposed to be a stopgap configuration until the switch to Arm was sorted out. I'm now on year five of the two years I expected to own this Mac mini. I don't know exactly what I'm going to upgrade to; likely the M3 generation, though perhaps I could push it to M4. What I do know is that I'll receive a massive boost in performance and features whenever I do finally replace it.
 
More WWDC tidbits
Updates to Core ML for Stable Diffusion. Seems like a good improvement? I'd be interested in more knowledgeable people's input on how significant it is. How does this compare to high-end Nvidia GPUs?

Device                 | --compute-unit | --attention-implementation | End-to-End Latency (s) | Diffusion Speed (iter/s)
iPhone 12 Mini         | CPU_AND_NE     | SPLIT_EINSUM_V2            | 20                     | 1.3
iPhone 12 Pro Max      | CPU_AND_NE     | SPLIT_EINSUM_V2            | 17                     | 1.4
iPhone 13              | CPU_AND_NE     | SPLIT_EINSUM_V2            | 15                     | 1.7
iPhone 13 Pro Max      | CPU_AND_NE     | SPLIT_EINSUM_V2            | 12                     | 1.8
iPhone 14              | CPU_AND_NE     | SPLIT_EINSUM_V2            | 13                     | 1.8
iPhone 14 Pro Max      | CPU_AND_NE     | SPLIT_EINSUM_V2            | 9                      | 2.3
iPad Pro (M1)          | CPU_AND_NE     | SPLIT_EINSUM_V2            | 11                     | 2.1
iPad Pro (M2)          | CPU_AND_NE     | SPLIT_EINSUM_V2            | 8                      | 2.9
Mac Studio (M1 Ultra)  | CPU_AND_GPU    | ORIGINAL                   | 4                      | 6.3
Mac Studio (M2 Ultra)  | CPU_AND_GPU    | ORIGINAL                   | 3                      | 7.6

https://twitter.com/atiorh/status/1669009762959368192
https://twitter.com/atiorh/status/1669009764091838464
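
For anyone wondering what the compute-unit column corresponds to in code: it maps to Core ML's compute-unit setting. Here's a minimal sketch; the model path is a placeholder, and this only loads a compiled model with the settings the table refers to rather than running the full ml-stable-diffusion pipeline:

```swift
import CoreML
import Foundation

let config = MLModelConfiguration()

// iPhone/iPad rows in the table ran CPU_AND_NE (Neural Engine):
config.computeUnits = .cpuAndNeuralEngine
// The Mac Studio rows used CPU_AND_GPU with the ORIGINAL attention implementation:
// config.computeUnits = .cpuAndGPU

// Placeholder path to a compiled Core ML model (e.g. the UNet from a converted
// Stable Diffusion checkpoint); substitute a real path to run this.
let modelURL = URL(fileURLWithPath: "/path/to/Unet.mlmodelc")

do {
    _ = try MLModel(contentsOf: modelURL, configuration: config)
    print("Model loaded.")
} catch {
    print("Could not load model: \(error)")
}
```
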
 
The first reports on fan noise from the M2 Mac Studio are coming in from actual users.

[Attached screenshot: user reports on M2 Mac Studio fan noise]


Essentially, Apple took the complaints about fan noise seriously and adjusted the M2 variant accordingly. Keep in mind that the Mac Studio is marketed toward content producers, including recording studios, so it needs to be quiet. Most folks were probably fine with the M1 Mac Studio, but not all customers were, so Apple revised the cooling solution. As a result, the M2 Mac Studio is now the quietest Mac that can be purchased, aside from the MacBook Air, which has no fans.
 
Moving GPUs onto their own chiplet would have certain advantages, mostly in terms of flexibility of product offerings, but it’s more of a marketing trade off than a technical one. Apple doesn’t have to hit as many product categories as AMD. But if they want to sell a variation with a lot of GPU cores and not have to also have a ton of CPU cores, for example, partitioning the die differently would enable them to do that. Same deal with neural engines, etc.

The up-side of Apple’s way of doing things is that they can be more assured that they have balanced bandwidth vs. compute resources properly. In any event, there isn’t an inherently right answer, and it wouldn’t surprise me if, someday, Apple does have a GPU/neural-only die (perhaps tiled via ultra fusion with regular CPU/GPU die).

Apple has a patent describing arranging two dies in a space-saving 2.5D configuration that is also affordable to manufacture. The way I read this patent is that one die will contain logic and be manufactured on a smaller node, while the other die will contain memory (cache/controllers) and be manufactured on an older node. This approach could help them maximise the capabilities of smaller nodes and increase compute density.

But I don't think we will be seeing dedicated GPU dies from Apple any time soon; it just doesn't seem compatible with their mobile-first vision...
 
But I don't think we will be seeing dedicated GPU dies from Apple any time soon; it just doesn't seem compatible with their mobile-first vision...

Mobile-first includes the Mn Max with an UltraFusion interconnect, which goes unused in all the MacBook Pro laptops & Mac Studio headless desktops it is placed within...

The UltraFusion interconnect only comes into play with the Mn Ultra Mac Studios & Mac Pros...

So four products with the UltraFusion, but only two taking advantage of it...

A GPU-specific die, designed to work in conjunction with a normal Mn Max die, would be used in Mn Ultra Mac Studios, Mac Pros, and a possible ASi GPGPU add-in card; so three possible products using a GPU-specific die...

And if Apple takes this possible ASi GPGPU and slaps it into a TB5 chassis, eGPGPU for everything not on the previous list...!

That would be ALL the laptops, ALL the Mac minis, and ALL the Mac Studios; oh, and the 24" iMac...! ;^p
 
Mobile-first includes the Mn Max with an UltraFusion interconnect, which goes unused in all the MacBook Pro laptops & Mac Studio headless desktops it is placed within...

The UltraFusion interconnect only comes into play with the Mn Ultra Mac Studios & Mac Pros...

So four products with the UltraFusion, but only two taking advantage of it...

A GPU-specific die, designed to work in conjunction with a normal Mn Max die, would be used in Mn Ultra Mac Studios, Mac Pros, and a possible ASi GPGPU add-in card; so three possible products using a GPU-specific die...

And if Apple takes this possible ASi GPGPU and slaps it into a TB5 chassis, eGPGPU for everything not on the previous list...!

That would be ALL the laptops, ALL the Mac minis, and ALL the Mac Studios; oh, and the 24" iMac...! ;^p

I am not sure your math adds up :) Yes, UltraFusion is a cost that has to be paid for every single Max die produced, but it's not like a dedicated GPU die would be any different. You still need to provision some sort of high-bandwidth interface on a smaller die to connect to the GPU die. And the GPU die itself costs extra tape-outs and production resources, which are already precious. The beauty of the Max approach is its reusability: one can adapt to market needs and direct production resources to where the demand is.

I also don't think that a GPU die is a necessity to enable truly exceptional products. Die area and manufacturing capability are probably the biggest problems, and the poor scaling of SRAM with node improvements means that compute becomes more and more expensive. But if one splits the SoC functionality across multiple stacked dies, each manufactured on a properly optimised process, one can maximise process utilisation. Right now, probably less than 60% of the M2 Max die is used for compute. Imagine if one could increase that to 90% by moving the supporting functionality onto a separate die (still on the same SoC). This could mean a dramatic improvement in performance with only a negligible increase in cost. This is what all this stuff is about:

[Attached image]



As to the rest, I do hope we will see compute modules one day, even if that's just for the Mac Pro...
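
The arithmetic behind that 60% -> 90% point is simple enough to spell out. The die area below is my own rough assumption; the fractions are the estimates from the post above:

```swift
// If the die size stays the same but support logic moves to a stacked die,
// the leading-edge silicon devoted to compute grows by 0.90 / 0.60 = 1.5x.
let dieAreaMm2          = 500.0   // assumed Max-class die size, roughly
let computeFractionNow  = 0.60    // estimate from the post
let computeFractionThen = 0.90    // with caches/PHYs on a separate stacked die

let computeNow  = dieAreaMm2 * computeFractionNow    // ~300 mm² of compute
let computeThen = dieAreaMm2 * computeFractionThen   // ~450 mm² of compute

print("Compute area today: \(computeNow) mm²")
print("Compute area split: \(computeThen) mm²")
print("Gain: \(computeThen / computeNow)x")          // 1.5x
```
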
 
Regarding hardware ray tracing, when do we think we would have confirmation? Would the A17 on the iPhone be first to have it, or would we need to wait for M3?
 