No “Extreme” chip coming to Mac Pro?

Btw, couldn't this be done in software instead? Use in-package memory as the "traditional" RAM, then use the DIMM slots as a first-level, super-fast storage tier for virtual memory (instead of going down to SSD storage when in-package RAM is not enough). An additional software layer would be needed to handle page faults into the DIMM slots (when both the in-package and DIMM RAM are full) and fall back to disk from there, but I guess that's cheaper to implement than designing hardware to turn the in-package RAM into a cache.
And you'd be sort-of combining the capacities of both RAMs too.
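To make the fallback order concrete, here's a minimal sketch of that software layer's placement policy, assuming a simple "fill the fastest tier first" rule (all names and numbers are made up for illustration; this isn't any real macOS/XNU interface):

```swift
// Hypothetical two-tier pager: in-package RAM first, DIMM RAM as a faster
// "swap" tier, SSD as the last resort. Purely illustrative, not a real API.
enum Tier { case inPackage, dimm, disk }

struct TieredPager {
    var inPackageFree: Int    // free on-package pages remaining
    var dimmFree: Int         // free DIMM pages remaining

    // Place a faulting page in the fastest tier that still has room.
    mutating func placeFaultingPage() -> Tier {
        if inPackageFree > 0 { inPackageFree -= 1; return .inPackage }
        if dimmFree > 0 { dimmFree -= 1; return .dimm }
        return .disk          // both RAM tiers full: page out to SSD as today
    }
}

var pager = TieredPager(inPackageFree: 2, dimmFree: 2)
for page in 0..<6 {
    print("page \(page) -> \(pager.placeFaultingPage())")
}
// page 0 -> inPackage, page 1 -> inPackage, page 2 -> dimm,
// page 3 -> dimm, page 4 -> disk, page 5 -> disk
```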

Interesting. So basically use RAM as swap memory?

I suppose this could be done but the performance overhead would be substantial... a hardware tiered RAM solution would have much much lower latency.
 
Interesting. So basically use RAM as swap memory?

I suppose this could be done but the performance overhead would be substantial... a hardware tiered RAM solution would have much much lower latency.
Yeah, that's basically what I was thinking. There are other implications I can think of besides latency: every time a page fault was encountered for in-package RAM, it'd be moving entire virtual memory pages in and out (16 KiB on Apple Silicon, I think) instead of single cache lines (128 bytes?), which could have bandwidth implications, particularly for access patterns with bad locality where, worst case, it could be fetching a whole page for a single variable.

So obviously worse performance-wise. Not sure by how much though, and it ought to be cheaper to implement than in-package RAM as a cache + DIMMs as true system RAM. But maybe bad enough to not be worth the price of the DIMM modules.
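Some rough back-of-the-envelope numbers for that worst case, treating the 16 KiB page and 128-byte cache line figures above as assumptions:

```swift
// Worst-case transfer amplification when a miss moves a whole VM page
// instead of a single cache line. Sizes are the figures assumed above.
let pageSize = 16 * 1024     // bytes per VM page on Apple Silicon (assumed)
let cacheLineSize = 128      // bytes per cache line (assumed)
let variableSize = 8         // e.g. one Double the program actually wanted

print("cache-line fill: \(cacheLineSize / variableSize)x the useful data")  // 16x
print("page fault:      \(pageSize / variableSize)x the useful data")       // 2048x
print("page vs line:    \(pageSize / cacheLineSize)x more traffic")         // 128x
```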
 
The handles!


A 4-die M2 Ultra would already have up to 256GB of on-package RAM. Not only would they be creating a sort-of cache for the package memory for Mac Pro users only (which is already a small customer base), it would only benefit Mac Pro users who need more than 256GB of RAM installed. Which I imagine is an even smaller customer base.

Btw, couldn't this be done in software instead? Use in-package memory as the "traditional" RAM, then use the DIMM slots as a first-level, super-fast storage tier for virtual memory (instead of going down to SSD storage when in-package RAM is not enough). An additional software layer would be needed to handle page faults into the DIMM slots (when both the in-package and DIMM RAM are full) and fall back to disk from there, but I guess that's cheaper to implement than designing hardware to turn the in-package RAM into a cache.
And you'd be sort-of combining the capacities of both RAMs too.


Sure, they could use the DIMMs as sort of a RAM Disk for VM, but the software overhead would make it even slower.
 
A 4-die M2 Ultra would already have up to 256GB of on-package RAM. Not only would they be creating a sort-of cache for the package memory for Mac Pro users only (which is already a small customer base), it would only benefit Mac Pro users who need more than 256GB of RAM installed. Which I imagine is an even smaller customer base.
Even more—based on the M2's 12 GB RAM modules, a 2 x M2 Ultra should max out at 384 GB RAM on-package.

But is that a lot of RAM for a machine with 32 performance cores? If you're running, say, 32 simultaneous ST programs, that's just 12 GB/P-core. Not sure what the RAM requirements are for MT programs.
 
Even more—based on the M2's 12 GB RAM modules, a 2 x M2 Ultra should max out at 384 GB RAM on-package.

But is that a lot of RAM for a machine with 32 performance cores? If you're running, say, 32 simultaneous ST programs, that's just 12 GB/P-core. Not sure what the RAM requirements are for MT programs.

It is not just the CPU cores using that 384GB of RAM, the GPU cores also use the same pool of RAM, what with the whole Unified Memory Architecture dealio...
 
It is not just the CPU cores using that 384GB of RAM, the GPU cores also use the same pool of RAM, what with the whole Unified Memory Architecture dealio...
There's some degree of duplicated data with GPUs that have dedicated RAM, though. It's not unusual to end up with a copy of a certain resource (like a buffer) in both system RAM and GPU VRAM. There's also some degree of memory savings from Apple GPUs being able to use memoryless textures (intermediate textures that are generated in the middle of render passes, like the G-Buffer, can live temporarily in tile memory rather than in system RAM). This can be noticeable on mobile devices where memory is scarce, although on a system with 384GB it's probably a drop in the ocean.
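For reference, marking an intermediate render target as memoryless in Metal looks roughly like this (a minimal sketch; the pixel format and dimensions are arbitrary placeholders):

```swift
import Metal

// Sketch: a G-Buffer-style attachment that lives in tile memory for the
// duration of the render pass instead of occupying system RAM.
guard let device = MTLCreateSystemDefaultDevice() else { fatalError("No Metal device") }

let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba16Float,
                                                    width: 1920,
                                                    height: 1080,
                                                    mipmapped: false)
desc.usage = .renderTarget        // memoryless textures can only be render targets
desc.storageMode = .memoryless    // backed by tile memory, no system RAM allocation

let gBufferAlbedo = device.makeTexture(descriptor: desc)
// Attach it to the render pass with .dontCare load/store actions so its
// contents are never loaded from or written back to memory.
```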
 
There is a pretty interesting discussion of the Extreme on this week's ATP podcast. As here, the hosts have no idea what Apple's plan is for the Mac Pro, but there is some sadness that one of the hosts' 2019 Mac Pro (with mid-range GPUs) has better compute perf than the latest Ultra. Some discussion about whether Apple might allow AMD GPUs again in the Mac Pro (which I feel is very unlikely). I hadn't realised that the latest AMD 7900 GPU is around 3x the Ultra's performance. I have no idea how Apple would bridge that gap, even with the mythical Extreme, which is sad for me personally.
 
There is a pretty interesting discussion of the Extreme on this week's ATP podcast. As here, the hosts have no idea what Apple's plan is for the Mac Pro, but there is some sadness that one of the hosts' 2019 Mac Pro (with mid-range GPUs) has better compute perf than the latest Ultra. Some discussion about whether Apple might allow AMD GPUs again in the Mac Pro (which I feel is very unlikely). I hadn't realised that the latest AMD 7900 GPU is around 3x the Ultra's performance. I have no idea how Apple would bridge that gap, even with the mythical Extreme, which is sad for me personally.

None of these things are surprising. The M1 GPUs (and by extension the M2) are still pretty much mobile GPUs, just scaled to ridiculous proportions. They are very good when you consider their perf per watt (the Ultra does 20 TFLOPS with 80 watts, which is quite insane), but the current generation is limited when absolute performance is considered. Apple needs to make things bigger, faster and more capable overall if they want to compete in the desktop space. The Ultra has 8K "compute units", which is quite respectable, but they are operated at just 1.26GHz (compare that to 2+GHz on AMD).

P.S. Do take all these numbers with a grain of salt. The main reason why RDNA3 offers such a massive performance boost is because AMD has doubled the number of compute ALUs and allowed two operations to be executed at once. However, the devil is in the details. There is a new instruction format that packs two operations into one instruction (VLIW-style), but there are some limitations regarding register and data use, so hardware utilisation efficiency is less than optimal. So the actual improvement you get in compute workflows is much less than 100%, unlike what the TFLOPS values might suggest.
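For what it's worth, the 20 TFLOPS figure follows from the usual peak-throughput arithmetic, using the ALU count and clock mentioned above (and counting an FMA as two FLOPs):

```swift
import Foundation

// Peak FP32 throughput = ALU count × 2 FLOPs per FMA × clock speed.
// Figures are the approximations from the post above.
let ultraALUs = 8192.0        // ~8K FP32 lanes on the M1 Ultra
let ultraClockHz = 1.26e9     // ~1.26 GHz
let flopsPerFMA = 2.0         // a fused multiply-add counts as two FLOPs

let peakTFLOPS = ultraALUs * flopsPerFMA * ultraClockHz / 1e12
print(String(format: "Peak: %.1f TFLOPS", peakTFLOPS))   // ≈ 20.6 TFLOPS
```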
 
Regarding Apple's GPU compute performance, do they need to match AMD or nVidia's TFLOPS numbers to achieve the same performance?

For rasterisation, as Apple's GPU uses TBDR, it should drastically cut down on the amount of computation required?
 
Regarding Apple's GPU compute performance, do they need to match AMD or nVidia's TFLOPS numbers to achieve the same performance?

TFLOPS is tricky, as it refers to peak performance under very constrained scenarios, but it's generally a good proxy of what you can expect. Recent AMD GPUs made this more complicated by reintroducing elements of VLIW into the architecture; no idea if Nvidia operates similarly or whether they have a more sophisticated instruction scheduler. Things like dual-issue and other circumstantial optimisations do make these judgements more difficult, as peak TFLOPS is inflated.

But generally, yes, a GPU with peak 20TFLOPS throughput won't be able to match a 40TFLOPS GPU averaged over various compute workloads. Of course, there are many other factors like register file size, memory access patterns and of course the nature of the program itself.

One of the patents I have linked earlier appears to describe a SMT-like execution model for the GPU, where two programs are executed simultaneously and the computational resources are shared to increase hardware utilisation without a context switch. We will probably see this in the next generation of Apple GPUs, could probably boost performance by 10% or so.

For rasterisation, as Apple's GPU uses TBDR, it should drastically cut down on the amount of computation required?

Generally, yes, but again it depends... But this is something we can see in rasterisation benchmarks, where Apple tends to perform fairly well. It's just a shame that games don't use tile shaders...
 
Something seems to have gone wrong with the packaging technology, if true.
It's worth noting that, even now, Apple continues to have trouble shipping the Studio Ultra. In the US, all of Apple's other Macs are either available now or, at worst, within 3-5 days. The Studio Ultra, at 2–3 weeks, is the one remaining product I found with significant delays (it used to be 2-3 months, and the reduction may be due to decreased demand—at this point I'd guess most who wanted them got them, and those who haven't may be waiting for the M2 Studio to come out in the spring). I wonder if this means they're still having problems with the packaging tech needed to fuse two Max chips, in which case it's not surprising that fusing four would be giving them trouble.

Edit: Correction, one of the four Ultra GPU/RAM configurations (max GPU/min RAM) is only 3-5 days. I missed that the first time I looked. So much for my theory...
 
It's worth noting that, even now, Apple continues to have trouble shipping the Studio Ultra. In the US, all of Apple's other Macs are either available now or, at worst, within 3-5 days. The Studio Ultra, at 2–3 weeks, is the one remaining product I found with significant delays (it used to be 2-3 months, and the reduction may be due to decreased demand—at this point I'd guess most who wanted them got them, and those who haven't may be waiting for the M2 Studio to come out in the spring). I wonder if this means they're still having problems with the packaging tech needed to fuse two Max chips, in which case it's not surprising that fusing four would be giving them trouble.
Do they custom build that configuration? That might be the reason for a 2-3 week delay
 
Do they custom build that configuration? That might be the reason for a 2-3 week delay
It looks like all the Maxes and Ultras are BTOs, since none are listed as "in stock", but all the Maxes are 3-5 days. However, I just noticed one of the four Ultra GPU/RAM configs (max GPU/min RAM) is 3-5 days as well, so there goes my theory....
 
It looks like all the Maxes and Ultras are BTOs, since none are listed as "in stock", but all the Maxes are 3-5 days. But I just noticed one of the four Ultra configs (max GPU, min RAM) is 3-5 days as well, so there goes my theory....

I just checked Ultra stock at B&H. Looks like there are a lot of 1 TB (and a couple of 2TB) storage configurations in stock, and a few 4 and 8TB configurations. I suspect when B&H makes an Apple purchase, the quantities are large.
 
This just occurred to me: Thus far, when introducing an AS replacement for a given Intel Mac, Apple has *always* shown performance comparisons between the former and the top-end configuration of the latter. How is Apple going to handle a comparison of an M2 Ultra AS Mac Pro to a dual-W6900X Intel Mac Pro? The latter should have significantly higher GPU performance.

I suppose they could instead compare the M2 Ultra Mac Pro to an equal-dollars version of the Intel Mac Pro, in which case they could use one with a much lower-end GPU (e.g., a single W6600X).
 
This just occurred to me: Thus far, when introducing an AS replacement for a given Intel Mac, Apple has *always* shown performance comparisons between the former and the top-end configuration of the latter. How is Apple going to handle a comparison of an M2 Ultra AS Mac Pro to a dual-W6900X Intel Mac Pro? The latter should have significantly higher GPU performance.

I suppose they could instead compare the M2 Ultra Mac Pro to an equal-dollars version of the Intel Mac Pro, in which case they could use one with a much lower-end GPU (e.g., a single W6600X).
My guess is that Apple will compare it to "our most popular model" of Mac Pro, as @Andropov said, which I believe is the 5700X. I wish I had the quote, but I think John Ternus said something along those lines. Or, they may simply compare it to the default Intel configuration, the 5500X. There's also the possibility that they'll ignore the old Intel models, since that is clearly the past and they want to concentrate on the future. I'm not too concerned about how Apple spins it, but rather with the actual product, which is likely the case for customers who would actually purchase such a creature. These are the least "lifestyle company" (as Gelsinger derisively put it) Apple aficionados.

This is all assuming that Apple doesn't have third-party GPU support. However, that would be equally awkward, because AMD won't have professional RDNA3 cards out until the second half of this year, which would be a lateral move, if Apple were stuck with the same old MPX modules. Again, that's assuming Apple doesn't delay until well after WWDC, and I think they want to finish the switch as soon as possible.

There are many unanswered questions. I have a feeling it'll be much more straightforward than many expect, but I could certainly be wrong about that.
 
My guess is that Apple will compare it to "our most popular model" of Mac Pro, as @Andropov said, which I believe is the 5700X. I wish I had the quote, but I think John Ternus said something along those lines. Or, they may simply compare it to the default Intel configuration, the 5500X. There's also the possibility that they'll ignore the old Intel models, since that is clearly the past and they want to concentrate on the future. I'm not too concerned about how Apple spins it, but rather with the actual product, which is likely the case for customers who would actually purchase such a creature. These are the least "lifestyle company" (as Gelsinger derisively put it) Apple aficionados.

This is all assuming that Apple doesn't have third-party GPU support. However, that would be equally awkward, because AMD won't have professional RDNA3 cards out until the second half of this year, which would be a lateral move, if Apple were stuck with the same old MPX modules. Again, that's assuming Apple doesn't delay until well after WWDC, and I think they want to finish the switch as soon as possible.

There are many unanswered questions. I have a feeling it'll be much more straightforward than many expect, but I could certainly be wrong about that.
But the 5700X/5500X GPU is popular because that is the cheapest one. 90% of Mac Pro purchasers just buy the baseline GPU and upgrade it later to a much more powerful one, as that is much cheaper than what Apple charges them.


*The 90% figure is my take, but businesses may just pay whatever Apple asks of them. Most users/individual buyers do not buy the high-end GPUs from Apple because buying from Microcenter is cheaper.
 
My guess is that Apple will compare it to "our most popular model" of Mac Pro, as @Andropov said, which I believe is the 5700X. I wish I had the quote, but I think John Ternus said something along those lines. Or, they may simply compare it to the default Intel configuration, the 5500X. There's also the possibility that they'll ignore the old Intel models, since that is clearly the past and they want to concentrate on the future. I'm not too concerned about how Apple spins it, but rather with the actual product, which is likely the case for customers who would actually purchase such a creature. These are the least "lifestyle company" (as Gelsinger derisively put it) Apple aficionados.

This is all assuming that Apple doesn't have third-party GPU support. However, that would be equally awkward, because AMD won't have professional RDNA3 cards out until the second half of this year, which would be a lateral move, if Apple were stuck with the same old MPX modules. Again, that's assuming Apple doesn't delay until well after WWDC, and I think they want to finish the switch as soon as possible.

There are many unanswered questions. I have a feeling it'll be much more straightforward than many expect, but I could certainly be wrong about that.

It would certainly be interesting to get an official Apple statement on what the most popular OEM module is.

But the 5700X/5500X GPU is popular because that is the cheapest one. 90% of Mac Pro purchasers just buy the baseline GPU and upgrade it later to a much more powerful one, as that is much cheaper than what Apple charges them.


*The 90% figure is my take, but businesses may just pay whatever Apple asks of them. Most users/individual buyers do not buy the high-end GPUs from Apple because buying from Microcenter is cheaper.
That could certainly be the case with RAM, particularly for creatives who have small shops and are thus price-sensitive. But for GPU upgrades, doesn't the Intel Mac Pro require them to be in MPX modules and, if so, are they really that much less expensive aftermarket than equipping them at the time of purchase?

Or once you have the MPX module for a lower-end GPU, can you just swap out its GPU for a higher-end non-MPX model (re-using the MPX container)?
 
That could certainly be the case with RAM, particularly for creatives who have small shops and are thus price-sensitive. But for GPU upgrades, doesn't the Intel Mac Pro require them to be in MPX modules and, if so, are they really that much less expensive aftermarket than equipping them at the time of purchase?
You can stick any PC graphics card into a 2019 Mac Pro and it'll work. During my short stint with one, I tried out a 6800XT from Sonnet and a 6900XT from Gigabyte, which worked just fine. No MPX needed. Yes, they are significantly cheaper than Apple's prices. The W6600X was about the same price as the 6900XT at that point in time.

But the 5700X/5500X GPU is popular because that is the cheapest one. 90% of Mac Pro purchasers just buy the baseline GPU and upgrade it later to a much more powerful one, as that is much cheaper than what Apple charges them.
This capability is going away. Third-party graphics cards require UEFI to boot. That's missing on Apple Silicon Macs. So, unless somebody specifically makes an aftermarket version that works with iBoot, then Mac Pro owners would have to stick with whatever Apple offers.

This is another reason that I am highly skeptical that Apple will release third-party graphics drivers for the Apple Silicon Mac Pro. The cards would probably be exclusive to Apple itself, and exactly how many Mac Pro owners need to upgrade their GPU beyond what the M2 Ultra will already offer?
 