Mac Pro - no expandable memory per Gurman

Hence my theory that at some point they may partition things differently, so that the GPU die are separate from the CPU die, etc.

If they can get that working, such that capabilities can be deployed like Lego bricks, that’ll be huge for the whole product lineup. It would allow far finer control over the various tiers - heck, even more tiers of products - almost combining the flexibility of discrete components with the advantages of an SoC.

Of course that’s just a dream as of now; I’m not sure when packaging technology will allow it to happen. UltraFusion and other current/upcoming AMD/Intel processors with advanced packaging are an exciting start, but that’s another level. I’d like to see it by M3, but I may be being optimistic.
 
I am imagining that they might be working out some sort of SoC/Mbd bridge architecture that would allow multiple SoCs to work together, affording each its own UMA while allowing them to reach each other's memory across the bridge. That would support the dGPU advantage (the bridge would isolate each SoC's memory when another SoC is not trying to reach it) with maybe a slightly lower cost of access.

In other words, instead of the M2 Extreme, I envision a Mac Pro with a multi-chip layout - in a 2U chassis with PCIe wing (horizontal) slots, which would make it a very bad idea to try to wedge a normal 4090-type card in there, with its big fans.
 
I am imagining that they might be working out some sort of SoC/Mbd bridge architecture that would allow multiple SoCs to work together, affording each its own UMA while allowing them to reach each other's memory across the bridge. That would support the dGPU advantage (the bridge would isolate each SoC's memory when another SoC is not trying to reach it) with maybe a slightly lower cost of access.

In other words, instead of the M2 Extreme, I envision a Mac Pro with a multi-chip layout - in a 2U chassis with PCIe wing (horizontal) slots, which would make it a very bad idea to try to wedge a normal 4090-type card in there, with its big fans.

That’s another way they could go. Question is whether they want a fixed ratio between cores/gpu cores/memory or they want to be able to be more flexible.

The current architecture where memory is in the package makes your proposal a bit complicated just because of the off-package bandwidth you’d need (if you want to leverage the “apple silicon way” of dealing with shared memory).
 
Now Gurman (who blocked me on Twitter, by the way) is bravely predicting that the new Mac Pro will run a new point version of macOS.

I suppose the point is about potential release timelines, but even then I’d have to agree that this is an example of “a good gig if you can get it”.
 
The current architecture where memory is in the package makes your proposal a bit complicated just because of the off-package bandwidth you’d need

They could just make it a high-speed serial DMA interconnect. There is a closet somewhere in the catacombs of 1♾Loop that has a dusty box full of stuff – the label on the box says "XGrid". They can drag that out and reveal its true shining potential.
 
I’m not sure upset is the correct word. In the grand scheme of things, a computer GPU is very unimportant. Having said that, I greatly prefer macOS and desktops to other platforms, and in recent years I have found Mac desktops to be pretty uninspired - and, what’s worse, I can’t recall a second version of any of them. Trash can - one version. iMac Pro - one version. Studio - one so far, and rumors there won’t be a second. 2019 Mac Pro - maybe a second edition, but nothing yet, over three years later.

How is this acceptable for any pro desktop user?

You say I’m obsessed with tactics and strategy, but I’m merely trying to find a strategy. Most pros would have been happy with a tower and replaceable components like PC manufacturers make. It feels like Apple hates that approach because it makes it easy to compare the cost with PCs. Instead they want custom components as a means of increasing margins. I’m fine with that. If the Mac Pro with their SoC is the future of high-end Macs, I’d be happy. Only they seemingly can’t iterate successfully with this approach. They were unable to complete one full cycle. So we’re left with a situation where they don’t want to make the computer pros prefer - a tower with replaceable components - and they can’t iterate on their preferred approach.
Trash can got one version because it was a terrible mistake from the word go. As they admitted in 2017, they made a huge and bad bet on the future of how people would use GPUs by designing it entirely around two GPUs running at low frequency to save power. Everyone else in the world gravitated towards big power-hungry single GPUs instead, including application software authors, leaving Apple with a system design which made no sense and didn't have a good upgrade path. It had other major flaws too, mostly no internal PCIe expansion (which is not just about GPUs - pro audio and video editors often need lots of interface cards for external devices).

iMac Pro got one version because it was never intended to be a long term product. In 2017, they needed to announce their big course correction, but also knew they couldn't roll out the new tower any time soon. They needed to ship something rather than nothing. As they admitted in the press event, lots of pro users had begun buying high end consumer 27" iMacs, since for many purposes they were becoming better and faster machines than the aging dead-end trashcan. So they made a pro iMac based on workstation silicon. That product did its job by shipping quickly and giving some customers something to buy in the short term, but there was lots of foreshadowing that Apple didn't think it was the best way to go in the long term.

The studio replaced iMac Pro, so presumably that's what Apple actually thinks that product category (a compact, non-expandable desktop workstation) should look like. It isn't even a year old and you're putting it in the grave?

The 2019 Mac Pro has only one version so far for a very different reason. It too is clearly part of Apple's post-2017 vision for the Mac workstation lineup. 2021 is the earliest Apple could have refreshed it, as that's when Intel shipped a newer generation of Xeon W than the 2019 generation Apple put in the 2019 model (Intel doesn't usually refresh Xeon product lines every year).

However, the very last time Apple did a major refresh cycle on any Intel Mac appears to have been the 2020 27" iMac, in August 2020, a few months before the first Apple Silicon Macs launched. Since then, Apple has done nothing with Intel Macs other than deleting models from the product lineup as they get replaced with Apple Silicon equivalents. It's pretty easy to guess why.

So I just don't see it. I think you're overinterpreting events which have much simpler explanations.
 
Trash can got one version because it was a terrible mistake from the word go. As they admitted in 2017, they made a huge and bad bet on the future of how people would use GPUs by designing it entirely around two GPUs running at low frequency to save power. Everyone else in the world gravitated towards big power-hungry single GPUs instead, including application software authors, leaving Apple with a system design which made no sense and didn't have a good upgrade path.
Not only GPUs, CPUs have also gotten more power hungry since 2013. And it happened not only to the 2013 Mac Pro, but also to the post-2015 MacBook Pro line. Apple switched to a battery with less than 100 Wh capacity, and a thinner body that wasn't as good at dissipating power as the previous MacBooks, but the upcoming processors didn't use any less power (quite the opposite).
 
I’ve not really dived into this, but Apple wants the memory fixed to the motherboard for its non-laptops? What’s wrong with the current clipped in memory standard? Upfront, I don’t like the idea.
 
I’ve not really dived into this, but Apple wants the memory fixed to the motherboard for its non-laptops? What’s wrong with the current clipped in memory standard? Upfront, I don’t like the idea.
The M1 Ultra has 32 LPDDR5 memory channels to achieve the 800GB/s memory bandwidth. To achieve that using DIMMs (one channel per DIMM) you'd need 32 same-capacity DIMMs. For an "M1 Extreme", you'd need 64 of them. Not exactly practical. The last Mac Pro supports up to 8 channels (204GB/s theoretically), which is not enough for a professional SoC with CPU and GPU (GPUs need a lot of bandwidth). DIMMs also have higher latency than in-package memory.
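The figures above can be sanity-checked with some simple arithmetic. A minimal sketch, assuming the M1 Ultra's bus is 1024 bits total at LPDDR5-6400 (modeled here as 32 × 32-bit channels) and the Mac Pro figure corresponds to 8 × 64-bit channels at an assumed 3200 MT/s:

```python
def bandwidth_gbs(channels, channel_width_bits, transfers_per_sec):
    """Peak memory bandwidth in GB/s: channels * bytes per transfer * transfer rate."""
    return channels * (channel_width_bits / 8) * transfers_per_sec / 1e9

# M1 Ultra: 1024-bit total bus at 6400 MT/s (assumption: 32 x 32-bit channels)
print(bandwidth_gbs(32, 32, 6.4e9))   # ~819.2, marketed as "800GB/s"

# The post's Mac Pro figure: 8 DIMM channels, 64-bit each, assumed 3200 MT/s
print(bandwidth_gbs(8, 64, 3.2e9))    # ~204.8, the "204GB/s theoretically"
```

Both results line up with the numbers quoted in the post, so the 4x bandwidth gap between the slotted and in-package designs is consistent.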

I think the latest Xeons (Sapphire Rapids) have in-package HBM memory + external DIMMs, but I found no technical details on how it works, other than Intel claiming that it doesn't need any code changes.
 
The studio replaced iMac Pro, so presumably that's what Apple actually thinks that product category (a compact, non-expandable desktop workstation) should look like. It isn't even a year old and you're putting it in the grave?

In many ways the Studio also replaced the trash can Mac Pro - or at least is the successor to that design. It’s just, appropriately, no longer the top tier of the Mac lineup. Well, I suppose it currently is the top of the Apple Silicon lineup, but hopefully soon it won’t be, and hopefully the rumors of the Extreme’s demise are not true.

And of course we might still get an iMac Pro or at least a higher end iMac at some point. The rumor mill is unclear on this point.
 
While I've been pessimistic on Apple recently, it's only because I care about the Mac as a platform, particularly on the desktop. That being said, I am pleased to say that the Mac is weathering the global recession, unlike its competitors, being the only traditional computer maker to increase shipments last quarter. According to IDC, Apple grew globally by 2.5%, while the entire PC market contracted by 16.5%. Assuming the Mac holds the course, Apple should be over 10% market share this year.

[chart: PC shipment growth by vendor, Q4]


I guess Asus can pat themselves on the back for being the only PC company that didn't decline by double digits.
 
While I've been pessimistic on Apple recently, it's only because I care about the Mac as a platform, particularly on the desktop. That being said, I am pleased to say that the Mac is weathering the global recession, unlike its competitors, being the only traditional computer maker to increase shipments last quarter. According to IDC, Apple grew globally by 2.5%, while the entire PC market contracted by 16.5%. Assuming the Mac holds the course, Apple should be over 10% market share this year.


I guess Asus can pat themselves on the back for being the only PC company that didn't decline by double digits.

Keep in mind that IDC is never even close to right about Apple products, because it has no visibility into sales made by Apple itself. The analysts who generally get pretty close to predicting Apple’s quarterlies are saying that the Mac likely did better than what IDC says.
 
I’ve not really dived into this, but Apple wants the memory fixed to the motherboard for its non-laptops? What’s wrong with the current clipped in memory standard?

The memory is not soldered to the motherboard (like you would get with a Raspberry Pi); it is soldered directly onto the SoC package. This makes the data signal path much shorter (faster and more energy efficient) than a DIMM rack with 30–50mm leads to the CPU. At 3GHz+, the multi-hundred-picosecond travel time for memory data can make a significant difference. With on-package RAM, each cache miss costs a tiny bit less – and a tiny bit, multiplied across thousands of misses, becomes meaningful.
 
The memory is not soldered to the motherboard (like you would get with a Raspberry Pi); it is soldered directly onto the SoC package. This makes the data signal path much shorter (faster and more energy efficient) than a DIMM rack with 30–50mm leads to the CPU. At 3GHz+, the multi-hundred-picosecond travel time for memory data can make a significant difference. With on-package RAM, each cache miss costs a tiny bit less – and a tiny bit, multiplied across thousands of misses, becomes meaningful.
I don't know how to determine memory performance, so this is probably a naive question. But: Suppose we move away from mobile, and consider desktop applications. Does user-replaceable DDR5 have any performance advantages over on-package LPDDR5 that would make up for the increase in signal path length?
 
I don't know how to determine memory performance, so this is probably a naive question. But: Suppose we move away from mobile, and consider desktop applications. Does user-replaceable DDR5 have any performance advantages over on-package LPDDR5 that would make up for the increase in signal path length?
Hard to say, since you can have each with different speeds. I think that the main structural difference (putting aside voltage differences and such) is that LPDDR5 has 2 16-bit channels and DDR5 has 2 32-bit channels? I could be wrong.
 
Hard to say, since you can have each with different speeds. I think that the main structural difference (putting aside voltage differences and such) is that LPDDR5 has 2 16-bit channels and DDR5 has 2 32-bit channels? I could be wrong.
I was wondering about the bandwidth difference myself, but found myself a bit confused after reading the literature. [I was assuming equal frequencies, though the DDR5 JEDEC standard reaches 7600 MT/s, while LPDDR5 goes up to 6400 MT/s.]
 
I was wondering about the bandwidth difference myself, but found myself a bit confused after reading the literature. [I was assuming equal frequencies, though the DDR5 JEDEC standard reaches 7600 MT/s, while LPDDR5 goes up to 6400 MT/s.]
Seems to me that bandwidth would essentially be a wash at the same clock frequency, though you’d need twice as many die for LPDDR5? But those die may not take up any more space than the DDR5 chips.

In any event, signal flight time increases by approximately 6 picoseconds for every millimeter of distance (assuming a dielectric constant around 2.5-4). That adds up fast. Plus you have a higher capacitive load, so you need much beefier drivers or you have a much slower ramp time on the signals, which also increases the delay. Latency can be more important than bandwidth for many types of workloads.
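The flight-time figure above follows from the speed of light slowed by the board's dielectric. A minimal sketch, assuming a PCB trace where signal velocity is c divided by the square root of the dielectric constant (the 50 mm trace length is a hypothetical DIMM-distance, not a measured value):

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm per picosecond

def delay_ps_per_mm(dielectric_constant):
    # Signal velocity in the trace is c / sqrt(er); delay per mm is the reciprocal.
    return math.sqrt(dielectric_constant) / C_MM_PER_PS

for er in (2.5, 4.0):
    print(f"er={er}: {delay_ps_per_mm(er):.2f} ps/mm")  # ~5.3 and ~6.7 ps/mm

# A hypothetical 50 mm DIMM trace at er=4 adds ~334 ps each way --
# on the order of a full clock period at 3 GHz (~333 ps).
print(50 * delay_ps_per_mm(4.0))
```

This reproduces the "approximately 6 ps per mm" figure for the stated dielectric range, and shows why tens of millimeters of extra trace length are not free at multi-GHz clocks.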
 
I’ve not really dived into this, but Apple wants the memory fixed to the motherboard for its non-laptops? What’s wrong with the current clipped in memory standard? Upfront, I don’t like the idea.

Slotted RAM is not fast enough. The memory solution Apple uses is similar to GPUs or specialised supercomputers (like the existing Fugaku or upcoming Nvidia Grace-based designs). Those don't have replaceable RAM either. To achieve the same level of performance as the M1 Max, Apple would need to offer 8 memory channels, all of them filled. There are some server-grade CPUs with such a setup (e.g. AMD EPYC), but the mainboard alone costs over $1000 and uses the E-ATX form factor. And something like the Ultra would require 16 slots. It's just not feasible for a Mac Pro-like setup, especially if you want high performance.
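The 8-channel figure checks out arithmetically. A quick sketch, assuming DDR5-6400 DIMMs (64-bit channels) against the M1 Max's roughly 400 GB/s:

```python
def dimm_channel_gbs(width_bits=64, transfers_per_sec=6.4e9):
    """Peak bandwidth of a single DIMM channel in GB/s (assumed DDR5-6400)."""
    return width_bits / 8 * transfers_per_sec / 1e9

per_channel = dimm_channel_gbs()       # 51.2 GB/s per channel
print(400 / per_channel)               # ~7.8 channels needed to match ~400 GB/s
```

So matching an M1 Max really does take all 8 channels populated, and a hypothetical Ultra-class part doubles that again.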

Slotted RAM in Apple Silicon designs would make sense as part of a tiered memory solution, as an additional large pool of memory living above the fast system RAM (where the fast system RAM works like a cache). But that's not easy to do either and it doesn't seem like Apple has the technology for this at this time.
 