M4 Rumors (requests for).

@theorist9 - I do miss the days of a more affordable Power Mac/Mac Pro...!

But if Apple ever releases another Cube, I will scrimp & save to plop that beauty on my desk, and then run it for the next decade...!
 
And (as you probably know) even with the advent of single-chip microprocessors, NASA sticks to much older systems. AFAIK, NASA and others are still using the radiation-hardened RAD750 CPU for their most extreme applications (it was included in the Webb), which is made on either a 150 nm or 250 nm process—and costs ~$300k/unit!

They stick with it because they know it works, and its ~300 MIPS is sufficient for their rather modest local processing requirements. The large feature size is a benefit, since it reduces the ability of radiation to disrupt or damage the processor.

300 MIPS is sufficient for today's spacecraft only because they have to design said spacecraft around the RAD750's limitations. They'd love to be able to move on to something faster, and in fact NASA has a program under way to do just that. The RAD750 is getting very old, and it's a real limitation.

To give an example... iirc, the JWST is bottlenecked not by its various observational instruments, but by how fast its radios can send collected data to Earth. If they had more local compute, they could use better compression algorithms to get more out of the available bandwidth.

Another example... the Ingenuity helicopter drone deployed by the recent Perseverance Mars rover had to fly autonomously, meaning they needed profoundly better onboard compute, weight, and power efficiency than a RAD750. There was literally no rad hardened system which could meet the program's requirements, so they put in an off the shelf Qualcomm Snapdragon 801 smartphone SoC, and lived with the risks. Only possible because the drone was viewed as a sort of bonus mission, and added little or no risk to the main rover mission, but it would've been a total nonstarter with RAD750 compute performance.

The reason things move so slowly comes down to money. It's expensive to design and qualify rad-hard CPUs, even when starting from an existing non-hardened design as the RAD750 did. Same applies to systems. Extra development expenses relative to commercial combined with ultra-low sales volume means there's no economical way to move faster.
 
And it's not just the marketing, it's also the pricing that puts it into workstation territory. The starting price of the 2013 MP was $3,000, which was 1.5x the $2,000 starting price of the 2013 15" MBP.

By comparison, the starting price of the ASi MP is $7,000, which is 2.8x the $2,500 starting price of the 16" MBP.
Not trying to pile on your posts, but IMO you've chosen the wrong comparison here. The successor of the 2013 MP isn't the ASi MP, it's the Mac Studio. Pretty much the same kind of machine - workstation class CPU and GPU performance (actually they did better on both fronts this time around), no internal PCIe expansion, very small form factor, very quiet. And the Studio actually manages to come in at 0.8x the starting price of the current 16" MBP, despite offering significantly better workstation specs (Max chip instead of Pro chip, 32GB RAM rather than 16/18GB).

The MP itself is now just a niche machine for those who require high bandwidth internal PCIe expansion. It's very overpriced. Some of that is no doubt due to its low sales volume, some of it's that Apple overbuilt the chassis to such a ridiculous degree. I do wish they'd make a reduced-cost version with less fancy metalwork, there's no reason it can't be $500 to $1000 more than the Ultra Mac Studio. Sadly, however, I suspect that wouldn't help Apple move more units, so they aren't likely to do it.
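
The price ratios traded back and forth above are easy to sanity-check. A minimal sketch in Python, using the starting prices quoted in the posts; the Mac Studio price (about $2,000) is an assumption, since the post only states the 0.8x ratio and not the price itself.

```python
# Starting prices in USD as quoted in the thread. The Mac Studio entry is an
# assumption: the post gives only the 0.8x ratio, not the actual price.
prices = {
    "2013 Mac Pro": 3000,
    '2013 15" MBP': 2000,
    "ASi Mac Pro": 7000,
    '16" MBP': 2500,
    "Mac Studio (assumed)": 2000,
}


def ratio(desktop: str, laptop: str) -> float:
    """Return the desktop's starting price as a multiple of the laptop's."""
    return round(prices[desktop] / prices[laptop], 1)


print(ratio("2013 Mac Pro", '2013 15" MBP'))     # 1.5
print(ratio("ASi Mac Pro", '16" MBP'))           # 2.8
print(ratio("Mac Studio (assumed)", '16" MBP'))  # 0.8
```

On these numbers the Studio, not the ASi Mac Pro, is the machine that sits near the 2013 Mac Pro's position relative to the laptop line.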
 
If they do a Mac Pro Extreme, then the form factor could be worth it depending on the price structure. If. Otherwise yeah there’s little point to the form factor and if they want to keep a device with any internal expansion at all then you and @B01L are right they might as well move to a smaller, cheaper body. If the development costs of such a body even make sense.

But @B01L I doubt that they could run both a “Cube” and the current Pro, especially not with the Studio there too. Without an explosion of interest in Mac workstations, it just doesn’t seem like there’s a big enough market for three high end desktop Mac computers. Even the rumored “iMac Pro” looks a far ways away.

My hope is that they’ll keep the Mac Pro and find a way to generate interest in it from some angle - rendering, training, etc … with high VRAM as a major selling point for a much lower price than you’d get on the market today (say what you will about Apple’s base/upgrade price for RAM, price per VRAM is pretty damn cheap). It’s not clear if it would work but that to me is Apple’s clearest hope of success in that market. For some of these use cases, it’s more prosumer than big iron, but that’s okay because that’s what Apple wants to sell and I think there’s a possible market there if Apple hits the right spot. But I’m sure they have much smarter and more knowledgeable people than me looking into that.
 
There is definitely room for cost cutting with a less "engineered" chassis design...

My thoughts for a Mac Pro Cube would be for the Mn Extreme SoC only, for those macOS users who want the pinnacle of ASi horsepower but do not have a need for PCIe slots; Mn Extreme Mac Pro Cube, the ultimate personal workstation...! ;^p
 
Nope, you've misread my post. My point is precisely that the 2019 MP and ASi MP are *not* successors to the 2013 MP (as you said, its ASi successor is more properly the Mac Studio). Instead, the 2019 MP was heavily marketed by Apple (and priced by Apple) as a full-fledged workstation, and thus comparisons to PC workstations (and their capabilities) are entirely appropriate. Further, Apple marketed the ASi MP as a direct successor to the 2019 MP (and likewise gave it workstation pricing), so it's entirely reasonable--based on the way Apple itself has positioned these machines in the market--to compare its performance and capabilities to those of PC workstations.
 
Oh absolutely that’d be a great machine - don’t get me wrong, I see what you are going for. I just don’t think they could justify having the Studio, your Cube design, and the Pro in the same lineup without, as I said, a big explosion in interest in the high end Macs. The prevailing wisdom is they just don’t sell enough. That said, I wouldn’t holler if they went with your Cube design over the current Pro; it makes more sense right now, but they’d have to spend R&D on that. So the safer approach would be to still use the Pro chassis and see if there’s a market for an Extreme chip before investing in changing the chassis.

I don’t think they need to get to 1.5 TB of RAM right away to carve out a really nice niche for themselves but I agree that they do need more than they have now and a more powerful processor if they want to make the Mac Pro or even @B01L ’s Cube design a viable product. The current model though just doesn’t justify itself even against its own siblings never mind the PC competition. We’re in violent agreement on that.

The rumored Hidra processor with 128/512GB of min/max RAM/VRAM, depending on price and capabilities of course, could be the ticket and if it isn’t? Well … then I doubt adding a terabyte of max RAM would actually help them and Apple would be better off just cutting losses and retreating completely. If it is a success then sure continue bumping up the RAM every generation and keep going! Maybe I’m wrong and they absolutely need that 1.5 TB of RAM right now but I don’t think so. I don’t think that’s the thing to prioritize.

Case in point: consider TinyCorp whose CEO George Hotz 🤮 is currently able to bully AMD into further open sourcing AMD’s software stack and getting Dr Su to respond personally on Twitter. Why? Because they are selling a box of 6 Radeon consumer GPUs for machine learning: total RAM? 96GB VRAM/128GB RAM with 768 FP16 TFLOPs and a low-end EPYC CPU for $15,000. Imagine what Apple could do in this space. 3D Rendering too. I remember when the M1 first came out and was tested, a 3D renderer remarked how great an Apple machine with hundreds of gigabytes of VRAM and ray tracing would be, especially combined with their CPUs which were of course impressive.

None of this needs 1.5TB of RAM, they need ray tracing, FP4/8/16 TFLOPs, and well hundreds of gigabytes of RAM/VRAM (oh and the requisite software stack - looks askance at Apple’s fits and starts in AI 😬). I agree that more RAM is better and if this product direction is a success they should continue pumping it more, but if they are making a Mac Pro exclusive chip and it has ray tracing and training capabilities and it has hundreds of gigabytes of high bandwidth memory (hbm not HBM 🙃) then that’s a compelling device for those currently hot markets even without the additional terabyte of RAM.
 
IMHO, the 2013 Mac Pro, the 2019 Mac Pro, and the Apple Silicon Mac Pro are radically different machines and concepts to the point that it makes comparisons a bit pointless. Something I've not seen mentioned enough, though, is the massive 6-year gap between the 2013 and 2019 Mac Pros. I don't think one can make the argument of whether or not the 2013 Mac Pro was a workstation based solely on the specs and comparing it to the 2019 Mac Pro given that they were released 6 years apart. Had Apple kept releasing regular updates to the trashcan Mac Pro, maybe the actual difference in terms of specs would have been much smaller by the time the 2019 Mac Pro was released, and even while keeping the trashcan form factor I can imagine it slowly developing into a "workstation"-class computer like the 2019 Mac Pro.

More important than the "aspirational" category of each of the Mac Pros is the audience that ended up buying them. I suspect that didn't change much throughout the years. The 2019 Mac Pro with 1.5TB of RAM might be in a different class to the trashcan Mac Pro, but I would be surprised if the bulk of the buyers weren't essentially the same people. So while theoretically they were different kinds of machines, I wonder if they ended up being bought by the same kind of people, and, as such, whether having access to 1.5TB of RAM or not has any practical impact.
 
And (as you probably know) even with the advent of single-chip microprocessors, NASA sticks to much older systems. AFAIK, NASA and others are still using the radiation-hardened RAD750 CPU for their most extreme applications (it was included in the Webb), which is made on either a 150 nm or 250 nm process—and costs ~$300k/unit!
Good old G3 still serving purposes
 
"He may just be guessing, but I wouldn’t be surprised if he’s correct and Gurman’s wrong."

I wouldn't put Gruber and Gurman in the same category.

In my mind, they are "analysis" vs "news"...
 
That’s true. I guess my point is that Gurman bases his info on leaks, whereas Gruber tends to (or used to) get his information from Apple directly. It’s similar to Jim Dalrymple. He used to be a semi-official mouthpiece for Apple.
 
I do recall the story about the helicopter, but I thought that was a one-off, and that the RAD750 generally did the job (and that most of the computational tasks were offloaded to ground-based systems). But it does make sense that, if they want probes to act more autonomously, then more local processing will be needed.

Are you aware of any missions that plan to use the RAD5500 (the successor to the RAD750)? Or are they bypassing that and working on something more modern?

Do you have any references (e.g. a PowerPoint from a NASA presentation) showing JWST's ability to collect data was limited by the RAD750? That would be interesting to read. I would have thought if that were an issue, they would have equipped the JWST with multiple RAD750's, tiled the images, and used separate RAD750's to losslessly compress each tile.
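
The tiling idea in the last paragraph can be sketched in a few lines of Python. This is only an illustration of the technique (split a frame into tiles, losslessly compress each tile independently, reassemble and verify), not a description of how JWST actually handles data. The synthetic frame, the tile size, the use of zlib as the codec, and the worker threads standing in for separate processors are all assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_rows(image: bytes, width: int, tile_rows: int) -> list[bytes]:
    """Cut a row-major 8-bit image into horizontal strips of tile_rows rows."""
    stride = width * tile_rows
    return [image[i:i + stride] for i in range(0, len(image), stride)]


def compress_tiles(tiles: list[bytes], workers: int = 4) -> list[bytes]:
    """Compress each tile independently; each worker stands in for one CPU."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, tiles))


def decompress_tiles(blobs: list[bytes]) -> bytes:
    """Decompress the tiles in order and stitch the frame back together."""
    return b"".join(zlib.decompress(blob) for blob in blobs)


# Synthetic 256x256 gradient frame (hypothetical data, not real imagery).
width, height = 256, 256
image = bytes((x + y) % 256 for y in range(height) for x in range(width))

tiles = split_rows(image, width, tile_rows=64)
blobs = compress_tiles(tiles)
restored = decompress_tiles(blobs)

assert restored == image  # lossless: every byte survives the round trip
print(f"{len(tiles)} tiles, {len(image)} bytes raw, "
      f"{sum(len(b) for b in blobs)} bytes compressed")
```

Because each tile is compressed on its own, the work splits cleanly across processors, at the cost of the codec not being able to exploit redundancy that spans tile boundaries.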
 
I'm afraid I don't have references to hand - I'm relying on memory of things I've read before. And that memory could be unreliable.

re: RAD5500, I did find references but none made it clear what its status is. The program I was referring to is JPL's High Performance Spaceflight Computing (HPSC), and is a much bigger step forward than the RAD5500, which sounds like a minor increment on RAD750.

Apparently the original HPSC contractor (Boeing) was going to design a chip (or chip family) based on Arm Cortex-A53, but for some reason that has been cancelled. In 2022 JPL selected Microchip as the new contractor and their HPSC SoC is based on RISC-V cores designed by SiFive. Microchip claims to also be designing it to be useful in some terrestrial industrial computing applications, which is sensible if they can pull it off.
 
the RAD5500, which sounds like a minor increment on RAD750

The RAD5500 appears to be a 64-bit G3. Bear in mind that the baseline 64-bit PPC ISA has something like 3 added instructions but, from a programmer's perspective, is very nearly identical to 32-bit PPC. The PPC 64-bit memory paging scheme is a mess, but, on a space probe, it might be practical to just forego paging and use real memory for everything, saving a lot of work for the processor. Proper memory system design might even make using real memory reasonably safe.

Despite the fact that it is a G3, which was not designed to support multi-processor/multi-core systems, there is at least one RAD5500 that has four cores. Of course, being a G3-type design, it does not have Altivec.
 
Apparently N2 volume production will be ready in early 2025, which is good. It should be on time for next year’s iPhone and potentially Macs depending on Apple’s release schedule for them.


So N2P loses backside power delivery to A16.

 
2) CPU. An ability to separate desktops from laptops. Does the Studio really need E cores? It would be preferable to have an all P-core desktop chip. Perhaps also the ability to scale frequency higher.

I'd say that desktops still have smaller background threads going on and you can fit more smaller e-cores on a die to handle that crap than p-cores.

Never mind power efficiency - see what intel are doing with silk purses out of cows ears, etc.... they've managed to crank the core counts way up for both marketing and performance by leveraging the e-cores. In a modern machine there's so much non-urgent background crap going on that the e-cores can just take care of and leave the p-cores totally free to do the real work.

Not that the intel e-cores are particularly slow. They're still Skylake sort of performance if I'm not mistaken.
 
Last I looked, N3P is supposed to begin production in a few weeks. I am thinking N3E will be good enough for the A18, but the Mac version of M4 will be on N3P. This is why there were no device announcements at WWDC. The M4 books, and perhaps the Mini, will be announced in November, per the usual cycle, with the Studio and Pro probably showing up next spring. Unless, of course, they are saving those updates for M5 on N2.
 
You think they'd have two different nodes for the same chip? M4 on E and P for iPad and Mac respectively? I'd be more inclined to think, if anything, that only Pro, Max and Ultras would be on N3P - All this assuming the two processes are entirely layout compatible
 
You think they'd have two different nodes for the same chip?
I would think that "M4" is a µarch that could be adapted to node variants. Unless I am very much mistaken, N3P and N3E both have FinFlex, which means to me that the structural differences between the two are fairly minimal. N3P has a small reduction in size, and I suppose some electrical differences, but it is pretty similar. Cliff can tell us if they could just order pizza.
 
If I understand correctly, N3P shrinks some stuff compared to N3E. I don’t have the design rules for either process. It’s possible you can just fab N3E masks on N3P, in which case they might, but there wouldn’t be much advantage to doing so. If, instead, you want to take advantage of N3P’s shrink, then you may be able to do a linear shrink (where you simply take an existing hard IP block and run an equation on each point that scales everything). But typically not everything scales by the same amount. You may have some metal layers scale by different percentages than others. And wire height seldom scales in proportion to lateral shrinkage. So you would likely have to re-do your physical design of existing blocks, or at least run them through static timing analysis and other checks. That could be a lot of effort.

Another alternative is that the initial design of the cores was done keeping in mind N3E and N3P, and was designed using a hybrid set of design rules that works on both. That’s what we did a couple of times at AMD. You don’t get the full benefits of N3P (or N3E), but you can switch from one to the other cheaply and quickly.
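
To make the linear-shrink point concrete, here is a toy sketch in Python. It is not based on any real design rules: the `Rect` type, the layer names, the coordinates, and the scale factors are all hypothetical. It shows that a uniform scale keeps a via sitting on its wire, while scaling layers by different factors pulls them apart, which is why the physical design would have to be redone or at least re-checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle on a named layer (arbitrary units)."""
    layer: str
    x0: float
    y0: float
    x1: float
    y1: float


def linear_shrink(shapes: list[Rect], factor: float) -> list[Rect]:
    """Scale every coordinate by the same factor (a pure linear shrink)."""
    return [Rect(s.layer, s.x0 * factor, s.y0 * factor,
                 s.x1 * factor, s.y1 * factor) for s in shapes]


def per_layer_shrink(shapes: list[Rect], factors: dict[str, float]) -> list[Rect]:
    """Scale each layer by its own factor, as when layers shrink unequally."""
    return [Rect(s.layer, *(c * factors[s.layer]
                            for c in (s.x0, s.y0, s.x1, s.y1))) for s in shapes]


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any area."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


# A wire on "m1" with a via landing on it (made-up geometry).
wire = Rect("m1", 100, 0, 110, 50)
via = Rect("via", 102, 10, 108, 16)

uniform = linear_shrink([wire, via], 0.9)
uneven = per_layer_shrink([wire, via], {"m1": 0.9, "via": 1.0})

print(overlaps(*uniform))  # True: the via still lands on the wire
print(overlaps(*uneven))   # False: the wire moved, the via did not
```

A real flow would catch this kind of breakage with design-rule and connectivity checks rather than a rectangle test, but the geometry problem is the same.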
 