M3 core counts and performance

I don't buy the claim that it's comparable to 16GB on a Windows laptop as a blanket statement, but I'm sure there are some cases where this is approximately true. After all, Apple engineers have spent the most time profiling this, and we still got a base config of 8GB for all M3 (non-Pro/Max) computers.

However the opposite can also be true (relevant Twitter thread below):
 
This is absolute bullshit 🙄

I think it's really embarrassing. I mean, if he had argued about better responsiveness with multiple apps, or even more efficient Apple frameworks and apps (still BS, but at least palatable), I'd be OK with it. But this kind of statement is exactly what will be spread all around as an example of Apple's delusion (and rightfully so).
 
The statement from Apple regarding memory is just nonsense, guys.
  1. Memory compression is not unique to macOS. Windows has it. Using this to justify under-speccing memory is asinine.
  2. Unified memory is not magic! It doesn’t mean you need less memory. It’s disappointing that Apple would even suggest that.
“Actually, 8GB on an M3 MacBook Pro is probably analogous to 16GB on other systems” is just rubbish. If your workflow needs 16GB on Windows, it needs 16GB on macOS. There’s no debate here - it annoys me that we’re even having this conversation.

Swap is not a replacement for memory. Fast SSDs are not a replacement for memory.

You might intuitively think that the performance of modern SSDs makes swapping less of an issue, but in general this is not true. The problem is not just throughput but latency - a DRAM access is under ~150ns, while an SSD access is typically over 10,000ns. This matters a lot! The penalty for going to disk is severe. There’s no point having a shiny new M3 if it spends most of its time waiting for memory.

You can get away with 8GB for casual use, but this isn’t a casual machine.
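To put rough numbers on that latency gap: even a small page-fault rate dominates average access time. A back-of-the-envelope sketch in Python, using the ~150ns / ~10,000ns figures quoted above (the fault rates themselves are purely illustrative):

```python
# Average access time when some fraction of memory accesses fault to SSD swap.
# Latency figures are the rough ones from the discussion above.
DRAM_NS = 150      # approximate DRAM access latency
SSD_NS = 10_000    # approximate fast NVMe SSD access latency

def avg_access_ns(fault_rate: float) -> float:
    """Expected access latency when `fault_rate` of accesses go to swap."""
    return (1 - fault_rate) * DRAM_NS + fault_rate * SSD_NS

for rate in (0.0, 0.01, 0.05, 0.20):
    t = avg_access_ns(rate)
    print(f"fault rate {rate:4.0%}: {t:7.1f} ns ({t / DRAM_NS:.1f}x DRAM)")
```

Even a 1% fault rate already costs roughly 1.7x; at 20% the machine spends most of its time waiting on the SSD, which is exactly the point about the shiny M3 sitting idle.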
 
I think it's really embarrassing. I mean, if he had argued about better responsiveness with multiple apps, or even more efficient Apple frameworks and apps (still BS, but at least palatable), I'd be OK with it. But this kind of statement is exactly what will be spread all around as an example of Apple's delusion (and rightfully so).
Agree completely. It pisses me off because there are legitimate benefits to Apple’s approach that should be celebrated; they don’t need to lie.
 
I don't remember where on this forum it is, but back when M1 launched we did test a number of apps that had both Windows and Mac versions. The Mac versions WERE smaller in executable size and DID use less RAM - it worked out to around 25% less on average. So again it depends on your workflow - is the base M3 appropriate, or do you require (or prefer) the M3 Pro? The M3 Pro starts at 18GB of RAM. Sticking 16GB on a base M3 doesn't necessarily make it appropriate for what you do - it might, but the higher memory capacities seem better paired with the more capable SoCs.
 
I don't buy the claim that it's comparable to 16GB on a Windows laptop as a blanket statement, but I'm sure there are some cases where this is approximately true. After all, Apple engineers have spent the most time profiling this, and we still got a base config of 8GB for all M3 (non-Pro/Max) computers.

However the opposite can also be true (relevant Twitter thread below):
The statement from Apple regarding memory is just nonsense, guys.
  1. Memory compression is not unique to macOS. Windows has it. Using this to justify under-speccing memory is asinine.
  2. Unified memory is not magic! It doesn’t mean you need less memory. It’s disappointing that Apple would even suggest that.
“Actually, 8GB on an M3 MacBook Pro is probably analogous to 16GB on other systems” is just rubbish. If your workflow needs 16GB on Windows, it needs 16GB on macOS. There’s no debate here - it annoys me that we’re even having this conversation.

Swap is not a replacement for memory. Fast SSDs are not a replacement for memory.

You might intuitively think that the performance of modern SSDs makes swapping less of an issue, but in general this is not true. The problem is not just throughput but latency - a DRAM access is under ~150ns, while an SSD access is typically over 10,000ns. This matters a lot! The penalty for going to disk is severe. There’s no point having a shiny new M3 if it spends most of its time waiting for memory.

You can get away with 8GB for casual use, but this isn’t a casual machine.
The one time I’d argue Apple can make a case is actually the case @Andropov linked to as a counter example: gaming. But his counter example is mostly applicable to Windows. Windows laptops with integrated graphics tend to hard-split their memory pool between the CPU and GPU - at least they did; I haven’t checked recently, but my understanding is that this is still the case. So effectively your Windows laptop with an iGPU and 16GB of RAM is really only 8+8. While Apple very oddly doesn’t have the unified virtual memory pool that Nvidia uses, they don’t do this: both CPU and GPU have full access to the same RAM, and you don’t need to duplicate memory and transfer it between partitions*. Programmatically you still treat the CPU and GPU separately - again, no unified virtual addressing - but physically you don’t. So Apple has some odd omissions in its programming model that could make this even easier for developers.

But overall that’s the one time - where the GPU and CPU require access to the same memory pool - that unified memory could make a difference, and you can almost make that claim with a straight face. But only almost, as I have to point out the narrowness of this claim: note I said this was only in comparison to PC laptops with integrated graphics, which are generally on the cheap end in more ways than one. As Seb says in @Andropov ’s link, if you have a GPU with its own memory (which is common in the high-end PCs that Apple actually competes against), this doesn’t apply. Again, for the Mini or even the Air you might, might get away with the comparison. But for the $1600 MacBook Pro where the 8GB of base RAM is most egregious? Nah. They don’t get to use dinky integrated-graphics PCs for that comparison. But I bet you that, if there is any reasoning behind their statement, that’s what they are doing - and then acting like it’s a general claim they can make.

Also, just to push gently back on the SSD comments: yes, going to SSD swap is very slow, but depending on your application the swap penalty can be hidden if you have enough other work to do so that the data can be streamed. Again, there are limits to this - more RAM for big problem sets will be better, sometimes dramatically so - and it’s not like this logic doesn’t hold for PCs. And similarly, you’re right that Windows likewise has memory compression. My understanding, similar to @Joelist 's, is that Apple’s memory compression is better and they’re smarter about swap, but no, they aren’t going to turn 8GB into a 16GB equivalent through that alone. It’s also possible some of the tested applications Joelist alludes to were more tightly compiled/programmed on the Mac side than the Windows side.

So I’m not disagreeing with your overall assessment of Apple’s statement. It’s mostly BS. I just want to inject some nuances here though.

EDIT: *I'd have to double-check with @leman that transferring memory between one pool that's seen by the GPU and one that is seen by the CPU (and vice versa) doesn't require at least temporary physical duplication on Apple Silicon. He's much more of an expert on that than I am. Even if it does, it would still be far more flexible than transferring between hard memory partitions, but obviously not as beneficial as not having to do it at all. Also, for gaming at least, some of the newer capabilities like the DirectStorage APIs can be of benefit here for Windows, transferring data directly from disk to GPU memory, but I think that really only helps dGPU setups, not iGPUs. I'd also be curious whether more Windows laptops with only iGPUs have adopted an approach similar to Apple's since the Apple Silicon release. I don't believe there is a technical limitation why they couldn't - I bet the Qualcomm devices do it, and if they don't, I bet the upcoming Elite devices will.
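On hiding the swap/streaming penalty mentioned above: the usual technique is double buffering - a background thread prefetches the next chunk from slow storage while the CPU works on the current one. A toy Python sketch (the "disk" is simulated with a sleep, and all names here are illustrative, not any real API):

```python
import queue
import threading
import time

def stream_process(chunks, work, read_delay=0.001):
    """Overlap 'disk' reads with compute: a background thread prefetches
    the next chunk while the main thread processes the current one."""
    q = queue.Queue(maxsize=2)  # small bounded buffer: classic double buffering

    def reader():
        for chunk in chunks:
            time.sleep(read_delay)  # simulated slow storage access
            q.put(chunk)            # blocks if the buffer is full
        q.put(None)                 # sentinel: no more data

    threading.Thread(target=reader, daemon=True).start()

    results = []
    while (chunk := q.get()) is not None:
        results.append(work(chunk))  # compute overlaps with the next read
    return results

# Example: checksum 8 chunks while the reader streams them in.
data = [bytes([i]) * 1024 for i in range(8)]
sums = stream_process(data, work=sum)
print(sums)
```

As the post says, this only works when there is enough independent compute to overlap with the reads; a workload that needs the whole working set at once gets no such relief.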
 
I’m not too familiar with PassMark’s benchmarks, but I thought I’d mention it here as it was forwarded to me. M3 leading the pack.
[PassMark benchmark chart]
 
I don't remember where on this forum it is, but back when M1 launched we did test a number of apps that had both Windows and Mac versions. The Mac versions WERE smaller in executable size and DID use less RAM - it worked out to around 25% less on average. So again it depends on your workflow - is the base M3 appropriate, or do you require (or prefer) the M3 Pro? The M3 Pro starts at 18GB of RAM. Sticking 16GB on a base M3 doesn't necessarily make it appropriate for what you do - it might, but the higher memory capacities seem better paired with the more capable SoCs.
There was this discussion: https://techboards.net/ams/cpu-design-part-5-caches.8/#ams-comment-27

Adapting code supplied by @mr_roboto, I compared these binaries. In each case the x86 binary was larger, but only by a small margin:
[binary size comparison]
 
I think it's really embarrassing. I mean, if he had argued about better responsiveness with multiple apps, or even more efficient Apple frameworks and apps (still BS, but at least palatable), I'd be OK with it. But this kind of statement is exactly what will be spread all around as an example of Apple's delusion (and rightfully so).
Yeah, sometimes Apple marketers just bizarrely put their feet in their mouths, saying stuff that so obviously falls into the realm of "alternate facts" that they hurt their cause.

The last time I recall them doing this was at the WWDC when they announced the Pro Display XDR. They presented it as outperforming the $43,000 Sony Trimaster HDR Mastering Monitor - when they surely must have known that it wouldn't be able to pass Dolby/Netflix facility certification to be qualified as such (which it wasn't) - setting themselves up for nothing but embarrassment when the colorists got their hands on it. If their goal was to damage their credibility with that community, they succeeded.

Of course, many of the other companies in this sector are as bad or worse.
 
Possibly, but you would need a TB/1394 adapter.
I still have some iPods synced to my M2 Mini.
TB3 to TB2 -> TB2 to FireWire 800 -> FireWire 800 to 400 adapter -> iPod 1st/2nd generation still works!

Wonder how Zune support is doing these days, hmm… 😉
 
I haven't been able to sync my 7th-gen iPod Nano to my 2019 iMac running Monterey, because of a DRM issue with Audible audiobooks on Apple's Books app. I can only plug it into my 2014 MBP, since that runs High Sierra and thus still has iTunes. I went up to the highest tiers with both Audible and Apple support on this. The Audible engineers claim it's Apple's fault, and say Apple has been non-responsive to their bug report. The Apple engineers claim it's Audible's fault. Sigh.

Indeed, I can't even plug my iPod into my iMac for charging, because if I do that it deletes all my Audible audiobooks.

I never thought it was a good idea to have the Books app handle audiobooks, because they are audio files, and thus are probably better handled by the Music app (i.e., I think it should have been based on format rather than content). If they were handled by the Music app, and DRM weren't an issue there, then one could sync both audiobooks and music to an iPod (or whatever audio device you're using) with just a single app, as you can now do with iTunes.

If I want to be able to use my iPod with my iMac, my only solution will probably be to remove the DRM, which I don't want to bother with (esp. if I have to remove the protections one by one instead of doing it as a batch).
 
There was this discussion: https://techboards.net/ams/cpu-design-part-5-caches.8/#ams-comment-27

Adapting code supplied by @mr_roboto, I compared these binaries. In each case the x86 binary was larger, but only by a small margin:
[binary size comparison]

There is a bit of irony there, as Intel is forever touting their compact, flexible variable-length instructions, yet the big fat 32-bit-wide ARM instructions yield smaller executables. This is probably at least in part due to compilers that generate well-optimized ARM code, but still: if each instruction is up to four times as wide (which is almost never the case), how do you end up with denser code? (No need to answer me, I know.)
 
No points for guessing where this thread is currently the top forum topic:

[screenshot of forum topic list]


🤷‍♂️

In fairness, the titles of the other top forum threads are now more reasonable - just run-of-the-mill stuff. I guess the doom and gloom has died down for now, and despite claims to the contrary from said doomers, reactionary posts like the above are the rarity rather than the norm - well, rarer than the doomers at MR anyway. That doesn’t make it any more helpful for starting conversations on a decent footing, mind you.
 
Annotated M3 die shots by Twitter/X user High Yield.

For a sharper image than that displayed here (this site seems to compress images), go to:

Locuza's annotated M1 die shots shown below for comparison (https://twitter.com/Locuza_/status/1450271726827413508/photo/1)

As others have noted, the M3 Max is not a Pro with certain parts doubled (mirror-imaged). Correspondingly, instead of having two Pro NPUs at opposite ends of the chip (as seen on the M1 Max), it has a single larger NPU. Likewise, the Display Engines are clustered together on the I/O end of the chip, instead of being separated.

I'm confused about the number of Display Engines, since in each case it equals the number of external displays the chip supports. The total number of supported displays is one more than that, so did he miss one? Assuming it is one Display Engine per external display, they do take up a decent percentage of the M3's die area, so perhaps Apple's decision not to let that chip drive a total of three displays isn't purely product segmentation.

ADDENDUM: According to @mr_roboto, he did indeed miss the internal display engine on each chip; these are less powerful than the external display engines and thus don't look the same. See https://techboards.net/threads/m3-core-counts-and-performance.4282/page-21

[annotated M3 die shots]

[annotated M1 die shots]
 
Here’s an article I’ve seen on Twitter, retweeted by some people I respect.

Nothing groundbreaking, just comparing the M3 to Intel and Qualcomm, with some speculation.
 
“Actually, 8GB on an M3 MacBook Pro is probably analogous to 16GB on other systems” is just rubbish. If your workflow needs 16GB on Windows, it needs 16GB on macOS. There’s no debate here - it annoys me that we’re even having this conversation.

Swap is not a replacement for memory. Fast SSDs are not a replacement for memory.

There is, however, one interesting empirical observation: Apple Silicon Macs do retain high responsiveness in multi-app scenarios with high memory contention. As long as you are working with one app at a time and the working set of that app fits in memory, you can multitask fluently on a Mac where a Windows machine will start having major issues. I don't think there has been an in-depth investigation into it, so we don't know exactly where the difference comes from (probably a combination of more efficient kernel algorithms, hardware memory compression, and large memory pages). But it is indeed the case that for general multitasking use, an 8GB Mac might serve you better than an 8GB Windows computer. The key point is that the working set size of any single app needs to remain under the RAM threshold.

If the Apple VP had put it like this (our systems are more efficient at using RAM while multitasking, so you can still use multiple demanding apps without stuttering), I'd be fine with the statement. It is empirically defensible. But the way he phrased it was just an embarrassment for the company.

As others have mentioned, Windows also compresses memory. Here's a brief discussion of it:

One important nuance is that Apple Silicon has a dedicated coprocessor for compressing memory pages. So memory compression on AS is faster and does not block the CPU.
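As a rough illustration of why memory compression helps at all (and why it can't conjure RAM out of thin air): zeroed and structured pages compress extremely well, while dense data barely compresses. This sketch uses zlib as a stand-in for the kernel's much faster WKdm-style compressor; the page contents are made up for illustration:

```python
import os
import zlib

PAGE = 16_384  # Apple Silicon uses 16 KB pages

def ratio(page: bytes) -> float:
    """Compressed size as a fraction of the original page size."""
    return len(zlib.compress(page)) / len(page)

zero_page = bytes(PAGE)                                   # untouched allocation
structured = ((b"\x00\x00\x01\x00" * 16 + b"label\x00" * 4) * 200)[:PAGE]
structured = structured.ljust(PAGE, b"\x00")              # repetitive heap-like data
random_page = os.urandom(PAGE)                            # e.g. compressed media

for name, page in (("zeroed", zero_page),
                   ("structured", structured),
                   ("random", random_page)):
    print(f"{name:>10}: compresses to {ratio(page):.0%} of original")
```

The spread between these cases is why compression buys real headroom for typical app memory but does nothing for workloads full of already-dense data - and why it can't honestly turn 8GB into 16GB.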
 