M3 core counts and performance

Given that Apple titled the event "Scary Fast"—which is pretty on-the-nose for them; as you know, their event titles are typically cryptic—I'm guessing we'll be seeing M3s, and perhaps even M3s with boosted clocks.
Or the Apple Car and a partnership with BMW, which explains all of these “M3” leaks 😋
 
Gurman says Apple will only be updating the 14"/16" MBP and the 24" iMac. Unless they're planning to redesign the Mini, 13"/15" Air, and 13" MBP (or discontinue the latter), I'm wondering why Apple wouldn't update these at the same time, since they use the same chips. It seems it would be in their financial interest to do so before the holidays.

Could it be due to chip availability? Yes, even if the case stays the same, there is some internal redesign needed to move the Air/Mini/13" MBP from M2 to M3. But surely Apple's Mac division—which would be a Fortune 100 company if ranked by itself—has the resources to develop and release seven Macs in parallel. That's what makes me wonder if it's a chip limitation, or some other external factor.

Alternately, perhaps they want to wait until N3E is available before updating those other models, to reduce manufacturing costs. But I don't know if that by itself would outweigh the added income from an earlier release.
My guess is chip availability. The 13” Air/Pro and probably the 15” Air are among their highest-volume sellers - Minis too. They probably can’t make enough regular M3 chips for those yet. The Pro and Max chips in MacBook Pros take up more wafer area, but those probably sell in much lower volumes, with correspondingly higher profit margins, so it’s probably okay. The 24” iMac is probably their lowest-volume, highest-margin base-M3 machine. This is all just a guess.
 
Gurman says Apple will only be updating the 14"/16" MBP and the 24" iMac. Unless they're planning to redesign the Mini, 13"/15" Air, and 13" MBP (or discontinue the latter), I'm wondering why Apple wouldn't update these at the same time, since they use the same chips. It seems it would be in their financial interest to do so before the holidays.

Could it be due to chip availability? Yes, even if the case stays the same, there is some internal redesign needed to move the Air/Mini/13" MBP from M2 to M3. But surely Apple's Mac division—which would be a Fortune 100 company if ranked by itself—has the resources to develop and release seven Macs in parallel. That's what makes me wonder if it's a chip limitation, or some other external factor.

Alternately, perhaps they want to wait until N3E is available before updating those other models, to reduce manufacturing costs. But I don't know if that by itself would outweigh the added income from an earlier release.
Whatever the reason, I’m sure it’s about numbers and not a redesign. The M2 Air look is still fresh; it was released with the M2. Sure, the Mini could change, but I doubt it will. It's very rack-friendly for existing installations.
 
From Gurman’s latest article on Bloomberg:
"The M3 chip line is destined to be a considerable leap from the M2, bringing vastly improved speeds as well as better efficiency to improve battery life on notebooks."

Things you love to see! (if it’s true)

 
From Gurman’s latest article on Bloomberg:
"The M3 chip line is destined to be a considerable leap from the M2, bringing vastly improved speeds as well as better efficiency to improve battery life on notebooks."

Things you love to see!

Well, I don’t think anyone expected the M3 to be slower than the M2; Apple isn’t Intel. :-)
 
Gurman wrote the following in his latest article:

"M3 Max: This chip has also been tested in different configurations, including one with 16 CPU cores (12 for performance and four for efficiency) and a whopping 40 graphics cores." [emphasis mine]

If he's right about the core counts, he's misplaced the bolded adjective. The "whopping" should be applied to the 16 CPU cores (which would be a 33% increase over the M2 Max's 12), and not the 40 GPU cores (just 5% more than the M2 Max's 38—any significant increase in GPU power will come from an increase in GPU core performance rather than GPU core number).

That would also represent a qualitative change from the M1/M2 lines, where the Max had/has the same CPU core count as the top-end Pro. [He's predicting the maximum CPU core count for the M3 Pro will be unchanged from the M2 Pro's 12.]

[I'm hoping that a 33% increase in the Max's CPU core count isn't the only thing that accounts for the "scary fast" language—I'd really like to see a significant boost in SC speeds.]

 
[I'm hoping that a 33% increase in the Max's CPU core count isn't the only thing that accounts for the "scary fast" language—I'd really like to see a significant boost in SC speeds.]

Yeah absolutely. When I emphasised “vastly improved speeds” I’m hoping it refers to single core performance improvements. There seems to be this thing lately where (Intel/AMD) fans dismiss single core and obsess over multi core. This couldn’t be more wrong for most users in my opinion. Multi core improvements are great, but nothing makes a noticeable difference like much faster single core performance.
 
Well, I don’t think anyone expected the M3 to be slower than the M2; Apple isn’t Intel. :)
Now now, in fairness to Intel, the 14th Gen isn't actually slower than the 13th gen... It's just the same speed. Just like the 10 and 11 series.


Yeah absolutely. When I emphasised “vastly improved speeds” I’m hoping it refers to single core performance improvements. There seems to be this thing lately where (Intel/AMD) fans dismiss single core and obsess over multi core. This couldn’t be more wrong for most users in my opinion. Multi core improvements are great, but nothing makes a noticeable difference like much faster single core performance.
I mean, in some sense it makes sense in that the M1's single core speed is pretty good and if your main usage is web and Word, how much more do you really benefit from a bit more speed? Sure, it's mainly single-core that matters but how much speed do you need? For the tasks that can make use of many threads, the speed tends to matter more.
 
I mean, in some sense it makes sense in that the M1's single core speed is pretty good and if your main usage is web and Word, how much more do you really benefit from a bit more speed? Sure, it's mainly single-core that matters but how much speed do you need? For the tasks that can make use of many threads, the speed tends to matter more.
Ohh, respectfully disagree. Multi-core is mostly wasted on ordinary users, and a result of the industry’s inability to scale single-core performance.

Increasing single-core performance is the best way to get a noticeable improvement in every application. It requires no adaptation from devs and is immediately applicable. Multi-core scaling is messy and often only applicable to certain tasks.

Now, certain tasks can definitely benefit from multiple cores, server tasks and encoding for example, but given a choice, I’d almost always take a fast single-core CPU.
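A rough way to see why single-core gains apply everywhere while multi-core gains don't is Amdahl's law; the parallel fraction used below is an illustrative assumption, not a measurement of any real workload:

```python
# Amdahl's law: speedup from n cores when only a fraction p of the
# work can run in parallel (p here is a made-up illustrative value).
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# A workload that is 50% parallelisable caps out below 2x no matter
# how many cores you throw at it...
print(round(amdahl_speedup(0.5, 16), 2))   # ~1.88

# ...while a 30% single-core improvement multiplies everything,
# serial portion included.
print(round(1.3 * amdahl_speedup(0.5, 1), 2))   # 1.3
```

The serial fraction is the bottleneck: past a handful of cores, extra cores barely move the needle for such a workload, whereas a single-core bump helps every task uniformly.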
 
I mean, in some sense it makes sense in that the M1's single core speed is pretty good and if your main usage is web and Word, how much more do you really benefit from a bit more speed? Sure, it's mainly single-core that matters but how much speed do you need? For the tasks that can make use of many threads, the speed tends to matter more.
I don't know what programs those who were dismissing SC performance are using, but current applications that would benefit from increased single-core CPU performance include gaming, many scientific programs, and certain office tasks (e.g., converting a PDF to searchable form using Adobe's optical character recognition*). Also, no benchmark directly measures responsiveness, so it's possible an M1 might still not be as responsive as someone would like when doing heavy office use (I can't say myself, but I'll have a chance to test one soon).
[*see https://techboards.net/threads/request-for-adobe-acrobat-pro-benchmarking.3965/ ]

More broadly, while a (say) 20% increase in SC performance probably wouldn't by itself be noticeable to the end user, that doesn't mean companies should dismiss the importance of working hard to achieve that increase each year. Because if you didn't, you'd lose out on a 1.2^10 ≈ 6× performance increase once 10 years had passed. And that would put you in a bad position, because apps and OSes are constantly increasing in capability and (yes) bloat, which means their overhead increases with time. Today's chip performance won't be enough to handle 2034's software well.
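The compounding arithmetic can be checked in a couple of lines (the 20% yearly figure is the hypothetical from the paragraph, not a real roadmap number):

```python
# Compound a hypothetical 20% yearly single-core improvement
# over a decade: 1.2 ** 10 is the cumulative speedup vs. year zero.
yearly_gain = 1.20
years = 10
cumulative = yearly_gain ** years
print(f"{cumulative:.2f}x")  # ~6.19x, i.e. the rough "6x" above
```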
 
Increasing single-core performance is the best way to get a noticeable improvement in every application. It requires no adaptation from devs and is immediately applicable. Multi-core scaling is messy and often only applicable to certain tasks.

Now, certain tasks can definitely benefit from multiple cores, server tasks and encoding for example, but given a choice, I’d almost always take a fast single-core CPU.
Don't get me wrong. If you offered me two CPUs with equal total performance, one able to push all its power through a single thread and the other a 200-core chip, I'd pick the single-core one. But it doesn't really scale that way. There are tasks that don't really parallelise; single-core matters there. But fortunately, most of the tasks that tend to keep you waiting for hours do parallelise, and cutting 20% off those tends to be more impactful than cutting 20% off a task that takes 2 ms. I didn't want to make it sound like single-core doesn't matter; it's still very important. But I personally feel that general-use responsiveness is good enough that an average person doesn't benefit much from increased SC performance (nor MC performance, for that matter) beyond the level of the M1, with current consumer software, for the most part.
I don't know what programs those who were dismissing SC performance are using, but current applications that would benefit from increased single-core CPU performance include gaming, many scientific programs, and certain office tasks (e.g., converting a PDF to searchable form using Adobe's optical character recognition*). Also, no benchmark directly measures responsiveness, so it's possible an M1 might still not be as responsive as someone would like when doing heavy office use (I can't say myself, but I'll have a chance to test one soon).
[*see https://techboards.net/threads/request-for-adobe-acrobat-pro-benchmarking.3965/ ]

More broadly, while a (say) 20% increase in SC performance probably wouldn't by itself be noticeable to the end user, that doesn't mean companies should dismiss the importance of working hard to achieve that increase each year. Because if you didn't, you'd lose out on a 1.2^10 ≈ 6× performance increase once 10 years had passed. And that would put you in a bad position, because apps and OSes are constantly increasing in capability and (yes) bloat, which means their overhead increases with time. Today's chip performance won't be enough to handle 2034's software well.
Indeed. Still hugely important. And regardless, faster SC performance also helps MC performance anyway.
 
But fortunately, most of the tasks that tend to keep you waiting for hours do parallelise.
I'd say that really depends on the kind of work you're doing. The app that currently keeps me waiting for minutes to hours is Mathematica, and that's single-threaded (except for a very limited subset of tasks).

Also—and this question just occurred to me—are the types of apps that are multi-threaded expanding? I don't know the answer, but my speculation is that they're not, and thus that, even as consumer CPU core counts continue to increase, the range of apps that can make use of them is not.

What I mean is that multi-threading appears to remain confined to those apps whose work can be easily broken up into embarrassingly parallel tasks, e.g., video, photography, and multi-track audio—i.e., that multi-threaded apps have not been appearing for tasks that have complex interdependencies and have thus far been single-threaded only, since coding them to take advantage of multi-core CPUs is extremely challenging (and, depending on the interdependencies, might not have a large payoff).

Are there significant counterexamples to this?
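A minimal sketch of the distinction being drawn here: an embarrassingly parallel map over independent items versus a loop where every step depends on the previous result (both workloads are toy stand-ins, not real video or math kernels):

```python
from concurrent.futures import ThreadPoolExecutor

# Embarrassingly parallel: each frame/photo/track is independent,
# so the work maps cleanly onto a pool of workers.
def process_frame(frame: int) -> int:
    return frame * 2  # stand-in for independent per-item work

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process_frame, range(8)))

# Sequentially dependent: each step needs the previous step's output,
# so extra cores cannot help no matter how many are available.
state = 1
for _ in range(8):
    state = state * 3 % 7  # every iteration depends on the last

print(results, state)
```

The first loop's iterations can be handed to any number of workers in any order; the second forms a chain where reordering or splitting changes the answer, which is exactly the kind of interdependency that resists multi-threading.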
 
I'd say that really depends on the kind of work you're doing. The app that currently keeps me waiting for minutes to hours is Mathematica, and that's single-threaded (except for a very limited subset of tasks).

Also—and this question just occurred to me—are the types of apps that are multi-threaded expanding? I don't know the answer, but my speculation is that they're not, and thus that, even as consumer CPU core counts continue to increase, the range of apps that can make use of them is not.

What I mean is that multi-threading appears to remain confined to those apps whose work can be easily broken up into embarrassingly parallel tasks, e.g., video, photography, and multi-track audio—i.e., that multi-threaded apps have not been appearing for tasks that have complex interdependencies and have thus far been single-threaded only, since coding them to take advantage of multi-core CPUs is extremely challenging (and, depending on the interdependencies, might not have a large payoff).

Are there significant counterexamples to this?

I'm not entirely sure, but I can say this: these days I work with newspaper and magazine apps, and our entire app is very multi-threaded, and it's not exactly a pro task. Now granted, our threading is used more for asynchronous work than for parallel work, but we do it via threads, so it does happen in parallel, though often limited by network or disk.

Some kinds of work are inherently not that parallel, so there will certainly be categories of computation that will never be efficiently threaded. There are also algorithms that are significantly more efficient in their sequential version than their parallelised version; even so, on massively multi-core machines and certain dataset sizes, the parallel version is still faster.
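As a sketch of threads used for asynchronous rather than parallel work: the "downloads" below are simulated waits, not real network calls, but they show how threads overlap I/O latency even when no CPU-bound parallelism is involved:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Threads here hide *waiting*, not computation: four 0.1 s "downloads"
# overlap, so the batch finishes in roughly 0.1 s rather than 0.4 s.
def fake_download(i: int) -> int:
    time.sleep(0.1)  # stand-in for a network or disk wait
    return i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    out = list(pool.map(fake_download, range(4)))
elapsed = time.perf_counter() - start
print(out, round(elapsed, 2))
```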
 
Everyone here probably knows this, but I’m writing this down primarily for my own thought process.

The M1 is based on the A14, the M2 is based on the A15, and it seems the M3 is based on the A17.

A14 = 3.0 GHz = 2000 GB6 -> M1 = 3.2 GHz = 2300 GB6
A15 = 3.2 GHz = 2200-2300 GB6 -> M2 = 3.5/3.7 GHz = 2600-2800 GB6
A17 = 3.7 GHz = 2900 GB6 -> M3 = 3.9/4.1 GHz = 3100/3300 GB6 (?)

Obviously the M3 is a guess. Does it seem reasonable?
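One quick sanity check of the extrapolation is to scale the A17's score linearly by the clock ratio (all figures are the post's own estimates, and linear clock scaling is itself an assumption, ignoring any IPC changes):

```python
# Scale the A17's Geekbench 6 estimate by the M3 clock-speed ratio.
# All numbers come from the table above and are guesses, not specs.
a17_clock, a17_gb6 = 3.7, 2900
for m3_clock in (3.9, 4.1):
    est = a17_gb6 * m3_clock / a17_clock
    print(f"M3 @ {m3_clock} GHz -> ~{est:.0f} GB6")
```

This lands around 3050 and 3200, so the 3100/3300 guesses in the table imply a small IPC gain on top of the clock bump, which seems plausible.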
 
Everyone here probably knows this, but I’m writing this down primarily for my own thought process.

The M1 is based on the A14, the M2 is based on the A15, and it seems the M3 is based on the A17.

A14 = 3.0 GHz = 2000 GB6 -> M1 = 3.2 GHz = 2300 GB6
A15 = 3.2 GHz = 2200-2300 GB6 -> M2 = 3.5/3.7 GHz = 2600-2800 GB6
A17 = 3.7 GHz = 2900 GB6 -> M3 = 3.9/4.1 GHz = 3100/3300 GB6 (?)

Obviously the M3 is a guess. Does it seem reasonable?

I think it looks very reasonable. Also consistent with Apple's claims, more or less. We should know soon enough.
 
This is a very interesting breakdown over at AnandTech….

They speculate that the reason we are seeing different TOPS numbers for the A17 Pro and the M3 comes down to precision….

For the A17 Pro, Apple communicated a 35 TOPS number at INT8 precision. AnandTech speculates that the M3's 18 TOPS figure is likely quoting INT16/FP16 precision. The M1 and M2 results were communicated in INT16/FP16.

AnandTech also asks (what I feel will be a very interesting thing to explore once production samples are in the hands of real users): does the M3 also support INT8 and allow trading precision for throughput?!

This was probably the biggest head scratcher for me of the entire keynote!
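If that speculation is right, and NPU throughput roughly doubles when precision halves (an assumption AnandTech raises, not a confirmed spec), the two headline figures line up almost exactly:

```python
# Convert the A17 Pro's INT8 TOPS figure to an FP16-equivalent figure,
# assuming throughput halves at double precision (speculative).
a17_int8_tops = 35
m3_fp16_tops = 18
a17_fp16_equiv = a17_int8_tops / 2
print(a17_fp16_equiv, m3_fp16_tops)  # 17.5 vs 18: roughly the same NPU
```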
 
This is a very interesting breakdown over at AnandTech….

They speculate that the reason we are seeing different TOPS numbers for the A17 Pro and the M3 comes down to precision….

For the A17 Pro, Apple communicated a 35 TOPS number at INT8 precision. AnandTech speculates that the M3's 18 TOPS figure is likely quoting INT16/FP16 precision. The M1 and M2 results were communicated in INT16/FP16.

AnandTech also asks (what I feel will be a very interesting thing to explore once production samples are in the hands of real users): does the M3 also support INT8 and allow trading precision for throughput?!

This was probably the biggest head scratcher for me of the entire keynote!
I think @leman might have said that already? Also, I believe Qualcomm’s NPU is INT4.
 
First Geekbench scores for the base M3
[attached image: Geekbench 6 score screenshot]

Running at 4 GHz. The A17 at 3.7 GHz is close. I thought it might be a bit higher. Perhaps background work going on?
 