M5 Pro and Max unveiled

Nearly all of the reviews of the M5 have been so badly produced that I can't take this conclusion seriously, especially when it contradicts other reviews I've seen.

We need more testing from more people, because of the thing I've pointed out that no one is talking about: it can outperform an M3 with 80 cores and 819 GB/s of bandwidth at full load. Something isn't adding up in their review. They could have simply gotten a bad chip, which is unlikely but possible.

I haven't seen any other reviewers with the 14" M5 Max model - I don't know why they shipped that one to NotebookCheck. Also, some of the gaming data is corroborated by other reviews, so the Max GPU appears to be inconsistent even in the 16" model. It isn't all bad, though: the 14" M5 Max is ridiculously efficient in CP2077 in their data, even compared to the other M5s tested (the very low-power M5s in the Airs are still better, but the 14" Max is pushing that level of efficiency).

Most of the odd 14" results could be explained by a chip that is throttling, or at least being deliberately constrained more than it should be - maybe a "hot" chip or a fan curve that isn't working right. Setting it to High Power mode as they did *should* have alleviated that, but there are other power curves than just the fan. We know the 14" has historically struggled to cool the full Max chip, and we know Apple has occasionally shipped devices with ... imperfect cooling/power curves that had to be corrected after launch (typically iPhones more than Macs). So that's another possibility, beyond a bad/hot chip - which, as you say, is unlikely but possible.

That said, their CBR24 data in the 16" M5 Pro review bothers me more: that MT result doesn't just look wrong, it should not be possible. Have you seen any other reviewers with that model? The other results for it seem sensible enough (including ST), but ... that one ...

Unfortunately, very few outlets measure wall-power efficiency; at most they might look at powermetrics (which is not bad data, but it's neither complete nor useful for my purposes). NotebookCheck isn't the only one that does, but there aren't many, even fewer do it regularly, and none make it so easy to gather lots of data.
 
I understand being confused by the results, but that's my point: if you can't even produce an article or video without typos, errors, and flaws, then I question the whole thing, down to the testing and conclusions. Nearly ALL of the reviews of the M5 have some sort of glaring problem in the writing or the info.

Why should I trust their testing?

We just have to wait for more testing. I still come back to the real-world performance: running a 70B 6-bit transformer model, a diffusion model, and a video render at the same time on both the M3 (32/80/512/819) and the M5 (18/40/128/614), the M5 beats the highest-end M3 from start to finish, 1-3 minutes faster.

Yeah, yeah, AI and all - I've said repeatedly that I don't like "AI" - but there's no arguing with this: run 70B 6-bit transformer models, diffusion models, and 8K/4K video renders against a machine with 4X the memory, 200 GB/s more bandwidth, and 2X the encoders/decoders, and that machine still loses when both are fully maxed out. Neural Accelerators help a lot, but they don't fully explain the difference, especially at complete stress-test maximum capacity.

I hope NotebookCheck will revisit their review and redo it, maybe with a retail unit. But if the conclusions are this strange and outlandish, it's on them to ask Apple for a new unit to confirm the results, or to ask about potential solutions. They didn't, or at least didn't bother to wait before publishing their review.

My opinion.
 
And yes, the highest-tier chip in a smaller body is not going to be as thermally efficient as in a larger design; I don't dispute that. But I do question how accurate all the testing is. So we wait.

But performance-wise, even for the M5 Pro, I suspect that result is wrong for one reason or another. Whether it's incorrect testing, Cinebench being useless as usual, a bad chip, or a software bug that a future update needs to correct, I think it's safe to presume the performance will be better than what they're claiming.

Again, most of the reviews were poorly done. I keep saying it because it matters on multiple levels.

And again, fully maxed-out tests hitting all CPU and GPU cores and saturating bandwidth show the M5 Max beating an M3 while using nearly half the cores (including exactly a quarter of the high-performance cores), 200 GB/s less bandwidth, a quarter of the memory, and half the encoders - on battery. It shouldn't be possible, yet it is.

So the M5 Pro, with the same CPU core counts and only half the GPU cores, should not produce weird results. Lower than the M5 Max, yes, but not strange.
 

Just a quick comment on the wall-power point - wall power on a battery-powered device can be problematic as well. You never know how the power manager is distributing the power draw internally. It's best done with a Studio.
 
Is it possible that macOS still needs a bit more refinement and tuning for the Pro and Max, and that we'll see future updates that eventually mitigate or even eliminate these inconsistencies? In the meantime, as more units flood the market and get tested, more data will become available for folks to analyze.
 
This doesn’t make sense. There is no way the M4 Max is faster than the M5 Max in Blender. Thermal throttling or not.
[attached screenshot: Blender benchmark comparison]

Also 100->83.2 is a 22% increase in performance. Not 17%.
 
On wall power being problematic on battery: in theory, true. In practice, these tests are long enough and use enough energy that you can see the battery drain if you are watching the device - which they do whenever they run the longer stress tests that exceed the power brick. Further, whenever they do use Studios or minis, those tests corroborate the ones done on laptops. Basically, laptop makers, including Apple, usually behave sensibly. That said, there are times when I've seen some questionable results, but not often.

22%? 1 - 83.2/100 = 0.168, i.e. 17%. The other way: 100/83.2 = 1.202, i.e. 20%, which is what they report. They do swap the +/- around to make it clear to the user which is bad/good, but maybe they should just use red/green for that.

[attached screenshot: the review's +17% / -20% readout]


The Blender results are odd - in 3.3 they tie, and they shouldn't do that either. But as far as I can tell from their results, the Max chip is drawing about 30W less than it should on GPU-heavy loads. It's possible that this is on NotebookCheck, but I don't think so.
 
I’m confused. I assume the numbers indicate a percentage increase in performance. Is that not the case?

I was taught it's calculated as (old - new) / new * 100.

You're correct on the numbers, though. Must be human error, but the calculator reported 21.9%. lol.
[attached screenshot: calculator result showing 21.9%]

 
They show it as a percent change relative to the device you are currently highlighting, with the default being the device under review. Lots of review sites do that, so that when you hover over different devices you get the percentage change relative to that device.

So when the M5 Max is highlighted (the default), it's (83.2 - 100)/100 * 100 = -16.8, and when the M4 Max is highlighted, it's (100 - 83.2)/83.2 * 100 = 20.192. But then, on top of the red/green, they reverse the sign to make it clear that "lower is better". Personally I wouldn't do that, but I understand why they do.
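
In code, their display logic appears to boil down to this (a toy Python sketch using the two runtimes from the chart; the function name is just for illustration, not their actual implementation):

```python
def display_delta(runtime, baseline, lower_is_better=True):
    """Percent change of runtime vs. the highlighted baseline, with the
    sign flipped for lower-is-better metrics so '+' always reads as better."""
    change = (runtime - baseline) / baseline * 100  # plain percent change
    return -change if lower_is_better else change

m5_max, m4_max = 100.0, 83.2  # Blender runtimes from the chart

# M5 Max highlighted (default): M4 Max took 16.8% less time, shown as +17
print(round(display_delta(m4_max, baseline=m5_max)))  # 17
# M4 Max highlighted: M5 Max took 20.2% more time, shown as -20
print(round(display_delta(m5_max, baseline=m4_max)))  # -20
```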
 
I'm going to assume it's just me, but I cannot comprehend describing it as +17% under any circumstances. Either the M4 is 20% better or the M5 is 17% worse. 🤷
 
It's because of the "lower is better" - this is where playing to the LCD audience (lowest common denominator, i.e. "idiot-proofing") doesn't so much idiot-proof as confuse people. They always want positive numbers associated with "better" and negative numbers with "worse", even when they are reporting runtimes instead of scores. So both lower runtimes and higher scores get "+", and higher runtimes and lower scores get "-". I see why they do it - people associate positive with good and negative with bad - but I agree that it's more confusing.
 
In that case, if that's what they want, what they really should do is turn the runtimes into rates, which naturally inverts these, and you correctly get +20% faster and -17% slower.

(1/83.2-1/100)/(1/100)*100 = 20.192
(1/100-1/83.2)/(1/83.2)*100 = -16.8

but I suppose reporting 0.012 tests per second sounds odd ...
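
Concretely, converting to rates and then taking plain percent changes (same toy normalized numbers as above):

```python
m5_runtime, m4_runtime = 100.0, 83.2  # same normalized runtimes as above

# Convert runtimes to rates (work per unit time); faster device = bigger number.
m5_rate, m4_rate = 1 / m5_runtime, 1 / m4_runtime

# Plain percent change on rates now carries the intuitive sign, no flipping:
print((m4_rate - m5_rate) / m5_rate * 100)  # +20.19... -> M4 Max ~20% faster
print((m5_rate - m4_rate) / m4_rate * 100)  # -16.8     -> M5 Max ~17% slower
```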
 
Now I’m even more confused! -20% when it should be +20% and +17% when it should be -17%
[attached screenshot: the review's -20% / +17% readout]

? I mean, that's exactly what I showed in my earlier post - the exact same screenshot - and I said in words, multiple times, that they do this. As I said, for benchmarks where lower is better, they invert the sign so positive numbers are always associated with better and negative numbers always with worse. That means they report +17 and -20 instead of -17 and +20.

 
OK. Clearly I didn’t understand it. I’m sure it’s my issue.

I just don’t understand why accurate reporting of percentage differences is confusing. It makes sense to me to say something is +20% better when there is an increase of 20% in performance.
 
Blender 5.0
[attached screenshot: Blender 5.0 results]
CPU test for Blender 4.5
[attached screenshot: Blender 4.5 CPU test results]
One issue to keep in mind with the CPU results: sadly, Open Data for whatever reason doesn't separate out different core counts for the same CPU name (which they actually do for the GPU, so why they don't for the CPU is beyond me). So the M4 Max entry is the median of both the binned and full M4 Max results, and you sadly cannot just take the ratio of M5 Max to M4 Max. :(
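
As a toy illustration of why that pooling muddies the ratio (made-up scores, not real Open Data numbers):

```python
import statistics

# Hypothetical Blender CPU scores: Open Data lumps the binned (14-core) and
# full (16-core) M4 Max under one name, while the M5 Max samples are full chips.
m4_max_binned = [820, 835, 840]
m4_max_full = [950, 960, 970]
m5_max_full = [1150, 1160, 1170]

pooled_m4 = statistics.median(m4_max_binned + m4_max_full)  # 895, mixes both bins
fair_m4 = statistics.median(m4_max_full)                    # 960, full chips only

print(statistics.median(m5_max_full) / pooled_m4)  # ~1.30, overstates the uplift
print(statistics.median(m5_max_full) / fair_m4)    # ~1.21, the fair comparison
```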

On the percentages: I fundamentally agree, but they wanted + to mean better and - to mean worse, and they use runtimes instead of rates. Which gets us here, and I agree it's not ideal.
 
I didn’t mean to sidetrack this thread with my lack of mathematical knowledge!
 
Let me be clear: it's not your lack of mathematical knowledge - quite the opposite, what you want is the correct way to do things - rather, it's your mathematical knowledge that's confusing you! They're fundamentally doing it wrong. I understand why - or at least I have a reasonable hypothesis for why - they do it, but they do it backwards because they assume most people won't understand why the better number is negative. They should either report that the M4 completed it in 17% less time than the M5 (-17%) and the M5 took 20% longer than the M4 (+20%), or that the M4 is 20% more performant, i.e. faster (+20%), and the M5 is 17% less performant, i.e. slower (-17%). Instead they report +17% for the M4 with the M5 as the baseline and -20% for the M5 with the M4 as the baseline.
 
We could take the top score instead of the pooled median. Still a nice uplift.
[attached screenshot: top M5 Max vs. M4 Max scores]
 