M3 core counts and performance

Blender has been updated to version 4.1. There aren't many results in the Open Data database yet, but so far it's about a 7% performance increase for the M3 Max (40-core): from 3485 to 3743.

 
Very interesting finding....

That puts the 78-watt TDP M3 Max right neck and neck with the 355-watt desktop AMD 7900 XTX. Looking at the 450 W TDP of the 4090 gives some additional power/performance perspective too. Of course, those TDP numbers are solely for the GPU and don't include the TDP of the CPU (14900K), motherboard, RAM, etc., whereas the M3 Max figure covers the CPU, GPU and everything else.

Device | Median score | Benchmarks
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1


Obviously it's still quite a ways off the NVIDIA 4090.

Device | Median score | Benchmarks
NVIDIA GeForce RTX 4090 | 10812.68 | 3
NVIDIA GeForce RTX 4080 | 8051.11 | 1
NVIDIA GeForce RTX 4070 Ti SUPER | 7660.98 | 1
NVIDIA GeForce RTX 4090 Laptop GPU | 5959.41 | 1
NVIDIA GeForce RTX 4070 | 5440.82 | 2
NVIDIA GeForce RTX 3090 | 4832.91 | 7
NVIDIA GeForce RTX 3080 | 4723.36 | 1
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1
NVIDIA GeForce RTX 3070 | 3217.16 | 3

On another forum I caught a lot of flak for saying that this is actually extraordinarily impressive when you consider that NVIDIA is using a heavily optimized OptiX path that typically benchmarks significantly faster than even CUDA.

What I was trying to highlight is that Apple has seen such a massive improvement over M1/M2, no doubt from better code optimizations (there's still a ways to go, judging by the Apple maintainer's comments) but also from the new Dynamic Caching capability, which gives developers a 'free' performance boost.

Especially when you look at the cost of electricity (in Europe, and particularly when I visit family in Ireland), I think it's very understandable why many people, myself included, are genuinely interested in power-efficient workstations!
I saw this in a Tech Testers Europe post on Twitter:

[Attached screenshot: Tech Testers Europe post breaking down gaming electricity costs]


Whilst it's a gaming cost breakdown that simplifies the point (the juice your computer drinks costs money), it really provides food for thought, and it makes me question where NVIDIA, AMD, and Intel can go from here, especially given the need to be more responsible about climate action and climate change. Every little contribution helps.
 
Very interesting finding....

That puts the 78 Watt TDP M3 Max right neck and neck with the 355 Watt desktop AMD 7900 XTX and TDP of 450W for the 4090.
I'm a little confused: the 7900 XTX, yes, but the 4090 is nowhere close, unless I'm misunderstanding something about your post or the chart?

Of course, those TDP numbers are solely for the GPU and don't include the TDP of the CPU (14900K), motherboard, RAM, etc., whereas the M3 Max figure covers the CPU, GPU and everything else.

Device | Median score | Benchmarks
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1


Obviously it's still quite a ways off the NVIDIA 4090.
Is the first sentence saying it's neck and neck with the 4090 a typo, then? Sorry if I'm just confused.

Device | Median score | Benchmarks
NVIDIA GeForce RTX 4090 | 10812.68 | 3
NVIDIA GeForce RTX 4080 | 8051.11 | 1
NVIDIA GeForce RTX 4070 Ti SUPER | 7660.98 | 1
NVIDIA GeForce RTX 4090 Laptop GPU | 5959.41 | 1
NVIDIA GeForce RTX 4070 | 5440.82 | 2
NVIDIA GeForce RTX 3090 | 4832.91 | 7
NVIDIA GeForce RTX 3080 | 4723.36 | 1
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1
NVIDIA GeForce RTX 3070 | 3217.16 | 3

On another forum I caught a lot of flak for saying that this is actually extraordinarily impressive when you consider that NVIDIA is using a heavily optimized OptiX path that typically benchmarks significantly faster than even CUDA.

What I was trying to highlight is that Apple has seen such a massive improvement over M1/M2, no doubt from better code optimizations (there's still a ways to go, judging by the Apple maintainer's comments) but also from the new Dynamic Caching capability, which gives developers a 'free' performance boost.

Especially when you look at the cost of electricity (in Europe, and particularly when I visit family in Ireland), I think it's very understandable why many people, myself included, are genuinely interested in power-efficient workstations!
I saw this in a Tech Testers Europe post on Twitter:

[Attached screenshot: Tech Testers Europe post breaking down gaming electricity costs]

Whilst it's a gaming cost breakdown that simplifies the point (the juice your computer drinks costs money), it really provides food for thought, and it makes me question where NVIDIA, AMD, and Intel can go from here, especially given the need to be more responsible about climate action and climate change. Every little contribution helps.
 
Those numbers suggest that the 4090 scores about 24 points per watt, whilst the M3 scores 51 points per watt. That probably does matter to some people. Others just want the raw throughput of the 4090 and are happy to pay for it at the meter.
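For anyone who wants to play with that ratio, here's a minimal sketch of the points-per-watt arithmetic using the Blender Open Data medians quoted above and the nominal TDP figures mentioned in this thread. The wattages are board/package ratings rather than measured draw, so treat the output as ballpark only; with 78 W assumed for the M3 Max it comes out around 48 pts/W, in the same neighbourhood as the ~51 figure above.

```python
# Rough points-per-watt comparison from the Blender Open Data medians quoted
# above and the nominal TDP figures mentioned in this thread (not measured draw).
scores = {
    "NVIDIA GeForce RTX 4090":       (10812.68, 450),  # (median score, TDP in W)
    "AMD Radeon RX 7900 XTX":        (3974.15, 355),
    "Apple M3 Max (GPU - 40 cores)": (3743.32, 78),    # whole-package figure
}

for device, (score, tdp) in sorted(scores.items(), key=lambda kv: -kv[1][0] / kv[1][1]):
    print(f"{device:31} {score:9.2f} pts / {tdp:3d} W = {score / tdp:5.1f} pts/W")
```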
 

Oh so when MaxTech says it, it’s a big deal (and I was hardly the first even on these forums). 🙃

Let’s say they do this. If they follow the rough pattern set out by the M3 -> M3 Pro -> M3 Max, then the M3 Ultra might look something like this:

24 CPU cores (20 P-cores)
72-80 GPU cores

since the step from one die to the next is roughly 1.5x the CPU cores (maintaining 4 E-cores) and on average 2x the GPU cores (1.8x/2.2x).
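A quick sketch of that extrapolation, applying the same rough multipliers one more step past the M3 Max; the multipliers are just the pattern guessed at above, not anything Apple has confirmed:

```python
# Speculative extrapolation of a monolithic "M3 Ultra" from the M3 Max using the
# rough tier-to-tier pattern described above (not confirmed by Apple).
m3_max = {"p_cores": 12, "e_cores": 4, "gpu_cores": 40}

cpu_total = round((m3_max["p_cores"] + m3_max["e_cores"]) * 1.5)  # ~1.5x total CPU cores
e_cores = 4                                                       # E-core count held flat
p_cores = cpu_total - e_cores
gpu_low, gpu_high = round(m3_max["gpu_cores"] * 1.8), round(m3_max["gpu_cores"] * 2.0)

print(f"Hypothetical M3 Ultra: {cpu_total} CPU cores ({p_cores}P + {e_cores}E), "
      f"{gpu_low}-{gpu_high} GPU cores")
# -> 24 CPU cores (20P + 4E), 72-80 GPU cores
```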

This is similar to @Yoused ’s predictions in @Altaic ’s thread that something appears to be different about M3.


However @Cmaier ’s caution that such a die would be very large still applies.


Even if the reticle size is big enough, that’s an expensive chip.

Edit: we don't know the exact die size because there are no third-party die shots yet, but for reference, an M3 Max manufactured on N3 at 92 billion transistors has more transistors than a full Hopper GPU at 80 billion transistors on N4. Hopper has a die size of 814 mm². The reticle limit for EUV is roughly 858 mm². While N3 came with density improvements for logic, some things like cache don't scale that well. So an M3 Max is already massive. I'm not saying a monolithic M3 Ultra is impossible, but it would be a very, very big boy … and oh so close to that limit if it can be done at all.

Edit 2: I tracked down an estimate (and it was only an estimate) that an M2 Max on N5P was 550 mm². It had 67 billion transistors, so an M3 Max has roughly 37% more transistors. For a monolithic M3 Ultra to fit in the reticle limit, each M3 Max half would probably have to be a smaller die than an M2 Max. The theoretical best case for pure logic moving from N5P to N3B is a shrink to roughly 60-70% of the area, and SRAM and I/O, of which Apple has a lot of both, won't shrink anywhere near that well. Oof. Maybe doable, but tough and expensive. Someone estimated that it costs Nvidia $3,300+ to make an H100 (not all of that is the die, but still), which they sell for $30,000+. Even if it's possible to make, I'm not sure a monolithic M3 Ultra is practical, as much as I would like it to happen. I dunno, I'm definitely not ruling it out, but don't expect price cuts. If these estimates are right, it would cost Apple roughly as much to make a monolithic M3 Ultra SoC as they currently sell a Mac Studio Ultra for …
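As a back-of-the-envelope check on that reticle argument, here's a rough sketch. The 550 mm² figure is the third-party M2 Max estimate above, the transistor growth is 67B to 92B, and the two scaling factors are crude guesses of mine (a logic-only best case vs. a blended case where SRAM/I/O barely shrink), so the output is illustrative only:

```python
# Back-of-the-envelope check: could a monolithic M3 Ultra (~2x an M3 Max) fit under
# the ~858 mm^2 EUV reticle limit? All inputs are rough estimates from the post above.
M2_MAX_AREA_MM2 = 550            # third-party estimate for the M2 Max on N5P
TRANSISTOR_GROWTH = 92e9 / 67e9  # M3 Max vs. M2 Max transistor count (~1.37x)
RETICLE_LIMIT_MM2 = 858

# Two crude N5P -> N3B area-scaling scenarios (guesses, not measured numbers):
scenarios = {
    "optimistic (everything shrinks like logic, ~0.65x)": 0.65,
    "blended (SRAM and I/O barely shrink, ~0.80x)": 0.80,
}

for label, area_scale in scenarios.items():
    m3_max_est = M2_MAX_AREA_MM2 * TRANSISTOR_GROWTH * area_scale
    ultra_est = 2 * m3_max_est
    verdict = "fits" if ultra_est <= RETICLE_LIMIT_MM2 else "exceeds the reticle"
    print(f"{label}: M3 Max ~{m3_max_est:.0f} mm^2, "
          f"monolithic Ultra ~{ultra_est:.0f} mm^2 ({verdict})")
```

Under either assumption the doubled die blows well past the reticle, which is why each Max-sized half would have to come in well under the M2 Max estimate.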

For the reasons outlined here, I'd put a monolithic Ultra with the same scope as the current Ultra at a very low probability, since the die would be massive. It's still possible, of course, but I prefer the idea of either a desktop Max chip or a new, smaller "Ultra" base die if we're going to see changes at all, with the new interconnect style allowing multiple dies (dug up by Maynard Handley) as another outlier possibility that's more likely than the monolithic Ultra. To be fair, he also includes the possibility of a monolithic Ultra that isn't just 2x Maxes.

An interesting tangent to this discussion: I saw a different post claiming that desktop Macs are only about 10% of all Macs sold, with the mini and Studio at around 1%, the tower at 3%, and the iMac at 4%. I thought that was pretty obviously, hilariously wrong, and it would definitely be wrong if Apple is indeed developing unique SoCs for the desktop line. Don't misunderstand me: I agree that desktops are absolutely in the minority, by a lot, but at the very least those relative proportions of desktops make no sense, and if any of the above is true (unique SoCs for desktops), which isn't guaranteed of course, then even the relative proportion of desktops to laptops is likely underestimated. I remember @Cmaier being … unimpressed by market research numbers that merely tried to gauge the overall numbers of Macs, Dells, etc. sold in the past, from IDC and the like. This breakdown seems even harder to estimate.
 
I'm a little confused: the 7900 XTX, yes, but the 4090 is nowhere close, unless I'm misunderstanding something about your post or the chart?


Is the first sentence saying it's neck and neck with the 4090 a typo, then? Sorry if I'm just confused.
Sorry, I typed it up late at night and I was tired. It's a typo: I was not comparing the performance to the 4090, lol! But I edited the original post to correct what I was getting at…
namely, that the Blender performance is as near as dammit to the 7900 XTX, but with a huge power discrepancy in favor of the M3 Max. When you look at the 4090, it also has amazing performance, but at a huge TDP under load!
 
Those numbers suggest that the 4090 scores about 24 points per watt, whilst the M3 scores 51 points per watt. That probably does matter to some people. Others just want the raw throughput of the 4090 and are happy to pay for it at the meter.
Exactly - that was kind of my point. In Europe, running a big 14900K, 4090, motherboard, RAM, etc. under load is like boiling a kettle, and electricity in Ireland at least is NOT cheap!
We pay approx. 38.63 cents per kilowatt-hour, or in US dollar terms approx. $0.42 per kilowatt-hour, on a standard 12-24 month contract. Actual prices across Europe can vary and run significantly higher than this. On top of that we now pay carbon taxes, and those costs will likely only go up.
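To put some very rough numbers on that, here's a minimal sketch of the annual cost gap at that ~$0.42/kWh rate. The wall-power draws and usage hours below are placeholder assumptions, not measurements, so plug in your own workload:

```python
# Rough annual electricity cost comparison at the ~$0.42/kWh rate mentioned above.
# The wall-power figures and usage hours are placeholder assumptions for illustration.
PRICE_PER_KWH = 0.42   # approx. Irish standard-contract rate in USD terms

systems = {
    "14900K + RTX 4090 desktop under load": 700,  # assumed wall draw in W
    "M3 Max laptop under load": 90,               # assumed wall draw in W
}
HOURS_PER_DAY = 4      # assumed hours of heavy load per day
DAYS_PER_YEAR = 250    # assumed working days per year

for name, watts in systems.items():
    kwh_per_year = watts / 1000 * HOURS_PER_DAY * DAYS_PER_YEAR
    print(f"{name}: ~{kwh_per_year:.0f} kWh/year ≈ ${kwh_per_year * PRICE_PER_KWH:,.0f}/year")
```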
I think this represents a huge opportunity for Qualcomm and Apple Silicon devices. (In my other thread, I eventually decided NOT to go with a desktop AMD rig after looking at the numbers and settled on a Razer 14 with an 8000-series Ryzen and a 140 W TDP mobile RTX 4070 for light gaming work. That decision was heavily tempered by my electricity bill and not wanting it to be obnoxiously large!)
 
Exactly - that was kind of my point. In Europe, running a big 14900K, 4090, motherboard, RAM, etc. under load is like boiling a kettle, and electricity in Ireland at least is NOT cheap!
We pay approx. 38.63 cents per kilowatt-hour, or in US dollar terms approx. $0.42 per kilowatt-hour, on a standard 12-24 month contract. Actual prices across Europe can vary and run significantly higher than this. On top of that we now pay carbon taxes, and those costs will likely only go up.
Even if electricity were much cheaper, at some point this trend is going to run into a wall. More than 1 kW is not an insignificant amount of power anymore. I can imagine maximum power concerns (1 kW is about half the maximum power allowed on the most basic home energy contracts here), maximum outlet power concerns (i.e. old houses using lower-gauge wire for outlets, which becomes a fire hazard when heated too much), and even the sheer difficulty of extracting that amount of heat from a room in summer. There are electric heaters that use less than 1 kW of power!
 

So I don't think this defense of 8GB of RAM will have much bearing on Apple's M4 plans. After all, Apple always defends itself right up to the point where it suddenly course-corrects. But as mentioned before, the issue isn't that there aren't users who can get away with 8GB of RAM; there absolutely are. The issue is that limiting the base model of the base model to 8GB of RAM is quite limiting, and charging users as though they're getting premium specs, especially on the base MacBook Pro, when they aren't, while charging an arm and a leg for upgrades, doesn't create a healthy experience when buying the product.

Doing things like keeping the base model at 8GB of RAM after 4 years is also short-termism. Spending money (or reducing profits, depending on your POV) to increase customer satisfaction and broaden the utility of their best-selling product could have long-term benefits, especially as Apple tries to keep growing its overall market share and expanding Mac gaming. Apple Silicon was already a big boost to both, but I see this kind of attitude as a drag on that. Then again, as I already stated, I don't think this will continue into the next generation of devices. Apple always tends to hold on to base specs longer than it should and then defend that decision against all reason until it finally moves on. No doubt we'll see something similar play out here.
 
Very interesting finding....

That puts the 78-watt TDP M3 Max right neck and neck with the 355-watt desktop AMD 7900 XTX. Looking at the 450 W TDP of the 4090 gives some additional power/performance perspective too. Of course, those TDP numbers are solely for the GPU and don't include the TDP of the CPU (14900K), motherboard, RAM, etc., whereas the M3 Max figure covers the CPU, GPU and everything else.

Device | Median score | Benchmarks
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1


Obviously it's still quite a ways off the NVIDIA 4090.

Device | Median score | Benchmarks
NVIDIA GeForce RTX 4090 | 10812.68 | 3
NVIDIA GeForce RTX 4080 | 8051.11 | 1
NVIDIA GeForce RTX 4070 Ti SUPER | 7660.98 | 1
NVIDIA GeForce RTX 4090 Laptop GPU | 5959.41 | 1
NVIDIA GeForce RTX 4070 | 5440.82 | 2
NVIDIA GeForce RTX 3090 | 4832.91 | 7
NVIDIA GeForce RTX 3080 | 4723.36 | 1
AMD Radeon RX 7900 XTX | 3974.15 | 2
Apple M3 Max (GPU - 40 cores) | 3743.32 | 1
NVIDIA GeForce RTX 3070 | 3217.16 | 3

On another forum I caught a lot of flak for saying that this is actually extraordinarily impressive when you consider that NVIDIA is using a heavily optimized OptiX path that typically benchmarks significantly faster than even CUDA.

What I was trying to highlight is that Apple has seen such a massive improvement over M1/M2, no doubt from better code optimizations (there's still a ways to go, judging by the Apple maintainer's comments) but also from the new Dynamic Caching capability, which gives developers a 'free' performance boost.

Especially when you look at the cost of electricity (in Europe, and particularly when I visit family in Ireland), I think it's very understandable why many people, myself included, are genuinely interested in power-efficient workstations!
I saw this in a Tech Testers Europe post on Twitter:

[Attached screenshot: Tech Testers Europe post breaking down gaming electricity costs]

Whilst it's a gaming cost breakdown that simplifies the point (the juice your computer drinks costs money), it really provides food for thought, and it makes me question where NVIDIA, AMD, and Intel can go from here, especially given the need to be more responsible about climate action and climate change. Every little contribution helps.
The X post says this is based on a 60 W power difference between the 4070 and 7900. But where are they getting the 60 W difference? Is that the average power difference between the two over a range of different performance settings (resolution, frame rate, quality, etc.) when those settings are the same for the two GPUs? Though I suppose quality is subjective, since AMD and NVIDIA have different optimizations.

OTOH, if people typically use the top quality settings the GPU can handle, then the better comparison would be power draw at top settings. But then you'd also need to figure in the difference in quality of experience if, say, the more power-hungry GPU also performs better.

What motivated the question is that when your work is task-focused (e.g., rendering) rather than experience-focused (e.g., gaming), you can't just look at wattage. Instead, you need to compare the watt-hours needed to complete the task. [E.g., a system that uses twice the power but is three times as fast is actually more energy-efficient, assuming the idle draw is small enough not to affect the comparison.] So it's not just about the power, it's what you get out of that power.
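A tiny worked example of that watt-hours-per-task point; the power draws and times below are invented purely to illustrate the arithmetic:

```python
# Energy per task: a machine that draws more power but finishes faster can still
# use less total energy. The power draws and times are invented for illustration.
def task_energy_wh(power_w: float, hours: float) -> float:
    """Watt-hours consumed to finish a task at a given average power draw."""
    return power_w * hours

slow = task_energy_wh(power_w=150, hours=3.0)  # 150 W for 3 hours
fast = task_energy_wh(power_w=300, hours=1.0)  # twice the power, three times as fast

print(f"Slow machine: {slow:.0f} Wh, fast machine: {fast:.0f} Wh")
# -> 450 Wh vs. 300 Wh: the higher-power machine wins on energy per task here
```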

I'm sure you know all this, but it's still worth noting.
 
The X post says this is based on a 60 W power difference between the 4070 and 7900. But where are they getting the 60 W difference? Is that the average power difference between the two over a range of different performance settings (resolution, frame rate, quality, etc.) when those settings are the same for the two GPUs? Though I suppose quality is subjective, since AMD and NVIDIA have different optimizations.

OTOH, if people typically use the top quality settings the GPU can handle, then the better comparison would be power draw at top settings. But then you'd also need to figure in the difference in quality of experience if, say, the more power-hungry GPU also performs better.

What motivated the question is that when your work is task-focused (e.g., rendering) rather than experience-focused (e.g., gaming), you can't just look at wattage. Instead, you need to compare the watt-hours needed to complete the task. [E.g., a system that uses twice the power but is three times as fast is actually more energy-efficient, assuming the idle draw is small enough not to affect the comparison.] So it's not just about the power, it's what you get out of that power.

I'm sure you know all this, but it's still worth noting.
Yes - it’s a great point :)
This is where I think the waters get muddy. I'm assuming she's referencing constant-load tests, such as gaming in Cyberpunk or other such gaming 'experiences'.
Obviously, when looking at task-based workloads like you mentioned, this is also where software optimizations and vendor-optimized code paths come into play to shorten the time needed to complete a task. So, to your point, you have to look at power numbers in the context of the task and how well both the hardware and the software are optimized for that workload.
 
How did you find this gold?
The usual way… by following an increasingly baroque set of low-follower accounts on Twitter/Mastodon, who somehow turn out to be incredibly important to Apple's efforts to communicate vital information. Rather than just making it easily available, they find it more fun to publish said information quietly, exactly 8 days after a solar eclipse. Get ready for the next drop!

I’m only half joking. Unlike Apple’s documentation, which is a complete joke. I have no idea how anyone would find it.

Ok, I got it from NatBro on Twitter.
 