M4 Mac Announcements

What did you think it would be?
M4 Max at 1440p/60fps with RT set to medium. I guess expectations were too high. The GPU is really something Apple should spend heavy R&D on. It's important to so much of their future product stack, and they depend on a good GPU core.
 
The price also hurts: the higher SKUs, i.e. the higher GPU core counts of the Max/Ultra, don't reflect the small % of performance you gain.
 
M4 Max at 1440p/60fps with RT set to medium. I guess expectations were too high. The GPU is really something Apple should spend heavy R&D on. It's important to so much of their future product stack, and they depend on a good GPU core.
I don’t think the GPU core is a problem. They simply aren’t going to prioritise performance over efficiency. So I get it: for those wanting more performance, especially on the desktop, it’s disappointing.
 
I thought all Metal API games were TBDR. Do you have to do something special for that to be the case?

I have to admit I’m confused by this as well. As far as I can tell, yes, the GPU is always trying to leverage the TBDR capability and avoid rendering objects it doesn’t have to, but unless the graphics engine is properly tuned, the GPU may not be able to do so effectively. I have little personal knowledge and zero experience, so I’m not sure why, but I’ve seen multiple people talking about TBDR as though some work is required by developers to make it useful. I know some GPUs, like Qualcomm’s, even come with a switch that lets the GPU change rendering modes (including one that sounds like TBDR) so that developers can use the one that suits their engine. One could simply say that this just means there are tradeoffs between rendering modes for different types of scenes (and there are), but taken together it seems like an engine itself can be better or worse suited to TBDR.
So the hardware is always TBDR. When rendering targets the pre-Apple Silicon SDKs of macOS (I believe that would be Catalina and earlier, off the top of my head), the driver will do extra work to emulate immediate-mode rendering so as not to cause rendering artefacts from incorrect assumptions about undefined behaviour in the APIs. At least this used to be the case; I'm not sure if that's been removed.

Application code can also, of course, manually do the work to emulate immediate-mode rendering on TBDR hardware while linking against newer SDKs, although I doubt any apps go to the effort of doing that.

The big difference, though, just comes down to how much you make use of the benefits of TBDR:
- Do you make efficient use of tile memory, or do you always round-trip to shared memory, for example? And how are your shader pipelines set up? (See the sketch below.)
- You can lean into and optimise for the architectural tradeoffs to greater or lesser extents.
- It's not an all-or-nothing game but a wide spectrum.
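To make the tile-memory point concrete, here's a minimal Metal sketch of the idea (my own illustration, not anything from this thread or a specific engine; the function name and parameters are made up). An intermediate attachment that never needs to leave the GPU can be declared memoryless, so it lives only in tile memory instead of round-tripping to shared memory:

```swift
import Metal

// Hypothetical helper: sets up a render pass whose intermediate colour
// attachment exists only in tile memory on Apple GPUs.
func makeTileResidentPass(device: MTLDevice, width: Int, height: Int) -> MTLRenderPassDescriptor {
    // A memoryless texture has no backing allocation in shared/system
    // memory; it is only ever resident in the on-chip tile.
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba16Float,
                                                        width: width,
                                                        height: height,
                                                        mipmapped: false)
    desc.storageMode = .memoryless
    desc.usage = .renderTarget
    let intermediate = device.makeTexture(descriptor: desc)!

    let pass = MTLRenderPassDescriptor()
    pass.colorAttachments[0].texture = intermediate
    pass.colorAttachments[0].loadAction = .clear      // nothing loaded from memory
    pass.colorAttachments[0].storeAction = .dontCare  // nothing written back out
    return pass
}
```

Later work in the same render pass (programmable blending, tile shaders) can still read that attachment, so the intermediate data never touches DRAM. How much of a renderer can be structured this way is exactly the spectrum described above.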
 
I don’t think the GPU core is a problem. They simply aren’t going to prioritise performance over efficiency. So I get it: for those wanting more performance, especially on the desktop, it’s disappointing.
It’s not an efficiency problem. Yes, Apple GPU cores use high-density libraries, but they are also bloated.

The GPU is a massive part of the M4 Max, and yet the performance doesn’t reflect the cost or die area. The Max would need to have 80 GPU cores, and obviously that’s not possible, because a) they are too big, and b) Apple would need to move to tiles and add a massive GPU if it won’t make its GPU cores more PPA-friendly.
 
It’s not an efficiency problem. Yes, Apple GPU cores use high-density libraries, but they are also bloated.

The GPU is a massive part of the M4 Max, and yet the performance doesn’t reflect the cost or die area. The Max would need to have 80 GPU cores, and obviously that’s not possible, because a) they are too big, and b) Apple would need to move to tiles and add a massive GPU if it won’t make its GPU cores more PPA-friendly.
Yeah, I definitely don’t agree with this overall take. The Max does perform well on a variety of tasks.
 
You gamers just don’t get it, do you? Apple’s SoC and GPU strategy is designed to meet the needs of 99% of their user base: that means efficiency and battery life. Nobody gives a fuck about Cyberpunk.
 
You gamers just don’t get it, do you? Apple’s SoC and GPU strategy is designed to meet the needs of 99% of their user base: that means efficiency and battery life. Nobody gives a fuck about Cyberpunk.
I mean, Apple seems to give a bit of a fuck, hence the promotion at WWDC etc. I think there’s a sensible middle ground between “games don’t matter” and “gaming is all that matters”.
 
Apple cares about the PR goodwill. And those same MacBook buyers who value efficiency and battery life may like to play some AAA games once in a while, but they will happily do so only at the low settings Apple’s SoCs allow.

It’s the gamer mindset of “Apple has to match Nvidia in gaming specs or the company is doomed / M-series CPUs suck” that drives me nuts.
 
Apple cares about the PR goodwill. And those same MacBook buyers who value efficiency and battery life may like to play some AAA games once in a while, but they will happily do so only at the low settings Apple’s SoCs allow.

It’s the gamer mindset of “Apple has to match Nvidia in gaming specs or the company is doomed / M-series CPUs suck” that drives me nuts.
We’re having a technical discussion here, that’s it. The people in this thread (including myself) like Apple products and want them to be the best. There is nothing wrong with comparing them to the alternatives out there, in my opinion.

It’s good to educate people on the fact that the M3 Ultra delivers about the same performance in Cyberpunk as an RX 9060 XT, a $350 GPU. That’s fine; Apple won’t die if its computers don’t run Cyberpunk at the performance I want, and there are better alternatives for that.
 
If I’m reading this correctly, it doesn’t look like the Max is too different from the 4070 Laptop in CP2077. Both get ~60 fps at QHD with upscaling. No need to panic! https://www.notebookcheck.net/NVIDIA-GeForce-RTX-4070-Laptop-GPU-vs-M4-Max-40-Core-GPU_11453_12886.247598.0.html#:~:text=NVIDIA,-GeForce RTX 4070 Laptop GPU:24.8
Aye, one issue is that it’s not clear to me which quality tier “very high fidelity” is: is it the same as Ultra? The PC side has Low, Medium, High, and Ultra. So what’s “very high” on the Mac? If it’s not the same, how big a quality difference is it?

Hopefully reviewers will be able to manually set the Mac and PC to identical settings and we’ll get a much better idea of what the performance actually is. I’m also curious about performance relative to Wine.
 
It’s not an efficiency problem. Yes, Apple GPU cores use high-density libraries, but they are also bloated.

The GPU is a massive part of the M4 Max, and yet the performance doesn’t reflect the cost or die area. The Max would need to have 80 GPU cores, and obviously that’s not possible, because a) they are too big, and b) Apple would need to move to tiles and add a massive GPU if it won’t make its GPU cores more PPA-friendly.
Apple is the industry leader in GPU perf/watt and it's not close. Power is one of the two "P"s in PPA, so claiming that Apple's GPU design isn't PPA-friendly is wrong.

The thing you seem to be missing is that there actually is an engineering tradeoff buried in "PPA": there's no such thing as maximizing a GPU's performance and power efficiency at the same time.

Every time you pay some die area to put down an ALU, you can extract a variable amount of performance from it based on clock speed. Performance is roughly linear with clock speed (provided that the memory subsystem can keep up), but thanks to physics, clock speed is not linear with power. Adding 1 unit of performance always costs more power than adding the previous unit did.

Basically, Apple chooses to design GPU cores targeted at lower, more power-efficient frequencies than Nvidia does. This directly results in Apple GPUs using more area to hit the same performance target.
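A rough back-of-the-envelope sketch of that tradeoff (my own toy model, not anything from Apple: dynamic power scales roughly as C·V²·f, and the linear V(f) relation and numbers below are purely illustrative assumptions):

```swift
import Foundation

// Toy model: dynamic power ~ C * V^2 * f, and voltage has to rise
// roughly with frequency to keep the logic stable at higher clocks.
// The V(f) relation below is an assumption for illustration only.
func relativePower(freqScale f: Double) -> Double {
    let voltageScale = 0.6 + 0.4 * f   // assumed: V grows with f
    return voltageScale * voltageScale * f
}

for f in [0.6, 0.8, 1.0, 1.2, 1.4] {
    // Performance scales ~linearly with f; power does not.
    print("perf x\(f)  ->  power x\(String(format: "%.2f", relativePower(freqScale: f)))")
}
```

Under those assumptions, pushing clocks ~40% higher costs nearly 90% more power, while clocking ~40% lower saves well over half the power, which is why adding area (more cores at lower clocks) is the power-efficient way to buy performance.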
 
Aye, one issue is that it’s not clear to me which quality tier “very high fidelity” is: is it the same as Ultra? The PC side has Low, Medium, High, and Ultra. So what’s “very high” on the Mac? If it’s not the same, how big a quality difference is it?

Hopefully reviewers will be able to manually set the Mac and PC to identical settings and we’ll get a much better idea of what the performance actually is. I’m also curious about performance relative to Wine.
As far as I can tell, it isn't a one-size preset. So on an M1 it could have medium textures but low shadows and lighting, while on an M4 it could have medium textures and medium shadows and lighting. What is interesting is that none of the "For This Mac" presets enable RT, whereas on PC I am pretty sure it auto-enables at least "regular" ray tracing if it detects an Nvidia RTX GPU (I am not sure if it auto-selects the RT preset on AMD cards).

That means when folks compare to PC they are going to have to choose one of the "PC" presets, or note which settings the "For This Mac" one selects, so the comparison can be equal.
 
It’s a few days after release, so I thought it would be good to look at some of the scores for Cyberpunk 2077 to see how things compare with Windows and also GPTK. Except….

There seems to be a bug/misconfiguration in the Mac version of the game. I first read about it on Reddit. It seems that whichever preset is chosen (except the “For this Mac” ones), the setting for Screen Space Reflections (SSR) is one level higher on the Mac. So the Ultra preset has an SSR setting of Ultra on Windows, but “Psycho” on macOS. Whoops! Unfortunately, Ultra -> Psycho is a massive drop in performance. I tried a variety of resolutions at Ultra, both with and without the correct setting for SSR.

Setting        SSR = “Psycho” (fps)   SSR = Ultra (fps)   Percentage change
4K Ultra       9.15                   14.04               +53%
1440p Ultra    21.07                  30.87               +47%
1080p Ultra    37.03                  49.94               +35%
720p Ultra     70.21                  85.31               +22%

These figures are from an M4 Pro (20-core GPU). I think we can see that, if not for this bug, early performance reviews would have been much better. I don’t have an M4 Max, but I would imagine it would perform like a 4070m rather than around a 4060m as it currently does. Not bad, I suppose, but not great either.

Ray tracing is another matter: the performance is bad. The above bug exists in the RT presets as well; in fact, all RT presets have SSR set to “Psycho”. My machine is too slow for me to notice a difference with or without the incorrect SSR setting.

I am very curious why the M3/M4 machines do pretty well on Blender/Cinebench R24, but so poorly on Assassin’s Creed Shadows and Cyberpunk 2077 in terms of RT. Can it just be optimization? Apple helped quite a bit with this port, so that seems unlikely.
 