Assassin’s Creed Shadows coming to the Mac day 1.

On the M3 Ultra it seems to perform closer to base PS5 settings and frame rates than to the PS5 Pro. For the full M3 Ultra that either feels like an underutilization of the hardware, or Apple Silicon ray tracing is well below par for the GPU's general compute power compared to the competition. I’d be curious to see the work distribution in a frame capture: whether tile memory is used properly, whether there are memory stalls, and how many ms are spent on tracing.
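For anyone who wants to poke at this in their own Metal code, here's a minimal sketch of a programmatic GPU trace capture. It assumes you already have the app's MTLDevice in hand, so it won't work on a shipped title like this one (there you'd have to go through Xcode's GPU frame capture instead); the output path is just an example. The resulting .gputrace is what the Metal debugger opens to show the per-encoder timeline, tile memory usage, and time spent in ray tracing work.

```swift
import Foundation
import Metal

// Minimal sketch: capture one frame of Metal work into a .gputrace file
// that can be opened in Xcode's Metal debugger. Assumes `device` is the
// MTLDevice the app already renders with; `renderFrame` encodes and
// commits that frame's command buffers.
func captureOneFrame(device: MTLDevice, renderFrame: () -> Void) throws {
    let manager = MTLCaptureManager.shared()

    guard manager.supportsDestination(.gpuTraceDocument) else {
        print("GPU trace capture to a file isn't supported in this configuration")
        return
    }

    let descriptor = MTLCaptureDescriptor()
    descriptor.captureObject = device
    descriptor.destination = .gpuTraceDocument
    // Example output path; pick somewhere writable in a real app.
    descriptor.outputURL = URL(fileURLWithPath: NSTemporaryDirectory())
        .appendingPathComponent("frame.gputrace")

    try manager.startCapture(with: descriptor)
    renderFrame() // the work between start and stop ends up in the trace
    manager.stopCapture()
}
```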

The problem goes even further: Apple GPUs just don't have the raw grunt for the GPU die area they occupy. The M5 GPU µarch needs to increase perf per core by at least 2x.

As for ray tracing in games, it's also poor. It looks like Apple only made their RT units for 3D rendering apps like Blender or Redshift.

A member of the other forum posted this from Reddit:
View attachment 34307

So it appears the game should have had a 30 fps cap and that they are going to work on tweaking performance. I'll wait until either it's on sale or they improve performance before I jump in. Which means I guess I'll be buying it twice...
Aye, I strongly suspect this has more to do with drivers, porting, and software optimization rather than the raw hardware capabilities or ray tracing units that only work for Blender. There’s a reason why most high-profile AAA games come with day 1 drivers, and while this is a day 1 release, it’s almost certainly a port. We can see the history of console ports (the recent Spider-Man game) where PC users complain that their 5090s run slower than a regular PS5 (exaggerating slightly for effect). To be blunt, that the game runs without constant crashes is almost a minor miracle. Performance will likely improve over the next 6 months.
 
Aye, I strongly suspect this has more to do with drivers, porting, and software optimization rather than the raw hardware capabilities or ray tracing units that only work for Blender. There’s a reason why most high-profile AAA games come with day 1 drivers, and while this is a day 1 release, it’s almost certainly a port. We can see the history of console ports (the recent Spider-Man game) where PC users complain that their 5090s run slower than a regular PS5 (exaggerating slightly for effect).
I agree that it’s a driver or optimisation issue. They did say specifically that it wasn’t a port. Of course that could be PR.
 
I agree that it’s a driver or optimisation issue. They did say specifically that it wasn’t a port. Of course that could be PR.
Fair, but even if so, the PC game is almost certainly better optimized - even driver issues for specific AAA games are often the result of those games breaking DirectX spec to gain some performance advantage.
 
Fair, but even if so, the PC game is almost certainly better optimized - even driver issues for specific AAA games are often the result of those games breaking DirectX spec to gain some performance advantage.
Oh I completely agree. There is no way they will have the knowledge or motivation to optimise for macOS in the way they will for x86 or console. That’s ok for me. We can’t go from 0-100 immediately. Promising to continue improving things is a good thing.
 
than the raw hardware capabilities
I do think this plays a huge part as well.

Looking at Death Stranding, the M3 Ultra 80-core runs the game at 112 fps at 4K. My 4070 Super, power limited to 100 W, runs at 130 fps at 1440p max settings.

The GPU core needs to become smaller but also more powerful, which would allow Apple to pack more cores in at the higher end. Apple has chosen not to scale with power consumption, so the other option is to scale with more cores.

This would be ideal for M5 GPU perf:

M5 needs to have the GPU performance of M4 Pro
M5 Pro 20 core needs to have the GPU performance of M4 Max 40 core
M5 Max needs to have double the GPU perf of the M5 Pro

Roughly around there. It's possible, and they need to do this not just for gaming; for AI it would make Macs much more compelling than just lots of VRAM.

View attachment 34311
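Just to make the per-core implication of those targets explicit, here's the rough arithmetic. It assumes the hypothetical M5 tiers keep roughly today's core counts (10 / 20 / 40 for base / Pro / Max), which is my assumption, not anything announced:

```swift
// Rough arithmetic for the proposed M5 targets above. Core counts for the
// hypothetical M5 parts are assumed to match current parts; none of this
// is announced hardware.
let m4BaseCores = 10.0, m4ProCores = 20.0, m4MaxCores = 40.0

// "M5 (assumed 10 cores) should match M4 Pro (20 cores)"
let baseUplift = m4ProCores / m4BaseCores   // ≈ 2x per-core throughput

// "M5 Pro 20-core should match M4 Max 40-core"
let proUplift = m4MaxCores / m4ProCores     // ≈ 2x per-core throughput

// "M5 Max should double the M5 Pro" - with 40 cores that's 2x the cores at
// the same per-core level, so the whole wish list boils down to roughly
// doubling per-core perf, matching the earlier "at least 2x per core" comment.
print(baseUplift, proUplift)   // 2.0 2.0
```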
 
I do think this plays a huge part as well.

Looking at Death Stranding, the M3 Ultra 80-core runs the game at 112 fps at 4K. My 4070 Super, power limited to 100 W, runs at 130 fps at 1440p max settings.

The GPU core needs to become smaller but also more powerful, which would allow Apple to pack more cores in at the higher end. Apple has chosen not to scale with power consumption, so the other option is to scale with more cores.

This would be ideal for M5 GPU perf:

M5 needs to have the GPU performance of M4 Pro
M5 Pro 20 core needs to have the GPU performance of M4 Max 40 core
M5 Max needs to have double the GPU perf of the M5 Pro

Roughly around there. It's possible, and they need to do this not just for gaming; for AI it would make Macs much more compelling than just lots of VRAM.

View attachment 34311
I think we all want more GPU performance. Let’s hope we get it with the M5!

I do think it’s fair to say that optimization, both of the game and of the drivers, as @dada_dave said above, plays a big role in performance. We’ve seen time after time how a game comes out, performs poorly on AMD or Nvidia, and then, after a new driver or some work by the game devs, performs much better. I simply don’t believe game devs have the knowledge and motivation to tune their games for macOS the way they do for Windows on AMD/Nvidia.
 
AMD has a new driver that improves performance for AC Shadows. Why can’t Apple do the same thing?
Apple's not going to put game/software-dependent fast paths/workarounds in their drivers.
They'll fix bugs. But going down that path would be a terrible idea.

AMD/Nvidia drivers are a mess in that regard.

Fix/optimise the game code. Not 'tweak the drivers for the broken game'.
 
Apple's not going to put game/software-dependent fast paths/workarounds in their drivers.
They'll fix bugs. But going down that path would be a terrible idea.

AMD/Nvidia drivers are a mess in that regard.

Fix/optimise the game code. Not 'tweak the drivers for the broken game'.
A return to the days where they code for specific hardware platforms instead of coding to a hardware-agnostic API? Seems like that would be the only way to fix things on PC. macOS shouldn't suffer the same problem since all the hardware is "the same", but clearly this game is doing something that just doesn't work well on macOS. Does Xcode allow you to profile already-compiled applications (something like an Nvidia Nsight equivalent)? It would be really interesting to see what the game is doing that affords it only minimal framerate increases for settings changes.
 
macOS, or rather the Mac platform, is still fragmented. A true "all the hardware is the same" would be consoles.
Fair. Are there things you can do (aside from RT/mesh shaders) that run fine on the M3/M4 GPU but don't work on older GPUs?

I just can't figure out how a 1660 Super is getting better frames than an M1/M2 Max.
 
Regarding capturing Metal frames for analysis: I tried, but I repeatedly get this error every time:

View attachment 34319

and then the app beachballs until I force quit it. It really doesn't like me trying to capture a frame or command buffer for analysis.
Damn that’s a shame. As someone unfamiliar with frame capture and Xcode, is it usually possible to capture game data from App Store games?
 
macOS, or rather the Mac platform, is still fragmented. A true "all the hardware is the same" would be consoles.

Fair. Are there things you can do (aside from RT/mesh shaders) that run fine on the M3/M4 GPU but don't work on older GPUs?

I just can't figure out how a 1660 Super is getting better frames than an M1/M2 Max.
Even consoles these days aren’t all the same: e.g. the PS5 Pro and the Xbox Series S. And while Apple is more fragmented than consoles of course, with more generations in the same number of years (and more variety within generations), in theory it would still be easier to optimize for the more limited variations in hardware than on the PC side. In practice, obviously devs have a long way to go.

I do think this plays a huge part as well.

Looking at Death Stranding, the M3 Ultra 80-core runs the game at 112 fps at 4K. My 4070 Super, power limited to 100 W, runs at 130 fps at 1440p max settings.

The GPU core needs to become smaller but also more powerful, which would allow Apple to pack more cores in at the higher end. Apple has chosen not to scale with power consumption, so the other option is to scale with more cores.

This would be ideal for M5 GPU perf:

M5 needs to have the GPU performance of M4 Pro
M5 Pro 20 core needs to have the GPU performance of M4 Max 40 core
M5 Max needs to have double the GPU perf of the M5 Pro

Roughly around there. It's possible, and they need to do this not just for gaming; for AI it would make Macs much more compelling than just lots of VRAM.

View attachment 34311
Again, it doesn’t really say much, because we know from other benchmarks what the graphics hardware can achieve. Given what we’re seeing on the PC side as well, the performance issues are almost certainly a product of poor optimization for the Mac. That multiple games suffer from this is unsurprising.

It’s the reverse for Android/iOS. Flagship Android chips have had competitive if not more powerful GPUs for multiple generations, but gaming performance has lagged behind (note: I'm not talking about Qualcomm's compute issues, which might be a contributing factor for that particular GPU, but probably not a lot). iOS is where the money is, though, and game optimizations follow suit (again, Apple hardware is also more homogeneous), giving iPhones an edge when measured on real games versus graphics benchmarks. It’s why graphics benchmarks are simultaneously incredibly useful and yet also useless: they can demonstrate what hardware is capable of in theory, but that theory doesn’t always translate into practice. (Also, some benchmarks are too short if cooling is an issue for a device, but that’s a separate issue and can even be true for game benchmarks, though less often.)

Be that as it may, I don’t disagree, in fact I heartily agree, that Apple could use roughly a ~20% boost in raw GPU hardware performance, either by doubling the FP32 units per core (Nvidia gets a 20-30% boost from doing this; Apple, with a different architecture, might get a different boost) or simply by adding ~20% more cores.

I’m not sure about the extent to which Apple should go down the rabbit hole of game-specific driver optimizations. It isn't an issue for 90%+ of games, but for the tentpole AAA games it is how game developers, Nvidia, and AMD get additional gaming performance. It is also why those drivers can be a bloated mess. Basically, unless Apple is willing to do the same, they will always lag in game performance, but doing so would result in drivers that are much more expensive to write and even more expensive to maintain, and potentially a buggier experience for users.
 
Damn that’s a shame. As someone unfamiliar with frame capture and Xcode, is it usually possible to capture game data from App Store games?
I don’t actually know. It's the first time I've tried with something from the Mac App Store. I tried modifying its Info.plist file; it didn’t like that at all and I have to reinstall it now. But I still have some ideas to try to get some frame data out of this thing :)
 
Funny that one of the Metal fragment shaders (at least one - probably a lot more) references FSR

View attachment 34320


The game uses MetalFX upscaling but there's still a reference to FSRUpscaling. I assume they've just taken the names of their DirectX/Vulkan/whatever shaders and ported them straight over as much as possible, so it's just a leftover of that, but it's curious regardless.

Also, if you snoop around the lproj folders (for translations), the non-English translation files are normal translation files, saying things like "Quit Application" = "Luk Program" or whatever. But the English translation file is a log of generating the other files, which shows the directory structure of the internal tools they used to generate the files.

Then they also have a MacHardwareRequirements.json file that says the M1 and M2 do not meet minimum system requirements, and it has associated graphics presets not just for each type of chip but for each machine that has any given chip (although at a quick glance it seems it's all still bundled per chip, so every M4 machine has the same contents, just using Mac identifiers for more context).
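For illustration, this is roughly how you could read a per-machine preset file like that. Note the type and field names below (HardwareEntry, meetsMinimum, graphicsPreset) are placeholders I made up, not the actual schema of MacHardwareRequirements.json, which I haven't inspected in detail:

```swift
import Foundation

// Hypothetical model of a per-machine graphics preset file. The real
// MacHardwareRequirements.json almost certainly uses different keys;
// this only illustrates the keyed-by-Mac-identifier structure described above.
struct HardwareEntry: Codable {
    let meetsMinimum: Bool       // e.g. false for M1/M2 machines
    let graphicsPreset: String   // e.g. "Low", "Medium", "High"
}

func loadRequirements(from url: URL) throws -> [String: HardwareEntry] {
    // Keys would be Mac identifiers such as "Mac15,3"; the value carries
    // the preset the game would select for that machine.
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode([String: HardwareEntry].self, from: data)
}
```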
 
Funny that one of the Metal fragment shaders (at least one - probably a lot more) references FSR

View attachment 34320

The game uses MetalFX upscaling but there's still a reference to FSRUpscaling. I assume they've just taken the names of their DirectX/Vulkan/whatever shaders and ported them straight over as much as possible, so it's just a leftover of that, but it's curious regardless.

Also, if you snoop around the lproj folders (for translations), the non-English translation files are normal translation files, saying things like "Quit Application" = "Luk Program" or whatever. But the English translation file is a log of generating the other files, which shows the directory structure of the internal tools they used to generate the files.

Then they also have a MacHardwareRequirements.json file that says the M1 and M2 do not meet minimum system requirements, and it has associated graphics presets not just for each type of chip but for each machine that has any given chip (although at a quick glance it seems it's all still bundled per chip, so every M4 machine has the same contents, just using Mac identifiers for more context).
You think they just ran the code through the shader converter and called it a day? I recall seeing somewhere that the PC version running through CrossOver gets about the same performance as the native version (or that the native version isn't any better).
 