This shows how the GB6 GPU scores scale with Metal and OpenCL for both the M1 and M2 series. To make the scaling behavior clearer, I normalized the scores so that the base chip of each series (M1 or M2) falls on the "Perfect Scaling" line. I took the values from the GB6 benchmark charts on Primate Labs' site, except for the M2 Ultra's Metal score, where I used 223,000 (taken from the search results) instead of 281,948. Given how closely the Metal scaling matches the OpenCL scaling when I use the former, that seems to have been a reasonable choice.
The Max→Ultra scaling is indeed much better with the M2 than with the M1.
Overall, from M2 → M2 Ultra, the slope of score vs. core count is ~0.65 (perfect scaling would be 1). By comparison, NVIDIA's GB6 OpenCL score scaling with ALUs × GHz, from the RTX 3050 → RTX 3090 Ti, is somewhat better: ~0.72. [I've not checked the 4000 series.]
[ATTACH=full]24283[/ATTACH]
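For anyone who wants to reproduce the slope figures, here is a minimal sketch of one way to compute them. It assumes the slope is taken between the base chip and the top chip after normalizing both axes to the base chip; a least-squares fit through all the points would be an equally plausible reading and would give a somewhat different number. The function name `scaling_slope` and the example scores are placeholders for illustration, not actual GB6 results.

```python
def scaling_slope(base_cores, base_score, cores, score):
    """Slope of normalized score vs. normalized core count.

    Both axes are divided by the base chip's values, so the base chip
    sits at (1, 1) and perfect scaling gives a slope of exactly 1.
    """
    x = cores / base_cores   # how many times more cores than the base chip
    y = score / base_score   # how many times higher the score is
    if x == 1:
        raise ValueError("need a chip with a different core count than the base")
    return (y - 1) / (x - 1)


# Placeholder numbers, NOT real benchmark results:
# a base chip with 10 cores scoring 100, and a 40-core chip scoring 295.
example = scaling_slope(base_cores=10, base_score=100, cores=40, score=295)
print(round(example, 2))  # 0.65
```

For the NVIDIA comparison the same function applies with ALUs × GHz in place of core count.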
Here's a comparison of the M1, M2, and RTX 3000 series for GB6 OpenCL score vs. calculated FP32 FMA TFLOPS. While the M2 Ultra's calculated TFLOPS are halfway between those of a 3070 Ti and a 3080, its GB6 OpenCL score is comparable to a 3070's.
[The nine NVIDIA GPUs shown here are: 3050, 3060, 3060 Ti, 3070, 3070 Ti, 3080, 3080 Ti, 3090, 3090 Ti.]
[ATTACH=full]24284[/ATTACH]
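As a rough sketch of how a "calculated FP32 FMA TFLOPS" figure is usually obtained: count one fused multiply-add per ALU per clock, treat each FMA as 2 floating-point operations, and multiply by the clock speed. Which clock was used for the chart (base vs. boost) isn't stated, so the boost clock here is an assumption, as are the RTX 3090 Ti specs (10,752 ALUs, 1.86 GHz boost), which are the commonly published values as I recall them. The function name `fp32_fma_tflops` is just illustrative.

```python
def fp32_fma_tflops(alus, clock_ghz):
    """Theoretical FP32 throughput, assuming one FMA per ALU per clock.

    An FMA (a * b + c) counts as 2 floating-point operations.
    ALUs x GHz gives giga-operations per second; dividing by 1000 gives tera.
    """
    return 2 * alus * clock_ghz / 1000.0


# Assumed specs for an RTX 3090 Ti: 10,752 FP32 ALUs at a 1.86 GHz boost clock.
rtx_3090_ti = fp32_fma_tflops(alus=10752, clock_ghz=1.86)
print(f"{rtx_3090_ti:.1f} TFLOPS")  # 40.0 TFLOPS
```

This is a peak theoretical number, which is part of why it doesn't map cleanly onto benchmark scores.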