I think you do have it backwards. In the GB6 benchmarks that have limited multi-core scaling, there won't be enough work on Apple SoCs to fill all the compute capacity of the P cores, so the E cores probably don't get involved. They might contribute somewhat on Apple's 4+4 designs, but my impression is that with 6 or more performance cores the E cores aren't getting much of a workout. (Although do note that not all GB6 MT benchmarks are like this - it still has some which scale.)
Another thing to consider is that GB6 MC score has a nonlinear relationship with core count. I would’ve thought Apple’s heterogeneous design would have suffered more from GB6’s new approach but it is possible I have it reversed.
It's Intel-style designs which suffer the most. An i9-14900K needs 32 compute threads (!!!) to maximize multithreaded throughput. (It's an 8P+16E config, but the P cores are hyperthreaded and require 2 threads each to achieve maximum compute throughput.) If that chip sits there with maybe 4 or 5 cores utilized on a GB6 benchmark which just doesn't make effective use of more than that, most of its MT throughput potential is just idle.
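To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The function names are made up for illustration, and it only counts hardware threads. It treats every thread as equal, which they aren't (a P core thread does more work than an E core thread), so read the result as a thread-occupancy figure rather than a throughput figure. The 5 busy threads is just the "maybe 4 or 5" guess from above.

```python
def hw_threads(p_cores, e_cores, threads_per_p=2):
    """Hardware threads needed to keep every core fully busy.

    Assumes P cores are hyperthreaded (2 threads each) and
    E cores run one thread each, as on the 8P+16E i9-14900K.
    """
    return p_cores * threads_per_p + e_cores


def idle_fraction(busy_threads, total_threads):
    """Share of hardware threads left idle (a thread count, not throughput)."""
    return 1 - min(busy_threads, total_threads) / total_threads


total = hw_threads(p_cores=8, e_cores=16)               # 8*2 + 16 = 32
idle = idle_fraction(busy_threads=5, total_threads=total)
print(total, f"{idle:.0%}")                             # prints: 32 84%
```

So on a benchmark that only keeps about 5 threads busy, roughly 27 of the chip's 32 hardware threads have nothing to do.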
(I actually think this is a good idea on GB's part, since the idea that your average PC gaming enthusiast needs a 32-thread monster to play games is ridiculous. Very little of the software enthusiasts run scales well with such high core counts, but MT throughput potential gets a disproportionate amount of press because number must go up every year to sell upgrades.)