M1 Pro/Max - additional information

Hmm. So macOS favors keeping one P-cluster full before going to the second P-cluster for anything?

Definitely seems to be the case from what I've observed/heard elsewhere.

Seems that it heavily prioritises the first four P cores until load reaches a certain percent.
 
Hmm. So macOS favors keeping one P-cluster full before going to the second P-cluster for anything? I wonder if that's so they can flip back and forth between the two to reduce hot spots, or a power-conservation move: once you start using a cluster for something you have to keep it powered, so there's no point powering up the second unless you need it. (At the coarsest level, it takes multiple cycles to power up a block, so you don't want to be flipping it off and on needlessly.)
IIUC, each cluster shares an L2, so filling up one may be more power-efficient than, say, running two cores in one cluster and two in the other. If very high raw performance, rather than efficiency, is what you're after, it might be more effective to use clusters sparsely so that each core has more L2 to work with, although that might depend on how memory-intensive the work is.
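The cluster/L2 topology can be inspected directly. A minimal sketch, assuming the hw.perflevel0.* sysctl names as they appear on Apple Silicon under macOS 12+ (they may differ by machine/OS version; the fallbacks let it degrade gracefully elsewhere):

```shell
# Query P-cluster L2 size and the number of cores sharing it.
# hw.perflevel* keys are Apple Silicon-specific; fall back to 0/1
# on other systems so the arithmetic below still works.
l2=$(sysctl -n hw.perflevel0.l2cachesize 2>/dev/null || echo 0)
cpus=$(sysctl -n hw.perflevel0.cpusperl2 2>/dev/null || echo 1)
echo "P-cluster L2: $l2 bytes shared by $cpus cores"
# Rough per-core share when the whole cluster is busy, versus the
# full $l2 bytes available when a core runs alone in its cluster.
echo "per-core share when cluster is full: $((l2 / cpus)) bytes"
```

This makes the tradeoff above concrete: a lone core in an otherwise-idle cluster gets the whole L2 to itself, while a full cluster splits it four ways.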
 
I got my M1 Max a few days ago. I've done a little powermetrics-watching while starting and stopping CPU-eater processes. The two performance clusters are P0 and P1, and it sure looks like macOS is optimizing for power: with the system idle, P1 spends a lot of time power-gated (0 mW). (Note: to see this you may need to use powermetrics' -i option to set a shorter sampling interval; P1 does get woken every so often, and the longer the sampling interval, the less likely it is to average 0 mW over a whole interval.)

I can start up one, two, three, or four CPU eaters on an otherwise idle machine and I never see P1 go beyond mostly-off until the fourth eater is running.

(Edit: to clarify, when you start the fourth process, P1 doesn't go fully active; it just gets woken a bit more, presumably because system threads occasionally occupy a few cores, pushing the scheduler to want more than 6 cores active. P1 doesn't stay on consistently unless I start a fifth CPU eater.)
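The experiment above is easy to reproduce. A sketch, assuming coreutils for the busy loops; the powermetrics invocation (the --samplers and -i flags) is macOS-only and needs root, so it's shown as a comment:

```shell
# Start N "CPU eater" busy loops, each pegging one core.
N=4
pids=""
for i in $(seq 1 "$N"); do
  yes > /dev/null &
  pids="$pids $!"
done
echo "started $N eaters"
# In another terminal (macOS, as root), sample CPU power every 500 ms
# so P1's power-gated (0 mW) intervals are visible:
#   sudo powermetrics --samplers cpu_power -i 500
kill $pids   # stop the eaters when done watching
```

Vary N from 1 to 5 and watch when the P1 cluster's power stops reading 0 mW.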
 
Fascinating. Intel's strategy, at least back when I followed such things, was to spread things out to avoid overheating.
 
Maybe you only get pushed into that when individual cores need lots more than 5 W @ Fmax.
I was trying to calculate the actual active channel density in M1 the other day, but I couldn't get the math to work (it requires a bunch of guesses as to the average N-channel area). I suspect it's lower than Intel's, possibly due to some 1-of-N routing stuff they may be doing based on what they got from the Intrinsity folks, but who knows.
 

I keep going back to eclecticlight, since it's a great way to confirm whether weird stuff I see is accurate or not: https://eclecticlight.co/2021/11/04/m1-pro-first-impressions-2-core-management-and-cpu-performance/
 