M4 Mac Announcements

Is it sufficient to make the needed modifications on a per-core basis, or does this require global changes to the chip? E.g., would you need to increase the voltage across the entire chip to enable boosting on a single core?

I ask because I'm wondering if you could build one or two super-P-cores, capable of running at very high clocks, into the Pro and Max chips without having to modify the chip as a whole. If so, this wouldn't be that costly.

Since most apps continue to be single-threaded, including CPU-demanding apps like Mathematica, Maya, and AutoCAD (are there also significantly CPU-sensitive games?), even boosting just a single core can have significant practical value. How much boost you enable could be device-dependent, providing further product differentiation. E.g., maybe moderate boost on the Pro Mini and Pro/Max MBPs, and high boost on the Studios.

And separately, what about boosting the all-core clocks on the GPUs: what considerations would apply there?

You can put different cores (or even parts of cores) in different clock/voltage domains. The problem is the communication between blocks in different clock domains. If I compute twice as fast, I have to queue up my communications to slower blocks (e.g. slower cores, RAM, GPUs, whatever). Messages sent to those other blocks have to wait somewhere. This is usually done using first-in-first-out buffers that hang on to messages until the slower recipient is ready to receive them. But these buffers might need to get pretty large. And you get into situations where a message gets stale before it’s even received. (“set memory location XXX to YYY. No, never mind that. Now set it to ZZZ!”). The trick isn’t so much making a core run faster, but making it so that speedy-core can communicate with slow-core.

The reason that happens is, of course, that even if you can speed up the core, that doesn’t mean you can speed up the RAM, or the GPU, or various other blocks.
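
To make the buffering point concrete, here is a toy simulation of a fast block sending messages to a slower block through a bounded first-in-first-out buffer. It is only a sketch under stated assumptions: the function name `simulate_fifo`, the "messages per tick" rates, and the buffer capacity are all hypothetical and are not taken from any real chip. It also does not model stale messages; it only shows how quickly the buffer fills and how often the fast side has to wait.

```python
from collections import deque


def simulate_fifo(producer_rate, consumer_rate, ticks, capacity):
    """Toy model of a FIFO between a fast sender and a slow receiver.

    producer_rate / consumer_rate are messages per tick (illustrative units,
    not real clock frequencies). capacity is the FIFO depth.
    """
    fifo = deque()
    producer_stalls = 0  # times the fast block had to wait on a full buffer
    peak_depth = 0

    for tick in range(ticks):
        # Fast block tries to send its messages for this tick.
        for _ in range(producer_rate):
            if len(fifo) < capacity:
                fifo.append(tick)
            else:
                producer_stalls += 1
        peak_depth = max(peak_depth, len(fifo))

        # Slow block drains what it can in the same tick.
        for _ in range(min(consumer_rate, len(fifo))):
            fifo.popleft()

    return {
        "peak_depth": peak_depth,
        "producer_stalls": producer_stalls,
        "left_in_queue": len(fifo),
    }


if __name__ == "__main__":
    # Sender twice as fast as receiver, small buffer.
    print(simulate_fifo(producer_rate=2, consumer_rate=1, ticks=10, capacity=4))
    # Matched rates: the buffer never builds up.
    print(simulate_fifo(producer_rate=1, consumer_rate=1, ticks=10, capacity=4))
```

In this model a sender that is twice as fast fills a four-entry buffer within a few ticks and then stalls on every tick, so a deeper buffer only delays the stall; it does not remove the rate mismatch.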
 
Apple processors have cluster-based clock domains. The P cores can run at 4.5GHz until too many of them are running in parallel, at which point the clock speed ramps downward. Part of the reason for that is the limitations of the memory system: high-speed cores can experience data starvation (and that includes code itself, which has to come from memory). Having cores spin while waiting for data wastes energy, so the SoC's clock is reduced to make the system energy efficient.

The E cores cannot run at 4.5GHz – more like 3.0 or less – because they are not built to handle that (much smaller reorder buffers and rename files). And, of course, the GPU, NPU and other ASICs are competing for memory bandwidth at least some of the time. Overclocking may have made sense 15 years ago, when we had simple CPUs instead of SoCs, but today it does not.
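
The data-starvation argument can be illustrated with a back-of-the-envelope model. This is a rough sketch, not a description of how Apple's SoCs actually behave: it assumes run time splits into a compute part that scales with core clock and a memory-stall part that is fixed in wall-clock time, and every number and function name below is made up for illustration.

```python
def runtime_seconds(compute_cycles, memory_stall_seconds, clock_ghz):
    """Run time = clock-dependent compute time + clock-independent memory stalls.

    Assumption: waiting on memory takes the same wall-clock time no matter
    how fast the core is clocked.
    """
    return compute_cycles / (clock_ghz * 1e9) + memory_stall_seconds


def speedup(compute_cycles, memory_stall_seconds, base_ghz, boosted_ghz):
    """Ratio of run time at the base clock to run time at the boosted clock."""
    base = runtime_seconds(compute_cycles, memory_stall_seconds, base_ghz)
    boosted = runtime_seconds(compute_cycles, memory_stall_seconds, boosted_ghz)
    return base / boosted


if __name__ == "__main__":
    # Hypothetical workload: 4e9 cycles of compute, clock raised from 4.0 to 5.0 GHz.
    print(speedup(4e9, 0.0, 4.0, 5.0))  # no memory stalls
    print(speedup(4e9, 1.0, 4.0, 5.0))  # one second spent waiting on memory
```

Under these assumptions a 25% clock increase yields a 25% speedup only when the core never waits on memory; with one second of stalls on top of one second of compute, the same boost gives roughly 11%, and the core spends a larger share of its time spinning.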
 
But we're not talking about overclocking, which is clocking beyond the top speed set by the manufacturer. Instead, we're simply exploring the possibility of Apple adjusting its clock speed structure—where, as you said, there already exists a difference between the max single-core and all-core clocks on the P-cores—to one where the differential is greater. [To use Intel's parlance, we're exploring the possibility of Apple offering a higher single-core "turbo boost".]

Thus, what we're discussing is simply a difference in degree in what Apple already does, not a qualitative departure.
 