Apple’s GPU and power measurement.

Jimmyjames

Elite Member
Joined
Jul 13, 2022
Posts
1,482
This video has been shared fairly widely and I thought it might make for an interesting discussion. I'll preface this thread by stating that much of this is over my head and I'm trying to understand it, so I may be incorrect in my statements.
Here is the video:


Here is a screenshot from the conclusion:


Essentially, the video posits that Apple's power measurement tools and APIs are incomplete and based on a model rather than on direct measurement. The author noticed a discrepancy between wall power and the software-reported figures, amounting to a significant power difference. After some analysis, they conclude that Apple's model doesn't account for data movement when reporting GPU power, and that data movement is increasingly expensive on modern GPUs.
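To get a feel for why data movement can dominate, here is a back-of-envelope sketch. The per-operation energy figures are illustrative order-of-magnitude values (not measurements of any Apple GPU), and the throughput numbers are made up:

```python
# Rough comparison of compute energy vs. data-movement energy.
# FLOP_ENERGY_PJ and DRAM_ENERGY_PJ_PER_BYTE are illustrative
# order-of-magnitude figures, not measured values for any real chip.

FLOP_ENERGY_PJ = 1.0             # rough energy per FP32 FLOP (picojoules)
DRAM_ENERGY_PJ_PER_BYTE = 15.0   # rough energy per byte moved to/from DRAM

def kernel_power_watts(gflops_per_s, gbytes_per_s):
    """Convert a throughput mix into estimated compute and DRAM power (W)."""
    compute_w = gflops_per_s * 1e9 * FLOP_ENERGY_PJ * 1e-12
    dram_w = gbytes_per_s * 1e9 * DRAM_ENERGY_PJ_PER_BYTE * 1e-12
    return compute_w, dram_w

# A bandwidth-heavy kernel: 2 TFLOP/s of math but 400 GB/s of DRAM traffic.
compute_w, dram_w = kernel_power_watts(2000, 400)
print(f"compute: {compute_w:.1f} W, data movement: {dram_w:.1f} W")
```

With these assumed figures the DRAM traffic burns several times the power of the math itself, so a tool that only models the shader cores would miss most of the real draw on such a workload.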

My uneducated question is: does that make sense as a critique of the power measurement that Apple provides? That is to say, I wonder if part of the problem is that in a UMA design like Apple Silicon, delineating what constitutes "GPU" power versus "CPU" power is a judgement call. Certainly the cost of moving data matters for overall power consumption, but isn't it just as likely that Apple counts as GPU power only the ALU/computation that takes place within the GPU itself? If so, while Apple should do a better job of accounting for the residual power, is it clearly part of a GPU power measurement?

I would value input on this; as I stated, much of this is currently beyond my knowledge.

Edit: It seems the video has to be viewed on YouTube directly, since the creator has disabled playback elsewhere. Annoying.
 

In the end, any benchmark (power or performance) only tells you about that metric with respect to doing “a thing.” The “thing” is arbitrary, but is only of value if it coincides with the thing the consumer of the benchmark wants to do.

Since there is no consensus about what should be measured, the really important thing is that Apple be consistent across generations, etc., so that a user who has product X and is considering purchasing product Y can make a decision.

On the engineering side, everything is modeled. If I'm comparing one multiplier to another, I extract the values of all the parasitic capacitances, plug them into a formula along with the frequency and voltage, and produce a worst-case number that I use for comparison. Perhaps I get fancy and, for blocks where wires may switch with different frequencies (i.e. some switch every cycle, some don't, etc.), take all that into account. I use this for all sorts of things: comparing two possible designs, figuring out whether there are transistors I can shrink to reduce power while still meeting timing requirements, etc. I tend to do this at the block and sub-block level. This generation's floating point unit vs. last generation's. The multiplier within the floating point unit. Etc.
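The formula behind that kind of comparison is the classic dynamic-power estimate, P = α·C·V²·f. A minimal sketch, with all the numeric values made up for illustration:

```python
# Classic dynamic-power estimate for a CMOS block: P = alpha * C * V^2 * f,
# where alpha is the activity factor (fraction of capacitance switched per
# cycle), C the total switched capacitance, V supply voltage, f clock
# frequency. All values below are invented for illustration.

def dynamic_power(alpha, cap_farads, volts, freq_hz):
    """Estimated dynamic power in watts."""
    return alpha * cap_farads * volts ** 2 * freq_hz

# Compare two candidate multiplier designs at the same voltage and clock:
old = dynamic_power(alpha=0.15, cap_farads=2.0e-9, volts=0.9, freq_hz=3.0e9)
new = dynamic_power(alpha=0.15, cap_farads=1.6e-9, volts=0.9, freq_hz=3.0e9)
print(f"old: {old:.3f} W, new: {new:.3f} W")
```

Note the V² term: this is why the voltage/frequency operating point matters so much more than raw capacitance when comparing designs across process generations.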

What I don’t do is gin up some particular cross-block operation and figure out what the power of that particular operation looks like.

On the marketing side, it’s all arbitrary, and the only golden rule is consistency.
 
This video has been shared fairly widely and I thought it might make for an interesting discussion.
I'm not quite sure what to make of it. I watched part of it and kept having to skip forward through long segments where he says nothing at great length. When he did get down to the things he thinks he's proved, it didn't taste like proof at all. It seemed to be curve fitting (picking arbitrary parameters that make a curve look the way you want) without adequate attention to ruling out other potential hypotheses for why the measured curves look the way they do.

I'd really like to know if the creator used AI to generate some or all of the script. The whole thing has that hallmark over-verbosity and vague sense of plausibility which might collapse if you look at it too hard. But I haven't looked at it hard enough to figure out whether it really collapses. That's my take, for what it's worth.
 
Powermetrics does not report RAM, cache, or other uncore power usage, and I can imagine that it can be significant at times. Interestingly enough, early versions back in the day did include a DRAM power indicator, but it was removed in a macOS update.
 
I don't know about the specifics in the video - I'm too tired to focus on watching it - but if you run powermetrics -h, Apple specifically notes that (emphasis mine):

The tool may also display estimated power consumed by various SoC subsystems, such as CPU, GPU, ANE (Apple Neural Engine).
Note: Average power values reported by powermetrics are estimated and may be inaccurate - hence they should not be used for any comparison between devices, but can be used to help optimize apps for energy efficiency.

Now whether "average power values" refers specifically to --poweravg or to any of the reported power measures isn't really that important, I think. Apple itself says powermetrics is an estimate, a model if you will, of power usage. I believe the same is true of HWinfo and basically any software measurement of power. I think Andrei F. also made that point a while ago. That doesn't mean it can't be super interesting data, I think it is, but just like wall power can't really tell you what's going on under the hood (beyond high idle power), powermetrics/HWinfo also needs outside data (e.g. wall power or more complex power measurement schemes) to help contextualize results.
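One simple way to contextualize the software figures is to subtract the modeled components from the load-induced wall power and look at what's left over. A sketch with hypothetical numbers (on macOS the software figures would come from something like sudo powermetrics --samplers cpu_power,gpu_power):

```python
# Sketch: comparing a software power model against wall measurements.
# All wattages below are hypothetical, chosen only to illustrate the idea.

def unaccounted_power(wall_w, idle_wall_w, modeled_components_w):
    """Load-induced wall power that the software model does not attribute
    to any component (DRAM, fabric, I/O, PSU conversion loss, ...)."""
    active_wall = wall_w - idle_wall_w           # power added by the workload
    modeled = sum(modeled_components_w.values()) # what the tool claims
    return active_wall - modeled

residual = unaccounted_power(
    wall_w=62.0,                                 # measured at the wall, under load
    idle_wall_w=8.0,                             # measured at the wall, idle
    modeled_components_w={"cpu": 6.0, "gpu": 38.0},
)
print(f"unaccounted: {residual:.1f} W")
```

A persistently large residual on bandwidth-heavy workloads would be consistent with the video's claim that data movement isn't attributed to the GPU figure, though it can't by itself tell you where the missing watts actually go.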
 

In addition, the very terms like "CPU power" or "GPU power" are not as easy to define as one might think. Is that the power used by the cores proper (and how do you define that?), by the core complex, by the core complex + on-chip data bus, or is that the system power that needs to be spent to keep the unit operational as intended? Is DRAM power during a trivial GPU data copy kernel "GPU power"? And even if we say it is, what does it tell us about overall power consumption during more typical workloads?

Edit: so overall @dada_dave's position that we should be using wall power and measuring total workload power/energy rather than hypothetical "CPU/GPU power" is the most sensible one.
 