Nuvia: don’t hold your breath

Basically there are four tiers of RAM we see most commonly afaict and will see in the future, with the first one (DDR SODIMM) beginning to die out and being replaced by the second (LPCAMM2).

DDR SODIMM
LPCAMM2 LPDDR
LPDDR modules on a PCB (currently the most common in laptops, excluding Apple)
LPDDR on package - everything from phones to tablets to Apple laptops and now even Qualcomm and Intel will be doing this for low power PC parts and IIRC Strix Halo does this too.
 
New report from MS and some others about the NPU advantage. Has some figures of X Elite vs M3 NPU. I must admit I hadn’t seen joules used in this way before.
 

Anandtech used to report Joules if I remember right - on occasion anyway. I’m fascinated that the time to completion is similar but the energy of the Qualcomm NPU is far lower than the Apple NPU. The only thing I can think of is that the cores are much bigger or more numerous but slower on the Qualcomm NPU as compared to the Apple one - much like a big GPU clocked lower or even a big CPU with lots of cores clocked lower will be more efficient than a smaller GPU/CPU clocked higher for the same work. Given how little information anyone gives about their NPUs the only information we’ll probably get is die shots.
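To make the joules point concrete, here is a rough sketch of the reasoning above. Energy is just average power times time, so two chips can finish in the same time and still use very different amounts of energy. The second function uses the usual CMOS dynamic-power approximation (power scales with capacitance, voltage squared and frequency). Every number here is made up for illustration and none of it comes from the report; `units`, `volts` and `ghz` are hypothetical parameters, not anything Qualcomm or Apple has published.

```python
# Illustrative only: all figures are invented, not taken from the report.

def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy is average power multiplied by time: E = P * t."""
    return avg_power_watts * seconds


def dynamic_power(units: int, volts: float, ghz: float, cap: float = 1.0) -> float:
    """Rough CMOS dynamic-power model: P ~ units * C * V^2 * f (arbitrary units)."""
    return units * cap * volts ** 2 * ghz


# Two hypothetical designs with equal throughput (units * frequency = 8),
# so they finish the same job in the same time.
narrow_fast = dynamic_power(units=4, volts=1.0, ghz=2.0)   # fewer units, higher clock and voltage
wide_slow = dynamic_power(units=8, volts=0.75, ghz=1.0)    # more units, lower clock and voltage

run_seconds = 10.0
narrow_energy = energy_joules(narrow_fast, run_seconds)
wide_energy = energy_joules(wide_slow, run_seconds)

print(f"narrow/fast: {narrow_energy} J, wide/slow: {wide_energy} J")
```

Because voltage enters squared, the wide and slow design comes out well ahead on energy in this toy model even though both take the same 10 seconds. Real NPUs also have static power and memory traffic, which this ignores.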
 
I just asked the author and he confirmed that the X Elite allows you to target the NPU on Windows whereas the M3 doesn’t and so the results for the M3 are NPU plus CPU, which would account for the energy difference I would imagine.

macOS really needs an API to target the NPU exclusively.
 
Ah yeah okay that makes sense too.

Yeah the default for CoreML could be "run anywhere" while allowing you to specify where the workload should run. I mean I get that Apple doesn't let you pin processes to cores or even a type of core but you can make the argument that choosing what kind of accelerator to run a workload on is a slightly bigger deal.
 
CoreML basically chooses for you, yeah.
 
Anandtech used to report Joules if I remember right - on occasion anyway. I’m fascinated that the time to completion is similar but the energy of the Qualcomm NPU is far lower than the Apple NPU. The only thing I can think of is that the cores are much bigger or more numerous but slower on the Qualcomm NPU as compared to the Apple one - much like a big GPU clocked lower or even a big CPU with lots of cores clocked lower will be more efficient than a smaller GPU/CPU clocked higher for the same work. Given how little information anyone gives about their NPUs the only information we’ll probably get is die shots.
That’s probably right, and it’s a valid way to do things. I think people here are going to be surprised: Qualcomm and MediaTek/Nvidia are NOT like AMD and Intel when it comes to energy and power, and you’re going to see some gaps diminish in the next few years.

Like, even if Apple still has an ST lead, some of these gaps are absolute and don’t change much from node to node, like idling.
 

I checked by downloading the leaked Dell papers, and what he says is correct. This is categorically in Apple territory if you had a cluster map of idle or low load power basically.

And even when comparing to MTL it’s still an advantage. (Someone else here, Jimmy, whom I am not trying to pick on, made fun of them for comparing to ADL. Qualcomm didn’t do that, that was Dell, and MTL is not as much of an improvement as QC brings, so the point is still relevant.)
 
Intel’s Lunar Lake claims 20% lower package power relative to 8cx Gen 3, which was Qualcomm’s last generation, using 2020 Arm IP on Samsung 5nm (which was worse than TSMC 5nm by 30% on power). It should still be better than what they currently build, but let me just say I am skeptical this will beat the X Plus by much lol, and it’s more costly to build.
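A quick back-of-the-envelope on those two percentages, as a sketch only. It normalises 8cx Gen 3 package power to 1.0 and asks where the same chip would land on TSMC 5nm from the node change alone. How to read "worse by 30%" is an assumption, so both readings are shown; the "port to TSMC" is hypothetical and ignores every other design difference.

```python
# Normalise 8cx Gen 3 package power (on Samsung 5nm) to 1.0.
gen3_samsung = 1.0

# Intel's claim as quoted above: 20% lower package power than 8cx Gen 3.
lunar_lake_claim = gen3_samsung * (1 - 0.20)

# Assumption A: "worse by 30%" means Samsung 5nm draws 1.3x the power of TSMC 5nm.
gen3_on_tsmc_ratio = gen3_samsung / 1.30

# Assumption B: "worse by 30%" means TSMC 5nm draws 30% less than Samsung 5nm.
gen3_on_tsmc_minus = gen3_samsung * (1 - 0.30)

print(round(lunar_lake_claim, 2))    # 0.8
print(round(gen3_on_tsmc_ratio, 2))  # 0.77
print(round(gen3_on_tsmc_minus, 2))  # 0.7
```

Under either reading, the node difference alone would already put the old 8cx Gen 3 at or below the claimed Lunar Lake figure, which is the reason for the skepticism above.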
 


This is a crazy amount of high-profile design wins. Sure, AMD and Intel have “more”, but keep in mind these are big-hitter wins, so saying QC only has 15-20 wins elides something: not every design win is equal in influence, mindshare or sales. AMD still hasn’t been able to win the XPS range, for example. This is impressive.
 
I’d be very careful with any claims from Twisted Andy. This is the person who claimed that Speedometer was not a valid benchmark but Octane and Kraken are. A person who openly stated that they look at data to see which fits their idea of reasonable before using it. A bullshitter in other words.

Especially important to people who like to accuse others of “zealotry”.
 
Andy is definitely terrible broadly and hates Apple, his comparisons are completely full of it in that vein and he cherry picks - I’m fairly unbiased and am aware of that, it’s very irritating.

But you can download the documents from Dell yourself from Scribd with a subscription, and the idle power is indeed that low. This is all Dell’s own internal measurement, not QC marketing, and I hate to say it but Andy is right.
 
And to be clear, others saw exactly what I did in your claims. The difference is I have actually verified what he said. I don’t want to esteem a product if it sucks.
 