Nvidia complains about AI PCs

Jimmyjames

Site Champ
Posts
892
Reaction score
1,021
Nvidia recently held an event where they stated that the upcoming “AI” PCs were not that great, and you’d be better off buying an Nvidia GPU. Shocking.
Article here:

In any case I’m curious what would motivate them to pursue this line. They are killing it in the server AI space and that doesn’t look to be ending any time soon. Perhaps they are concerned that they will be pushed out of a growing sector of the market: thin laptops with great battery life which also have the power to perform useful AI/ML functions. It seems certain that this sector of the market isn’t going to use discrete GPUs, so until they come up with their own SoC, they need to promote the solution they do have.

They also mention the M3 Max MBP and compare it to the 4090/4050 laptop chips in terms of various AI tasks. Unsurprisingly the 4090 crushes the M3 Max, though not always by as much as you might imagine. It does show the gap between the two is significant, however.
 

casperes1996

Site Champ
Posts
251
Reaction score
292
I don’t think Nvidia wants to settle on owning one market and letting the competition get another. They just want all of the cake. I don’t think they intend to offer a super-thin, long-battery-life product any time soon. But I think they will argue anything without Nvidia is bad and try to convince customers that sacrificing half the battery life is fine because their Nvidia-powered device is 20% faster.
 

dada_dave

Elite Member
Posts
2,463
Reaction score
2,482
I don’t think Nvidia wants to settle on owning one market and letting the competition get another. They just want all of the cake. I don’t think they intend to offer a super-thin, long-battery-life product any time soon. But I think they will argue anything without Nvidia is bad and try to convince customers that sacrificing half the battery life is fine because their Nvidia-powered device is 20% faster.
Rumors are that Nvidia will be releasing their own SOCs (possibly with MediaTek) sometime next year. It’s unclear what markets they’ll be competing in. But thin and light is definitely possible.

It should be noted that the GPU in Intel’s Lunar Lake will feature matrix accelerators:


So thin-and-lights and GPU-accelerated ML are not mutually exclusive. Personally I’m hoping for multiple tiers of Nvidia-based SOCs like Apple makes, but that may be a bit much.
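As a rough illustration of that point, ML frameworks already abstract over these backends; here’s a minimal PyTorch sketch (the availability checks are standard PyTorch API; torch.xpu for Intel GPUs assumes a recent build) where the same fp16 matmul, the op that matrix accelerators exist to speed up, runs on whatever accelerator the machine has:

```python
import torch

# Probe for whichever GPU backend this machine exposes.
# (torch.xpu only exists on recent PyTorch builds with Intel GPU
# support, hence the defensive hasattr check.)
if torch.cuda.is_available():                              # Nvidia dGPU
    device = torch.device("cuda")
elif torch.backends.mps.is_available():                    # Apple Silicon
    device = torch.device("mps")
elif hasattr(torch, "xpu") and torch.xpu.is_available():   # Intel GPU
    device = torch.device("xpu")
else:
    device = torch.device("cpu")

# The workload itself is backend-agnostic: a large fp16 GEMM, exactly
# the kind of op matrix accelerators are built to chew through.
a = torch.randn(4096, 4096, dtype=torch.float16, device=device)
b = torch.randn(4096, 4096, dtype=torch.float16, device=device)
print(device, (a @ b).shape)
```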

Nvidia recently held an event where they stated that the upcoming “AI” PCs were not that great, and you’d be better off buying an Nvidia GPU. Shocking.
Article here:

In any case I’m curious what would motivate them to pursue this line. They are killing it in the server AI space and that doesn’t look to be ending any time soon. Perhaps they are concerned that they will be pushed out of a growing sector of the market: thin laptops with great battery life which also have the power to perform useful AI/ML functions. It seems certain that this sector of the market isn’t going to use discrete GPUs, so until they come up with their own SoC, they need to promote the solution they do have.

They also mention the M3 Max MBP and compare it to the 4090/4050 laptop chips in terms of various AI tasks. Unsurprisingly the 4090 crushes the M3 Max, though not always by as much as you might imagine. It does show the gap between the two is significant, however.

It is odd that Apple is reportedly going all in on AI for this WWDC but seemingly only shipped a slightly better NPU on the M4, plus SME on the CPU, while adding nothing to the GPU. I know that I’m a bit of a broken record on this, but Apple’s unified memory with huge VRAM pools is a substantial advantage in this market.
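To put rough numbers on the unified-memory point - a back-of-the-envelope sketch; the model sizes and quantization widths below are illustrative assumptions, not figures from this thread:

```python
# Local LLM inference is dominated by weight storage:
# bytes ~= parameter count x bytes per parameter at a given quantization.
def weights_gb(params_billion: float, bits: int) -> float:
    return params_billion * 1e9 * (bits / 8) / 2**30

for params, bits in [(7, 4), (13, 8), (70, 4)]:
    print(f"{params}B model @ {bits}-bit: ~{weights_gb(params, bits):.0f} GB")

# 7B  @ 4-bit: ~3 GB  -> fits on nearly any dGPU
# 13B @ 8-bit: ~12 GB -> already tight against an 8GB laptop dGPU
# 70B @ 4-bit: ~33 GB -> impossible on laptop dGPUs, comfortable in a
#                        64-128GB unified memory pool
```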

Then again, as everyone has already said, Apple’s internal AI efforts have been a mess* and they lost a substantial first-mover advantage despite boasting about having had an NPU longer than anyone else. I view that statement in their M4 announcement much like Nvidia’s here. Also, slight aside: they don’t get brownie points for “protecting privacy” by not developing their own high-end models and then outsourcing the privacy violations to OpenAI or Google and licensing their models.

*There’s some nice stuff they’ve done, but obviously Siri is hardly cutting edge anymore and has stagnated, and a lot of other efforts have just died on the vine.
 

casperes1996

Site Champ
Posts
251
Reaction score
292
SoC doesn’t have to mean thin and light though. When I hear Nvidia SoC I hear Grace Hopper scaled down to a consumer tier. Different architecture and all, I just mean from a power perspective. I’m imagining hardware similar in power draw to laptops with an Intel CPU + 4070 laptop edition, at least in peak consumption. Hopefully the floor is lower.
 

dada_dave

Elite Member
Posts
2,463
Reaction score
2,482
SoC doesn’t have to mean thin and light though. When I hear Nvidia SoC I hear Grace Hopper scaled down to a consumer tier. Different architecture and all, I just mean from a power perspective. I’m imagining hardware similar in power draw to laptops with an Intel CPU + 4070 laptop edition, at least in peak consumption. Hopefully the floor is lower.
Hmmm … I think that’s an odd contention. Now truthfully, I would prefer an M3 Max-like SOC, as I would buy such a thing as my CUDA development machine, but:

1) it would be an ARM-based SOC, so it wouldn’t suffer from the same poor performance per watt that an Intel machine does in lower power envelopes.

2) Nvidia has been, perhaps preemptively, dropping x050-class discrete parts, likely because margins there are thinner and competition from APUs is greater. Small SOCs with smaller GPUs are therefore currently the biggest hole in their lineup, and filling it wouldn’t compete with their own (mobile) dGPU business. They could go all the way up to Mx Pro-level GPUs without doing that, but that brings me to my final point.

3) the biggest push in Windows on ARM is currently taking on the Air/low-end Pro end of the market. Large SOCs are expensive to produce, and Windows on ARM is as yet unproven in the PC space. The rumored joint venture between MediaTek and Nvidia could jump into the expensive high-end hardware market to carve out a niche for itself, but that entails a lot of risk - it’s safer to start out small.

For those reasons I think we’re more likely to see small, light solutions from Nvidia/MediaTek first, before anything big, powerful, and power hungry gets released.
 

casperes1996

Site Champ
Posts
251
Reaction score
292
Hmmm … I think that’s an odd contention. Now truthfully, I would prefer an M3 Max-like SOC, as I would buy such a thing as my CUDA development machine, but:

1) it would be an ARM-based SOC, so it wouldn’t suffer from the same poor performance per watt that an Intel machine does in lower power envelopes.

2) Nvidia has been, perhaps preemptively, dropping x050-class discrete parts, likely because margins there are thinner and competition from APUs is greater. Small SOCs with smaller GPUs are therefore currently the biggest hole in their lineup, and filling it wouldn’t compete with their own (mobile) dGPU business. They could go all the way up to Mx Pro-level GPUs without doing that, but that brings me to my final point.

3) the biggest push in Windows on ARM is currently taking on the Air/low-end Pro end of the market. Large SOCs are expensive to produce, and Windows on ARM is as yet unproven in the PC space. The rumored joint venture between MediaTek and Nvidia could jump into the expensive high-end hardware market to carve out a niche for itself, but that entails a lot of risk - it’s safer to start out small.

For those reasons I think we’re more likely to see small, light solutions from Nvidia/MediaTek first, before anything big, powerful, and power hungry gets released.
I certainly see this as a possibility. My main thinking is that Nvidia’s behavior over the past several years indicates to me that they give no shits about the lower computational tier of the market. Especially with, as you say, 50-class and below products being deprioritised and each class being more expensive than it used to be in the first place. And I think Nvidia would dislike the association with machines that struggle to run <insert latest game> on a brand-new device.
 

dada_dave

Elite Member
Posts
2,463
Reaction score
2,482
I certainly see this as a possibility. My main thinking is that Nvidia’s behavior over the past several years indicates to me that they give no shits about the lower computational tier of the market. Especially with, as you say, 50-class and below products being deprioritised and each class being more expensive than it used to be in the first place. And I think Nvidia would dislike the association with machines that struggle to run <insert latest game> on a brand-new device.
I view their exit from that portion of the market more as a cursed design problem. In order to make a compelling product at the lower end, especially for mobile, while still making a profit, there are too many contradictory requirements: you have to be low power because such a system is typically going into a smaller, lighter enclosure; you have to offer more TFLOPs than an iGPU/SOC design; you have to have enough VRAM to support the TFLOPs and, again, compete with the iGPU/SOC; and you can't cost too much because the OEM already has to source all the other disparate components.

Let's compare the 4060 mobile and the M3 Max. In the 3000 generation, Nvidia tried a bunch of different options at the low end, with all sorts of tradeoffs: https://www.tomshardware.com/pc-components/gpus/rtx-4060-vs-rtx-3060-12gb-gpu-faceoff For the 4000 series, they released the 4060 and basically straddled the 3050 and 3060 - faster than either, obviously, but with the RAM of the former. To make a 4050, it would have to have fewer units with less VRAM, but we've already seen the discussions here and elsewhere about how limiting these smaller pools of RAM can be. Cutting 8GB down to 6 could make the card effectively useless for all but the lightest tasks, so why not just save some money and use a nice iGPU anyway?

The next way to save money is cutting down on execution units but running them fast. They do this for the 4060, and the result is a card that runs pretty damn hot for the TFLOPs; meanwhile a cut-down M3 Max has nearly the same TFLOPs (10% less) at less than half the wattage, because Apple can afford the silicon (it has 25% more execution units) and runs the clocks slower (35% slower). That means you have to either run the 4060 at lower clocks to make the 4050, which is expensive in terms of die size, lowering margins at the low end, or cut down the execution units and keep the watts high, which is not great for those smaller, lighter laptops.

And again, they're trying to compete with better and better iGPUs/SOCs which can fulfill the same, or similar enough, niche. Even beyond Apple, buying a complete package from Qualcomm, AMD, Intel, or, in the future, MediaTek/Nvidia is cheaper than buying two or more systems from multiple vendors, each of which is trying to make a profit on its individual piece. So much so that we see from such SOCs the ability to spend die area more liberally than component vendors can.
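A quick back-of-the-envelope on that wide-versus-fast tradeoff (the ALU counts and clocks below are my own ballpark assumptions for a lower-TGP 4060 mobile and a 30-core M3 Max, not spec-sheet numbers):

```python
# FP32 throughput ~= 2 ops per FMA x ALU count x clock.
def tflops(alus: int, clock_ghz: float) -> float:
    return 2 * alus * clock_ghz * 1e9 / 1e12

nv = tflops(3072, 1.9)   # 4060 mobile (assumed): narrow and fast -> ~11.7
ap = tflops(3840, 1.4)   # 30-core M3 Max (assumed): ~25% wider, slower -> ~10.8
print(f"4060 mobile ~{nv:.1f} TFLOPs vs cut-down M3 Max ~{ap:.1f} TFLOPs")

# Similar throughput, very different power: dynamic power scales roughly
# with ALUs x clock x voltage^2, and voltage climbs with clock, so the
# wide-and-slow design lands at a far lower wattage for similar TFLOPs.
```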

Finally, depending on the terms of the deal, the MediaTek-Nvidia SOC could very well be similar to the Samsung-AMD deal: basically a MediaTek SOC but with the GPU cores licensed from Nvidia. In other words, we're looking at something that MediaTek is designing, but taking a PC vendor's GPU IP instead of licensing the GPU from ARM or ImgTech. And we've seen from Apple and Qualcomm (and AMD/Intel now too) that designing an iGPU need not deliver crap performance. The current lower XBOX model, the XBOX Series S, for instance, only has about 4 TFLOPs (yes, as a console it has certain advantages, but some of those are shared by the newer SOC designs). With Nvidia drivers, a decent pool of LPDDR RAM and bandwidth, and a decent 4-5 TFLOP GPU designed with lots of cores run at lower clocks, you could get a pretty decent GPU for good power. Maybe it still requires a fan, maybe not, but it could certainly compete with what MS is calling MB Air-competitive devices with the Snapdragons (even if I would qualify them more as cheaper MB Pros).
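For scale on the "decent pool of LPDDR RAM and bandwidth" bit, here's a rough sketch; LPDDR5X-8533 and these bus widths are assumptions for illustration:

```python
# Peak bandwidth = transfer rate x bus width in bytes.
def bandwidth_gbs(mts: int, bus_bits: int) -> float:
    return mts * 1e6 * (bus_bits / 8) / 1e9

for bus in (64, 128, 192, 256):
    print(f"{bus:3d}-bit LPDDR5X-8533: ~{bandwidth_gbs(8533, bus):.0f} GB/s")

# 128-bit: ~137 GB/s, roughly Snapdragon X Elite class
# 256-bit: ~273 GB/s, between Mx Pro and Mx Max territory - plenty to
# feed a 4-5 TFLOP iGPU without a dGPU's dedicated GDDR stack
```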
 

dada_dave

Elite Member
Posts
2,463
Reaction score
2,482
I should point out an error in the above: Nvidia does in fact make two mobile 4050s with 6GB of RAM … and yes, they sit basically between an M3 Pro and a cut-down Max in TFLOPs (the lower end closer to the Pro). I maintain, though, that that’s as low as they can go without hitting the “why shouldn’t I just use an iGPU instead?” question these days. Part of me must’ve known that, because as I said earlier, they can go as high as an Mx Pro GPU without cannibalizing their own mobile dGPU market (maybe they don’t care about that). I think I got confused between desktop and mobile, where the 4050 desktop never got released as the 4060 basically took its place (priced between the 3050 and the 3060 12GB).

Also interesting to note that the 3060 mobile looks very similar to a cut-down M3 Max in design - more energetically expensive than it probably should have been, largely because Samsung 8nm was not a great node compared to TSMC 3nm.


But obviously Samsung 8nm is also a cheaper node than TSMC 3nm, so when Nvidia switched to TSMC 5nm (at the time close to the leading edge), I have no doubt economic realities spurred the change to a narrower, faster design with more heat. Although the 4050 mobile isn’t awful here - especially the MaxQ variant - and neither is the 4060 MaxQ. The high heat on the regular 4060 mobile is probably why those exist. Sure they’re slower, but the TDPs are far more reasonable for a laptop.

And here’s the 4070 MaxQ variant:


So they did port over the 3060 mobile design - it’s just moved up one price tier now (to be fair, even more cores at even lower clocks than a cut-down M3 Max!). Although I’m having trouble finding laptops with it?

Bottom line: Nvidia still cares about that market; it’s just that the economic (and physical) realities of competing against iGPUs at the sub-Mx Pro level work against releasing dGPU products there. Releasing an SOC (with a partner) with their own iGPU solves that.

Personally I’d rather they release an M3 Max-like SOC, but I just don’t see that happening, not right away anyway.
 

Artemis

Power User
Posts
249
Reaction score
100
I don’t think Nvidia wants to settle on owning one market and letting the competition get another. They just want all of the cake. I don’t think they intend to offer a super-thin, long-battery-life product any time soon. But I think they will argue anything without Nvidia is bad and try to convince customers that sacrificing half the battery life is fine because their Nvidia-powered device is 20% faster.
Yep.

It’s crazy, I’ve had this argument with people who claim Nvidia doesn’t care about consumer or anything anymore because the profit is in the datacenter, and that they’ll slowly exit out, lol.

This is totally wrong, IMHO. Here’s why:

I think people were looking at some scarce wafer allocations towards datacenter GPUs or rising prices for consumer GPUs in that local minimum, along with their AI profits now, and thinking: hey, Nvidia doesn’t care, there’s just no way, it’s a waste.

But Nvidia got to where it is by amortizing their fixed costs with consumer hardware and scaling up, but also with something more important: bottom-up consumer mindshare and software marketshare.

That’s really why they’re able to charge such a pretty penny: the software stack. The fact that they bet on it early, so that academics and techies (I know of some) could easily fiddle with their DL software, was tremendous for building that influence.

So too was going horizontal with software costs and IP, applying some of their deep learning know-how to gaming and building a moat that enables them to charge a justifiable premium and squeeze much more out of every TFLOP or mm^2.

So Nvidia sees the broader picture, IMO, and they also understand the economies of scale involved in chip design and manufacturing, which is part of why you see them in automotive now even if margins aren’t as high. Some of this design work is transferable, and “low margins in client” hides a lot of what it’s actually doing for the company at a firm-wide level. For instance, they might re-use an SoC designed for PCs in cars, share work across tapeouts, place more orders with TSMC, etc.

Bringing that together, firms don’t necessarily have unified incentives. Jensen might not be super personally interested in profit from a PC SoC alone, but build a division with someone accountable and there you have it, keeping the ship afloat as long as they don’t Intel it.

To the final point: Nvidia hasn’t really done anything with serious mobile parts because they lost phones without modems, Arm on Windows has just sucked, and Cortex wasn’t really there with the whole package until recently.

Now WoA is taking off to the point of being acceptable, and Qualcomm and MS paving the way gives Nvidia a much smoother landing. AI is also booming in the mainstream, and Nvidia doesn’t want to lose out on the next generation of AI, be it inference or training and tinkerers. It’s not really about an 8-10% margin hardware division alone. It’s about avoiding becoming the IBM of AI, or having AMD and Intel end up eating their lunch with bigger and bigger SoCs, even chiplet SoCs that basically hedge the cost with a “GPU tile” and a fat stack of LPDDR to access.


So:

- Arm on Windows
- Cortex cores getting “good enough”, with a power advantage over Intel/AMD
- The rise of AI + local inference & interest in keeping that
- and threats from Intel and AMD with larger APUs, and even dGPU-replacement APUs built economically with chiplets


All of that makes it not only a lower-cost, higher-reward time to jump in, but also raises the stakes of inaction.

TL;DR: If Nvidia doesn’t launch leading SoCs of their own or with MediaTek across the 15W-to-120W (humongous-bus) continuum, they’re flat-out dumb or just myopic IMHO. And the fact that they’re raking it in only makes this a stronger and lower-cost strategic play, as opposed to sitting on cash.
 

Artemis

Power User
Posts
249
Reaction score
100
@casperes1996 (tbc, I also totally agree they’ll probably want a part that’s really beefy, with low idle as a benefit and 70-85% of the ST perf of Intel/Apple/AMD, and I suspect long term we could see them with two dies), but my point is in agreement with yours - just spilling.
 