Intel Lunar Lake thread

I think we’ve discussed that testing like this just has to be taken as a kind of big mixed bag, noting that those would also be informative.

I still haven’t seen ST perf/W curves for LNL from Intel at the platform level, or even at the package level (whatever), and I strongly suspect the M3’s curve is meaningfully superior tbh.
Intel has one here:


Slide #22 (annoyingly, the numbers aren't on the presentation itself; I just downloaded the PDF).

They claim the M3 is on their power curve. However, they use SPECint as their performance metric, and as usual they use their own Intel compiler to give themselves a huge advantage in that one test; you can see that on slide 20, comparing SPECint against CB R24 and GB 6.3. So those results are kinda meaningless. Also, on slide 22 they seemingly have the Elite 80 getting the same SPECint MT performance as the base M3 while drawing 50W (which I cannot believe is accurate), but then draw the arrow from its perf/W dot, not to their own perf/W line, say "~40% lower power!", and add a note about "Intel Instrumented". As far as I can tell this isn't explained in the notes at the bottom or at the end, but maybe I missed it. Slide #22 is a very odd graph all around, basically.
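For what it's worth, a claim like "~40% lower power" only means anything as an iso-performance read across two curves. Here's a rough Python sketch of that kind of comparison; every number in it is made up for illustration, not Intel's, Apple's, or Qualcomm's data:

# Rough sketch of how an iso-performance power comparison like the one on
# slide 22 works. All numbers here are invented for illustration.

# (perf_score, package_watts) points along each chip's multithreaded curve
curve_a = [(60, 10), (90, 20), (110, 30), (125, 40), (135, 50)]   # hypothetical "chip A"
curve_b = [(70, 10), (100, 20), (120, 30), (130, 40), (135, 50)]  # hypothetical "chip B"

def power_at_perf(curve, target_perf):
    """Linearly interpolate the power needed to hit target_perf on a curve."""
    pts = sorted(curve)
    for (p0, w0), (p1, w1) in zip(pts, pts[1:]):
        if p0 <= target_perf <= p1:
            frac = (target_perf - p0) / (p1 - p0)
            return w0 + frac * (w1 - w0)
    return None  # target sits outside the measured range

target = 125
wa = power_at_perf(curve_a, target)
wb = power_at_perf(curve_b, target)
print(f"Chip A needs ~{wa:.1f} W, chip B needs ~{wb:.1f} W at perf={target}")
print(f"B draws {100 * (1 - wb / wa):.0f}% less power at iso-performance")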

I suspect Intel is also downclocking the cores, or rather leaning on the E-cores almost exclusively wherever possible to save power. Now, Apple uses its E-cores heavily too, and we found out that even Qualcomm will limit clocks depending on the OEM, and all that matters is actual responsiveness and efficiency from a user perspective. But since I can’t actually experience that web browsing myself, I don’t know what it’s like.

It’s not really a task-completion test where we can measure efficiency (performance per watt); it’s just a run-on test with breaks.

Which is ecologically relevant! It’s fair! But as far as the chip goes, it means Intel could cheat this to a degree, and the end-user experience might still feel a bit smoother on the Mac (or on the X Elite system, which is also quite close despite a higher-resolution display), etc.

In other words, with tests like this plus low idle power you can get a good result, and maybe people are fine with that, but it doesn’t follow that the E- or P-cores are actually that impressive on a perf/W level, and we’re not really going to be able to tell given how the test is structured.
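To make the distinction concrete, here's a toy Python sketch (every figure invented) of why a run-on rundown is dominated by idle/low-load power, while a task-completion comparison would need energy per unit of work:

# Toy illustration of why a rundown test and a task-efficiency test answer
# different questions. All power/energy figures are invented, not measurements.

battery_wh = 55.0                 # hypothetical battery capacity
idle_w     = 2.0                  # hypothetical platform power while "waiting"
burst_w    = 12.0                 # hypothetical platform power during the active part
burst_s    = 3.0                  # seconds of active work per "page load"
loads_per_hour = 60               # pacing of the run-on browsing script

# Energy per task (what a perf/W comparison cares about)
joules_per_load = burst_w * burst_s
print(f"~{joules_per_load:.0f} J per page load while active")

# Battery life of the run-on test (what the rundown measures)
active_s_per_hour = loads_per_hour * burst_s
idle_s_per_hour   = 3600 - active_s_per_hour
wh_per_hour = (burst_w * active_s_per_hour + idle_w * idle_s_per_hour) / 3600
print(f"~{battery_wh / wh_per_hour:.1f} h of runtime")

# Note: halving idle_w helps runtime a lot, but says nothing about
# how efficient the cores are when they actually have to work.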

That's exactly what they're doing. They even say so themselves, explicitly, on slide 19 about the Thread Director: basically everything starts on the E-cores and only moves to the P-cores if it actually has to. Which, as you say, is perfectly fine for something like watching video and other such tasks! And it's a huge improvement over where Intel was before, so Dave2D's results from @Jimmyjames' video hold with respect to that. But yeah, it doesn't tell us much about perf/W under real load, and Intel's own results on that front are a bit sketchy to say the least.
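Just to illustrate the shape of that policy, here's a hypothetical Python sketch; the thresholds and structure are my assumptions, not Intel's actual Thread Director logic:

# Illustrative sketch of an "E-core first, escalate only when needed" policy,
# in the spirit of what Intel describes on slide 19. NOT Intel's actual heuristic.

from dataclasses import dataclass

@dataclass
class Thread:
    name: str
    demand: float          # 0..1, how much of an E-core it keeps busy
    on_p_core: bool = False

ESCALATE_AT = 0.9           # assumed threshold: E-core nearly saturated
DEESCALATE_AT = 0.4         # assumed threshold: light enough to go back

def schedule_tick(threads):
    for t in threads:
        if not t.on_p_core and t.demand >= ESCALATE_AT:
            t.on_p_core = True            # only now spin up a P-core
        elif t.on_p_core and t.demand <= DEESCALATE_AT:
            t.on_p_core = False           # drop back to the E-cores

threads = [Thread("video playback", 0.25), Thread("export job", 0.97), Thread("chat app", 0.05)]
schedule_tick(threads)
for t in threads:
    print(t.name, "->", "P-core" if t.on_p_core else "E-core")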
 
Riddle me this: suppose'n they tested an 8Gb M3 against a 16Gb U7 258V (or even doubling both, since it was a "high-end" machine) – how would a RAM difference impact battery life (given that SSD writes use more juice)?
 
Right. I mean look, using just as much as necessary is smart; Apple’s frequency ramping is fairly cautious relative to AMD and Intel, for example, arguably per Chips and Cheese (but they also have higher IPC, and can sustain it when they do boost! You can just run a ~5W M1/M2/M3 ST load indefinitely!)

But even then, you will expose this with MT or with real responsiveness alongside battery life, and in a counterfactual with better cores you could get the same battery life with more responsiveness, or the opposite. So it’s not like this is “free” when comparing against other vendors, who could just do the exact same thing for even more battery life or fewer perf complaints.
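A toy calculation of that counterfactual, with invented numbers: a core with better perf/W can be spent either on snappier bursts at the same power or on the same responsiveness at lower power.

# Toy numbers for the counterfactual. All figures invented.

work_units = 100.0              # fixed amount of work per interaction

core_now    = {"perf": 10.0, "watts": 5.0}    # hypothetical current core
core_better = {"perf": 13.0, "watts": 5.0}    # same power, higher perf/W

def finish_time_and_energy(core):
    t = work_units / core["perf"]             # seconds to finish the burst
    return t, t * core["watts"]               # (latency, joules per interaction)

t0, e0 = finish_time_and_energy(core_now)
t1, e1 = finish_time_and_energy(core_better)
print(f"current: {t0:.1f} s, {e0:.0f} J   better: {t1:.1f} s, {e1:.0f} J")

# Option A: run the better core at full tilt -> snappier AND cheaper per task.
# Option B: downclock it until it matches the old latency -> same responsiveness,
#           typically even less energy, since power falls faster than clocks.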




Arm compatibility is one thing for now, but from an engineering POV it’s telling that on N4, with a pretty small die that’s also very scalable, with just straight P-cores and a first iteration, Qualcomm really isn’t far behind.

IMG_4888.jpeg

I mean, lol… The Teams example is actually more representative of the CPU plus video decode plus AI all in one; the AI stuff on its own is easier to game with software stacks. [see pic 1]

And here [see pic 2 and 3]
IMG_4889.jpeg

IMG_4890.jpeg



It’s just fine: nothing amazing, and really underwhelming in context. They are throwing more at it for less, with a less scalable architecture, and it again shows us this is the Intel way; it always has been. It also shows, again, that memory on package isn’t a wonder for this stuff (though you could argue Intel would do even worse without it, which is true, but by how much? Either way, I think Qualcomm will be fine without it for most segments short of <8W tablets; not dogging it though, and maybe they’ll do it eventually).


At the risk of being too grandiose, IIRC I predicted as much about LNL to you all here: said they might even be ahead, but not by enough, and that it would be disappointing relative to the price/cost/effort. The caveat is that AMD is more substantially behind on battery life, and Arm compatibility pains plus QC GPU woes give Intel a fighting chance for this year.

But I don’t think that matters long term, and Arm compat + QC GPUs will be fixed. Even if Intel caught up, their profits are going to go down, because substitute goods via QC et al. and AMD too are here. Not a great outlook.
 
Riddle me this: suppose'n they tested an 8Gb M3 against a 16Gb U7 258V (or even doubling both, since it was a "high-end" machine) – how would a RAM difference impact battery life (given that SSD writes use more juice)?
How can anyone answer that question without knowing what the load is? As you say, SSD writes are expensive, but if you're not paging at all, they're not relevant. Normal benchmarks won't page at all, and many won't use that much RAM, which may even allow some lower-power (off?) modes for the controllers and RAM.

The only reasonable way to answer that question is to benchmark machines against themselves (M3 8GB vs. M3 16GB, for example).

BTW you mean "GB".
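For a sense of scale on the SSD-write point, here's a back-of-envelope Python sketch; every constant in it is an assumption, not a measurement from any of these machines:

# Back-of-envelope sketch of the paging question. All constants are assumptions.

ram_shortfall_gb   = 4.0      # hypothetical extra paging traffic per hour on the smaller-RAM config
ssd_write_j_per_gb = 30.0     # assumed SSD + controller energy per GB written
ssd_read_j_per_gb  = 15.0     # assumed energy per GB read back in

extra_j_per_hour = ram_shortfall_gb * (ssd_write_j_per_gb + ssd_read_j_per_gb)
extra_w = extra_j_per_hour / 3600
print(f"~{extra_w:.3f} W of extra average draw from paging")

# Against a few watts of light-use platform average, this only matters if the
# workload actually pages; a benchmark that fits in the smaller RAM shows no
# difference at all, which is why same-machine RAM comparisons are the only
# clean way to test it.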
 
So, the thing about that curve RE: ST is that while it’s useful for directionally discerning what’s going on, here it’s probably less useful than in any other case. Not only is the chip heterogeneous, where the improvements aren’t necessarily equal between the core types, but here we actually know for sure there’s a huge difference between the advances made, and we also know the E-cores tap out at some point, especially in the LNL design, because they use a different physical design, fabric, and cache vs the full “E-core” Skymont. They’re LP E-cores, just not crap like Meteor Lake’s last-generation LP E Crestmont cores were (because those were off-die and, well, crappy too).

So when they compare to a 14C Meteor Lake system with “2.1x perf/thread”, that’s a fine way to normalize and show overall MT improvement (i.e., that if they scaled it to 14 cores they’d have an even better system, etc.), but it’s an aggregated MT measurement that definitionally mixes LP E-core perf/W improvements with P-core perf/W improvements.
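To make the “definitionally mixing” point concrete, a small hypothetical sketch (the core counts are Lunar Lake’s 4P + 4 LP E; the per-core-type uplifts are invented):

# Why an aggregated "perf/thread" uplift can hide very different P-core and
# LP E-core gains. Uplift factors below are invented for illustration.

old = {"P": 10.0, "LPE": 3.0}        # hypothetical per-thread throughput, previous gen
uplift = {"P": 1.4, "LPE": 4.0}      # hypothetical per-core-type improvements
counts = {"P": 4, "LPE": 4}

def per_thread(perf, counts):
    total = sum(perf[k] * counts[k] for k in counts)
    return total / sum(counts.values())

new = {k: old[k] * uplift[k] for k in old}
print(f"aggregate perf/thread uplift: {per_thread(new, counts) / per_thread(old, counts):.2f}x")
# -> a single 2.0x headline number, even though the P-cores only improved 1.4x here.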

IMG_5064.jpeg

IMG_5066.jpeg



What I did just realize is that they already gave us the answer, roughly: the ST improvements for the cluster or die alone, with the figures they shared in the summer. The full package and platform power is still pretty important too, though, and I suspect it will add a bit more on top (maybe even a constant of some kind, especially when the P ring is off).
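Roughly the shape of that concern, as a toy sketch with invented numbers: a constant platform adder squeezes perf/W much harder at the low-power end of a curve than at the high end.

# Sketch of why core/die-level perf/W and platform-level perf/W can tell
# different stories: a roughly constant platform overhead (display, SSD, VRs,
# idle fabric, etc.) hurts low-power points far more. All numbers invented.

core_points = [(5.0, 100), (10.0, 150), (20.0, 190)]   # (core watts, perf score), hypothetical
platform_overhead_w = 4.0                               # assumed constant platform adder

for core_w, perf in core_points:
    core_eff = perf / core_w
    plat_eff = perf / (core_w + platform_overhead_w)
    print(f"{core_w:4.1f} W core: {core_eff:5.1f} perf/W (core)  vs  {plat_eff:5.1f} perf/W (platform)")

# The 5 W point loses ~44% of its perf/W once the platform is counted; the 20 W
# point loses ~17%. So curves published at core/package level flatten out (and
# can even reorder) once platform power is included.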

Here’s Lion Cove on Lunar

IMG_5067.jpeg



IMO this is important because E-cores can improve battery life by handling background tasks or lower-priority work at lower power, but if you only look at aggregated MT perf/W you might miss what the draw is going to be to retain a certain level of responsiveness under load, or for big tasks under load, etc.
 