Nuvia: don’t hold your breath

Artemis · Jun 21, 2024

Well, with say M4 vs M1, if you held clocks constant I’d wager you’re looking at 17-20% more integer performance and still probably 40% lower power. Technically the wider arch would use slightly more power so you might have to lower clocks a bit more, but still, that’s all else equal.

Node alone would take you ~25-35% down from N5 for the CPU, but the extra L2, newer LPDDR5(x?) and other physical design improvements which we know they’ve made probably mean it’s higher than that in total for the package.

Idk, that seems really good to me? Or do it at similar power and 30% faster: even if you don’t notice the speed, that is still more energy efficient, on top of what is already the king of energy efficiency.

The E Core improvements are arguably even more important/impressive in some ways
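To put rough numbers on the estimate above, here is a minimal arithmetic sketch. The percentages are only the ballpark figures from this post (17-20% more integer performance at about 40% lower power, or about 30% faster at similar power), not measurements, and the helper name `perf_per_watt_gain` is made up for illustration.

```python
# Illustrative arithmetic only: the ratios below are the rough estimates
# from the post, not measured data.

def perf_per_watt_gain(perf_ratio: float, power_ratio: float) -> float:
    """Relative perf/W versus a baseline of 1.0 performance at 1.0 power."""
    return perf_ratio / power_ratio

# Scenario A: same clocks, +17% to +20% integer performance, ~40% lower power.
iso_clock_low = perf_per_watt_gain(1.17, 0.60)
iso_clock_high = perf_per_watt_gain(1.20, 0.60)

# Scenario B: similar power, ~30% faster.
iso_power = perf_per_watt_gain(1.30, 1.00)

print(f"iso-clock: {iso_clock_low:.2f}x to {iso_clock_high:.2f}x perf/W")
print(f"iso-power: {iso_power:.2f}x perf/W")
```

Under those assumptions the same-clock case works out to roughly double the performance per watt, while the same-power case is a 1.3x gain, which is why the lower-power framing looks like the bigger win.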

leman · Jun 22, 2024

By the way, in GB6 multicore Ray Tracer test Oryon shows a good lead over M2 Pro/Max, as expected. Since it uses the same RT library as Cinebench, this is a surprising discrepancy. Maybe the Cinebench workload is somehow less-than-optimal for Oryon, or maybe there is some other issue?

Jimmyjames · Jun 22, 2024

leman said:
By the way, in GB6 multicore Ray Tracer test Oryon shows a good lead over M2 Pro/Max, as expected. Since it uses the same RT library as Cinebench, this is a surprising discrepancy. Maybe the Cinebench workload is somehow less-than-optimal for Oryon, or maybe there is some other issue?

I assume you meant the M2 has a lead over Oryon?

Edit: or not. I’m confused. If the Oryon core leads how can CB be less than optimal for it?

leman · Jun 22, 2024

Jimmyjames said:
I assume you meant the M2 has a lead over Oryon?

Edit: or not. I’m confused. If the Oryon core leads how can CB be less than optimal for it?

Oryon is faster in GB6 multicore raytracing tests, but the same speed in Cinebench tests as reported by notebookcheck. The GB6 results are consistent with Qualcomm’s marketing; the Cinebench results are not.

dada_dave · Jun 22, 2024

Jimmyjames said:
I assume you meant the M2 has a lead over Oryon?

Edit: or not. I’m confused. If the Oryon core leads how can CB be less than optimal for it?

I haven't had time to check, but he means that in CB and overall GB6.2 the multicore scores for the Oryon Snapdragon are disappointingly similar to the M2/M3 Pro. However, for GB6.2 ray tracing the Oryon Snapdragon pulls ahead - as it frankly should with 12 P-cores. We know that GB6.2 overall is weighted against high-core-count systems, but CB R24 should behave better and it doesn't.

Given the similarities between the M2 and Oryon core, the three most likely possibilities are: some sort of bad interaction between CB24 and (ARM) Windows, an effect of the 10-minute length of the test (Oryon not being able to maintain clocks even on battery), or some small difference in the architecture that makes a big difference here. Normally I'd favor the middle one, but I think Tom's Hardware ran CB multiple times in a row and the middle scores were the highest. So that doesn't fit if thermals or hotspots were the issue.

SAMSUNG ELECTRONICS CO., LTD. Galaxy Book4 Edge - Geekbench

Benchmark results for a SAMSUNG ELECTRONICS CO., LTD. Galaxy Book4 Edge with a Snapdragon X Elite - X1E80100 - Qualcomm Oryon processor.

browser.geekbench.com

Mac Studio (2023) - Geekbench

Benchmark results for a Mac Studio (2023) with an Apple M2 Max processor.

browser.geekbench.com

Compare the Ray tracing multicore

Having said that, here is Geekbench 5:

Snapdragon Oryon - Geekbench 5 CPU Search - Geekbench

browser.geekbench.com

Apple M2 Max - Geekbench 5 CPU Search - Geekbench

browser.geekbench.com

Haven't had time to actually parse these results so I'm not sure yet how to interpret them.

leman said:
By the way, in GB6 multicore Ray Tracer test Oryon shows a good lead over M2 Pro/Max, as expected. Since it uses the same RT library as Cinebench, this is a surprising discrepancy. Maybe the Cinebench workload is somehow less-than-optimal for Oryon, or maybe there is some other issue?

Then again, CB R23 on AS scored worse than other ray tracing benchmarks that also used Intel's Embree (GB, SPEC, and others I saw). Given this, irrespective of Embree, CB always seems to be the most finicky. Sadly, Andrei is probably even less able to respond now than he was back when the question was CB R23 on AS, but there seems to be something going on.

leman said:
Oryon is faster in GB6 multicore raytracing tests, but the same speed in Cinebench tests as reported by notebookcheck. The GB6 results are consistent with Qualcomm’s marketing; the Cinebench results are not.

Their marketing included CB results, but I'd say it makes more sense just given the core count. Then again, GB5 results don't look great, but there may be something I'm missing there.

Jimmyjames · Jun 22, 2024

leman said:
Oryon is faster in GB6 multicore raytracing tests, but the same speed in Cinebench tests as reported by notebookcheck. The GB6 results are consistent with Qualcomm’s marketing; the Cinebench results are not.

Ahh ok thanks.

Jimmyjames · Jun 22, 2024

dada_dave said:
I haven't had time to check, but he means that in CB and overall GB6.2 the multicore scores for the Oryon Snapdragon are disappointingly similar to the M2/M3 Pro. However, for GB6.2 ray tracing the Oryon Snapdragon pulls ahead - as it frankly should with 12 P-cores. We know that GB6.2 overall is weighted against high-core-count systems, but CB R24 should behave better and it doesn't.

Given the similarities between the M2 and Oryon core, the three most likely possibilities are: some sort of bad interaction between CB24 and (ARM) Windows, an effect of the 10-minute length of the test (Oryon not being able to maintain clocks even on battery), or some small difference in the architecture that makes a big difference here. Normally I'd favor the middle one, but I think Tom's Hardware ran CB multiple times in a row and the middle scores were the highest. So that doesn't fit if thermals or hotspots were the issue.

Thanks. Appreciate the clarification.

Yoused · Jun 22, 2024

Jimmyjames said:
I assume you meant the M2 has a lead over Oryon?

Bear in mind that the M2 GPU does not have hardware RT like the M3. I could be mistaken, but it seems like the M3 GPU is parsecs ahead of the M2 GPU.

dada_dave · Jun 22, 2024

Yoused said:
Honestly, I am wondering what we are getting with these performance gains. For the vast majority of workloads, the difference is vanishing. Who truly cares or notices, outside of a few engineers and corner-case users? Serious work gets done in the EP modules – GPU and NPU/Tensor – improving CPU cores performance is an exercise in diminishing returns.

That was noted in several reviews: the Qualcomm SoC didn’t perform great on certain workloads, like video encoding and extraction/compression, that often have dedicated accelerators:

First Reviews are Live and Snapdragon X Elite Doesn't Quite Deliver on Promised Performance

The first reviews of a notebook with Qualcomm's Snapdragon X Elite SoC have appeared today, and it looks like the promised performance isn't quite there. And yes, all the reviews that went live today are all based on Asus' Vivobook S 15 OLED, so it might be a bit too early to state that Qualcomm...

www.techpowerup.com

Not sure of the primary source for the data presented though. Might be TechPowerUp itself? But it’s written more as a roundup.

Yoused said:
Bear in mind that the M2 GPU does not have hardware RT like the M3. I could be mistaken, but it seems like the M3 GPU is parsecs ahead of the M2 GPU.

The ray tracing here is being done on the CPU, maybe a less relevant workload these days as you allude to above. But it stresses the CPU and CPU vector processing and scales with core counts so still good for similar types of tasks even if CPU ray tracing is less relevant for most users aside from some production houses (and even those might move to GPU processors soon).

dada_dave · Jun 22, 2024

Artemis said:
Qualcomm has already confirmed Oryon is coming to 8 Gen 4 though. They’ll have E Cores too for sure and according to all the leaks/rumors.

2 Oryon Big
6 Oryon little

That's what I thought too, but now I can't find where I saw that. Someone on another forum found a rumor the little cores were actually going to be ARM cores and I found another that claimed the "little" cores weren't actually little, just down clocked P-cores. But I could've sworn I saw an article saying that they were planning on making a dedicated Oryon E-core. But now I'm not sure. Do you have a link?

mr_roboto · Jun 22, 2024

leman said:
That is exactly what I meant - I’d expect Oryon to show very good performance at lower wattage. These particular tests paint a very different picture. Qualcomm claimed that Oryon can match M2 at lower power draw. Here we see Oryon barely holding out against M2 despite a massive core count advantage and probably higher power consumption.

Ah, I mistook what you wrote and didn't look at the low watt test numbers.

Yoused · Jun 24, 2024

dada_dave said:
Someone on another forum found a rumor the little cores were actually going to be ARM cores and I found another that claimed the "little" cores weren't actually little, just down clocked P-cores. But I could've sworn I saw an article saying that they were planning on making a dedicated Oryon E-core. But now I'm not sure. Do you have a link?

There is a link in this post:

Yoused said:
The 8 gen 4 is 2 by X, 6 by 725. Odd that they have no 500-series cores in a phone – maybe those are going into specialized SoCs, or migrating toward R and M chips.

725 are mid-range ARM cores, possibly modified by QC.

dada_dave · Jun 24, 2024

Yoused said:
There is a link in this post:

725 are mid-range ARM cores, possibly modified by QC.

Right … that’s for the rumor that the middle cores will be ARM Cortex. But that site also claims that they won’t be using Oryons but Cortex X925 for the P-cores which I’m pretty sure is wrong. That would mean they wouldn’t be using their own cores for mobile and also it states they’ll be using P-cores clocked at 4.26GHz which is above the max clock speed allowed by ARM (3.8GHz) for X925. But maybe there’s wiggle room there.

What I was looking for was a link saying that the middle core is not going to be an A725 at all but a new custom Qualcomm E-core. I could’ve sworn I saw that somewhere but I am now unable to find it. I can now only find links saying that it’ll be A725s or down clocked gen-2 Oryons.

dada_dave · Jun 24, 2024

Hey guys I just wrote this incredibly lengthy post on Macrumors about x86 vs ARM and why the latter, currently, has an advantage in performance/watt over the former. I'd appreciate any comments to clean it up as I may use it as reference going forwards.

The section on pipes and decode feels half baked but I don't know how to explain it better without making the post even longer and it is already so long I'm not sure anyone will read it. Obviously if there is anything wrong or something you would disagree with, any corrections would be appreciated.

Qualcomm revealed X Elite's benchmark scores

Fair enough. But in my defense, most of the data is already in the thread or pretty easy to check or concepts I thought were explainable with just logic alone. Anyway, that the LTT video is from a year and a half ago is in the timestamp of the video and the NotebookCheck article covers much of...

forums.macrumors.com

(The context: a user posted a review where the reviewer power limited a 16-thread 8840U to the same wattage as a 12-thread Snapdragon X Elite, ran multithreaded CB R24, and concluded that x86 could be just as power efficient as ARM. I'm trying to explain to the user why that result doesn't actually mean that, and how it relates to the larger topic of x86 and ARM chips. The post above also references an earlier post of mine.)

leman · Jun 25, 2024

@dada_dave I like your post

Yoused · Jun 25, 2024

The number 4 machine on the Top 500 SC list is Fugaku, which is an ARM core thingy. Every other machine in the top 10 is x86, except for one Power9 installation – but, those other machines rely on GPGPU cards to do the EP work: Fugaku relies entirely on SVE (not SVE2). At 128 bit wide vectors.

Granted, Fugaku is not the most power efficient machine on the list. But it leaves one wondering, what if there was an ARM type installation like that, running SVE2, on wider vectors. It would be interesting to see.

leman · Jun 25, 2024

Yoused said:
The number 4 machine on the Top 500 SC list is Fugaku, which is an ARM core thingy. Every other machine in the top 10 is x86, except for one Power9 installation – but, those other machines rely on GPGPU cards to do the EP work: Fugaku relies entirely on SVE (not SVE2). At 128 bit wide vectors.

Fugaku uses 512-bit vectors, if I recall correctly. Anyway, it’s closer to a GPU than a general-purpose CPU. These are specialized processors, built to solve particular problems in science. I don’t think these designs inform anything about general-purpose processing.
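As a quick illustration of why the vector width matters in this exchange, here is a small sketch of how many elements fit in one vector register at the two widths mentioned (128-bit and 512-bit). The helper `lanes` is a hypothetical name, and this only counts lanes per register; it says nothing about actual throughput, which depends on the number of vector units, clocks, and memory bandwidth.

```python
# Lane-count arithmetic only; `lanes` is an illustrative helper, not a real API.

def lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements of a given size that fit in one vector register."""
    return vector_bits // element_bits

for width in (128, 512):
    fp64 = lanes(width, 64)  # double precision
    fp32 = lanes(width, 32)  # single precision
    print(f"{width}-bit vectors: {fp64} x FP64 or {fp32} x FP32 per register")
```

So, assuming the 512-bit figure is right, each vector instruction operates on four times as many elements as it would at 128 bits.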

KingOfPain · Jul 3, 2024

Ars Technica testing two Microsoft Surface products with Snapdragon X Elite:

Surface Pro 11 and Laptop 7 review: An Apple Silicon moment for Windows

Superfluous AI features and compatibility issues don't detract from good PCs.

arstechnica.com

Conclusion: It's way faster than previous Microsoft products with ARM, but still behind (current) Apple Silicon.

One interesting comment was that, despite the improvements in Prism, non-native products were just annoying enough to warrant searching for native ports.
From my recollection, the only really annoying non-native application on my MacBook Air M1 was a web browser, because it needed loads of RAM and due to the double-translation of JavaScript it felt quite slow. This definitely improved a lot with the native port.
While I replaced everything else with native builds as soon as possible, because I wanted to reduce CPU usage, other non-native applications never felt annoying to me.

dada_dave · Jul 10, 2024

Chipsandcheese on the Oryon cores:

Qualcomm’s Oryon Core: A Long Time in the Making

In 2019, a startup called Nuvia came out of stealth mode.

chipsandcheese.com

I haven't had time to read the whole thing yet but given the title of the thread I thought the following snippet might be amusing!

Oryon arrives nearly five years after Nuvia hit the news, and almost eight years after Qualcomm last released a smartphone SoC with internally designed cores. For people following Nuvia’s developments, it has been a long wait.

Cmaier · Jul 10, 2024

dada_dave said:
Chipsandcheese on the Oryon cores:

Qualcomm’s Oryon Core: A Long Time in the Making

In 2019, a startup called Nuvia came out of stealth mode.

chipsandcheese.com

I haven't had time to read the whole thing yet but given the title of the thread I thought the following snippet might be amusing!

I hope nobody was “holding their breath”
