Strix Halo - AMD

exoticspice1


Looks like reviews are out today. I'm curious to see how it compares to M4 Pro.
 

My guess based on available information so far is that, compared with the M4 Pro (14 CPU/20 GPU cores), the Halo's CPU will top out as powerful or less powerful* and much more power hungry, while its GPU will be faster in most, but not all, tests and definitely more power hungry. The GPU is unlikely to reach the 32-core Max except in native vs non-native/badly optimized games. In anything bandwidth intensive the 20-core Pro GPU might win, given that most Strix Halo devices I've seen come with the same memory bus but lower speed RAM. If the price of 128GB is accurate, that's a good deal even if it has a couple of caveats.

*That is in embarrassingly parallel multicore; in ST/lightly threaded scenarios it'll be no contest, the M4 Pro CPU will be better.
 

Scanning the above quickly, NBC's analysis largely backs up my prediction. I was thinking 4060 Laptop GPU equivalent, which seems to be true for some of the tests, but in gaming it can also compete with a 4070 laptop, which is how NBC decided to describe it overall. It depends on how high the Nvidia GPU has been clocked (some OC'd laptop 4050s even compete) and on power levels. So definitely quite nice.

Unfortunately only one data point for my CB R24 graph.
 
Screenshot 2025-02-18 at 9.12.49 AM.png

Things are getting a bit scrunched up; I haven't had time to prune the data points (unfortunately Numbers isn't great here: change the order or remove a data point and the entire graph's style has to be redone).

With respect to the Strix Halo in the Asus ROG Flow (AI Max+ 395), I have to admit this is a little worse on the CPU side than I thought it would be. I expected higher power/lower efficiency for ST, but not this bad, and I thought there would be better performance than the Strix Point. Instead it is just the same performance with much worse efficiency - not totally unexpected given the larger SOC design, but this is substantially worse. It uses almost as much power in ST as the base M4 does in multicore. Keep in mind, as per my conversation with @Artemis in the other thread, this is with the idle power removed, and the idle power here is substantial: 11.2W, all due to the SOC, with no dGPU to blame, in a less than 14" laptop/tablet hybrid. With only one data point I don't know if that is a feature of the Halo's design or of Asus' implementation.

Similar story with multithreaded efficiency. Performance is around where I thought it would be, but efficiency is way worse. I knew it would be lower, but I didn't think the 14-core M4 Pro would be over 40% more efficient (side note: in this analysis I use the 14-core M4 Pro from the 14" Pro, which had slightly lower performance and slightly better efficiency than the 16" device reported in the Halo article). I don't know if Strix Halo is using TSMC's N4X or N4P. If the former, that could explain some of the efficiency losses for both ST and MT, and it does use a similar structure to the desktop chip dies (2 CCDs and full AVX-512).

Although ... I guess it's getting 41% more performance than the Strix Point for roughly the same increase in power (compared with the HX 370 data point that scores 1166), so maybe at lower power the Halo will be much more efficient and come closer to the 12-core M4 Pro? Running the numbers based on the Strix Point's performance curve, that is certainly plausible. Of course the 12-core M4 Pro is considerably smaller, with only 8+4 P+E-cores versus the 16-core Halo with SMT, and we don't know what the Halo's performance curve actually is; it could be shallower or steeper than the Point's. What we do know is that at this single performance point, the Halo is using a fair bit more power than the 16-core M4 Max to just barely lose to the 14-core M4 Pro. So maybe 60-70W will be its sweet spot like 40-50W was for the Strix Point (wall power, not TDP).

Price is higher than I thought it would be too (2500 Euros for the 32GB model), so now I'm not sure the price I saw quoted for the 128GB model is accurate. I'd previously seen placeholder prices like $2600 for the 128GB model and $2100 for the 32GB model, and I don't see how that would be possible unless Europe is getting screwed here.

GPU looks nice. Most of the rendering/macOS native gaming benchmarks put it in between the 20-core and 40-core M4 GPUs (no 32-core model was tested, but my sense is it should be better than or at worst equivalent to the Halo's GPU), except in a couple of tests Apple is known to do well in, where the 20-core M4 GPU actually beats it. It uses slightly more power in the CP 2077 test than the M4 Max while getting better performance/efficiency than the M4 Pro/Max, but obviously that game is not macOS native. Also, no really memory intensive benchmarking was done here (this was the smaller 32GB RAM capacity, but still). Like another commenter on the analysis article already said, I would've liked to see some AI benchmarks since that is a major purpose the Halo is being advertised for. I also wouldn't have minded some Blender render benchmarks, although RDNA 3.5's ray tracing has generally been subpar. There are other reviews out there that I just haven't had time to look at, so maybe some of them have those.


 



On the other site Xiao_Xi posted a pretty good video review, and though I'm not sure how the reviewer measures power, it seems to confirm that the Strix Halo is substantially more efficient at lower power levels. That is perhaps not a surprise, and we still need exact power measurements at the lower power/performance points like the ones NBC provides. But I strongly suspect that the "70W" TDP setting is fairly far along the curve.

The reviewer also ran AI workloads and confirmed that, for programs optimized for Apple Silicon, the memory bandwidth of the MBP should provide better performance. Unfortunately the reviewer didn't do a GPU Blender benchmark.

I will also try to do a silicon analysis based on the techpowerup article soon.
 

Ars Technica is quoting $2300 for what appears to be the same model:


Is the detachable keyboard included in the price? I dunno, either someone is wrong or maybe Europe is just getting the short end of the stick?
 
Framework is offering a Strix Halo mini PC with 128GB RAM for $2000.

Looks like a great M4 Pro Mini competitor:


Unfortunately, RAM isn't expandable, but you can add up to 16 GB of storage via two NVMe slots.

GB is a typo, it's TB.
 
A couple of notes on pricing for the Framework. There is also the 64GB model, which starts at the same price ($1599) as the 24GB (full) Mac Mini Pro BUT doesn't come with an OS, fan, hard drive (although hard drive upgrades are unsurprisingly far better priced - still, we'll compare base to base), front panels, etc ... Depending on your options you can quickly get to the $1900-$2000 range. That is still a better deal than the M4 Mini Pro at $2100 (and again much better depending on your hard drive choice), but not quite as good as it might first appear from the base price, which doesn't actually include any of the above. And while there may be more "value" brands appearing soon, this does anchor the price expectations a little differently. These same add-ons are required for the 128GB and 32GB models as well:


So that needs to be taken into consideration when discussing price.
 
A more complete Strix Halo (and Point) analysis versus the M4:

Notebookcheck added extra TDP figures to its Halo analysis. Unfortunately they did not measure wall power at each TDP. However, while TDP doesn't correspond well to wall power in an absolute sense, the Strix Point analysis, where they did measure both, seems to confirm that relative TDP values are proportional to wall power. So, assuming the same is true for the Halo, I can still use the TDP values to estimate wall power, assuming 70W TDP was the default state in which wall power was measured. I could do the same for the Arrow Lake analysis, but I haven't, and I also don't have the same evidence that Intel's CPU behaves the same way (although it is likely). So that remains at two data points, including the odd Zenbook result.
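The proportionality assumption above can be sketched in a few lines. This is only an illustration of the scaling step: the function name and the example numbers are placeholders of mine, not NBC measurements.

```python
def estimate_wall_power(tdp_w, ref_tdp_w, ref_wall_w):
    """Estimate wall power at a TDP setting by scaling a measured reference.

    Assumes wall power is proportional to the TDP setting, which is the
    relationship the Strix Point measurements seemed to support.
    """
    if ref_tdp_w <= 0:
        raise ValueError("reference TDP must be positive")
    return ref_wall_w * (tdp_w / ref_tdp_w)


# Hypothetical example: the 70 W reference TDP follows the post, but the
# 100 W wall-power reading is a placeholder, NOT a measured value.
REF_TDP_W = 70.0
REF_WALL_W = 100.0

for tdp in (35.0, 50.0, 70.0):
    est = estimate_wall_power(tdp, REF_TDP_W, REF_WALL_W)
    print(f"TDP {tdp:5.1f} W -> estimated wall power {est:6.1f} W")
```

The estimate is only as good as the proportionality assumption, so it says nothing reliable about chips where both values were never measured together.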


Screenshot 2025-02-26 at 10.19.45 AM.png

As expected from the previous analysis above, Halo's MT efficiency improves dramatically as power draw lowers. It is well off the 14-core M4 Pro's efficiency at similar power and doesn't quite reach the 12-core M4 Pro's efficiency, but it is closer, only about 11% lower. At first blush, this is not a bad result for AMD. Strix Point is only about 10% less efficient than the base M4, while the Halo is about 11% less efficient than the 12-core M4 Pro. However, it should be pointed out that these chips are massive compared to the Apple ones I'm comparing them to. Further, CB R24 is a nearly embarrassingly parallel MT test where SMT2 yields around a 25% increase in performance. Strix Point therefore essentially has 12 P-cores (4 Zen 5 AVX 256bit cores + 8 Zen 5c cores), or 15 performance threads equivalent, while the base M4 has only 4 P-cores + 6 E-cores for maybe 6 performance threads equivalent. Meanwhile, Halo has 16 full Zen 5 AVX-512 cores for 20 performance threads equivalent, compared to the 12-core M4 Pro, which has 8 P-cores and 4 E-cores for roughly 9.3 performance threads equivalent. Things get even more stark when considering die size. Based on annotated die shots, I've estimated that the Strix Point CPU is roughly the same size in mm^2 as the M4 Max's! (The M4 Max CPU size is estimated by scaling up the appropriate component sizes as measured from the M4 die shot.) Meanwhile the Halo is nearly double that size.
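For anyone who wants to check the "performance threads equivalent" arithmetic, here is a rough sketch. The 1.25x SMT factor is the CB R24 figure quoted above; counting an E-core as one third of a P-core is my reading of the 6 and 9.3 figures, so treat that weight as an assumption rather than a measured value.

```python
SMT_UPLIFT = 1.25          # SMT2 gives roughly +25% in CB R24 (from the post)
E_CORES_PER_P_EQUIV = 3.0  # assumption: 3 Apple E-cores ~ 1 P-core in this test


def perf_thread_equiv(p_cores, e_cores=0, smt=False):
    """Return a rough count of 'performance threads equivalent'."""
    p_part = p_cores * (SMT_UPLIFT if smt else 1.0)
    e_part = e_cores / E_CORES_PER_P_EQUIV
    return p_part + e_part


chips = {
    # Zen 5 and Zen 5c cores are both treated as P-cores, as in the post.
    "Strix Point (4 Zen 5 + 8 Zen 5c)": perf_thread_equiv(12, smt=True),
    "Strix Halo (16 Zen 5)": perf_thread_equiv(16, smt=True),
    "Apple M4 (4P + 6E)": perf_thread_equiv(4, e_cores=6),
    "Apple M4 Pro 12-core (8P + 4E)": perf_thread_equiv(8, e_cores=4),
}

for name, value in chips.items():
    print(f"{name}: {value:.1f}")
```

Changing the E-core weight shifts the Apple numbers noticeably, which is worth keeping in mind before reading too much into the ratios.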

Here are my estimates:

Chip                     CPU area (mm^2)
Apple M4                 27.0
Apple M4 12-core Pro     46.1
Apple M4 14-core Pro     52.2
Apple M4 Max             58.2
Strix Point              58.1
Strix Halo               109.5
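To put those estimates in relative terms, a quick sketch that only divides the numbers from the table (the dictionary and helper names are mine):

```python
# CPU area estimates in mm^2, copied from the table above.
cpu_area_mm2 = {
    "Apple M4": 27.0,
    "Apple M4 12-core Pro": 46.1,
    "Apple M4 14-core Pro": 52.2,
    "Apple M4 Max": 58.2,
    "Strix Point": 58.1,
    "Strix Halo": 109.5,
}


def area_ratio(chip, reference):
    """How many times larger `chip` is than `reference`, by estimated CPU area."""
    return cpu_area_mm2[chip] / cpu_area_mm2[reference]


pairs = [
    ("Strix Point", "Apple M4"),
    ("Strix Halo", "Apple M4 12-core Pro"),
    ("Strix Halo", "Strix Point"),
]

for chip, reference in pairs:
    print(f"{chip} vs {reference}: {area_ratio(chip, reference):.2f}x")
```

These are raw area ratios across different process nodes, so they overstate the difference in transistor count.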

A small note about what is included. SLC is NOT included for any SOC above, and neither are Apple's E/P-core AMX units (which add about 10%). The figures include AMD's L1-L3 and Apple's L1 and L2. Of course there is a major caveat here. Apple's chips are manufactured on TSMC's N3E while AMD's are manufactured on N4P (maybe N4X for the Halo, more on that below). Quantifying how much that matters, i.e. how many transistors each chip uses, is nearly impossible. We can try to use references (1 and 2) to estimate the density difference, but that is so dependent on the proportion of SRAM to logic that even knowing the transistor count of the whole die doesn't tell us much. SRAM isn't expected to change in density, but logic did substantially (although even things annotated as cache can contain logic, as Apple's 16MB cache is 30% smaller than AMD's 16MB cache on the Halo). By my estimations, Apple's chips are anywhere from 15-40% more dense than AMD's depending on which values you take, with, I believe, TSMC themselves giving a rough decrease of 30% for N3E compared to N5 for their unknown reference chip (and N4P/X being <6% smaller than N5, since SRAM didn't change at all and logic decreased by 6% - although Granite Ridge is supposedly quite a bit denser than its N5 predecessor so 🤷‍♂️).

So how does Strix Halo compare to AMD's Granite Ridge (desktop Ryzen)? My guess is they are actually manufactured on different processes. N4X and N4P are nearly identical, so if you look at die shots the two CPUs are exactly the same, but the parts of the two dies outside of the CPU look ever so slightly different. Further, what N4X adds to N4P is a couple of extra transistor types, for both extra low power and high power, but with more leakage. Look at the desktop Ryzen 9950X: it can achieve a score of 2254 in CB R24. According to NBC, it gets 7.18 pts/W, which, subtracting out 100W idle power since it was connected to a 4090 at the time, means it used 213W above idle to achieve that score. Meanwhile Strix Halo's performance is already tapering at much lower wattages: going from 101W wall power to 117W wall power (16%) only results in 5% extra performance (117W is off the chart). We saw Strix Point behave the same way, eventually maxing out at just over 1200pts. Given how far apart 2254 and 1731 are, it is difficult to see how Halo could reach that even with double the power when the gains are this small (and they will continue to get smaller as power increases). Now, this was measured in the tablet hybrid form factor, and there is dispute online whether Granite Ridge itself is N4X or N4P, so this isn't a sure thing. But with the slight differences in die and the seemingly different performance characteristics, it looks reasonable to me that something is different between the desktop Ryzen and Strix Halo chips, where the latter is probably geared towards performance at lower power while the former is geared towards absolute performance.
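The 9950X power figure and the Halo scaling numbers come from simple arithmetic on the values quoted above. A sketch, with helper names of my own choosing:

```python
def power_above_idle(score, pts_per_watt, idle_w):
    """Total power implied by a pts/W figure, minus the idle power."""
    return score / pts_per_watt - idle_w


def relative_increase(old, new):
    """Fractional increase going from `old` to `new`."""
    return (new - old) / old


# Ryzen 9950X: CB R24 score, NBC efficiency figure and the assumed 100 W idle.
ryzen_w = power_above_idle(score=2254, pts_per_watt=7.18, idle_w=100.0)
print(f"9950X power above idle: {ryzen_w:.0f} W")

# Strix Halo: 101 W -> 117 W wall power for about 5% more performance.
power_gain = relative_increase(101.0, 117.0)
perf_gain = 0.05
print(f"Halo power increase: {power_gain:.1%}")
print(f"Performance gained per unit of extra power: {perf_gain / power_gain:.2f}")
```

With these inputs the 9950X works out to a little under 214 W above idle, in line with the roughly 213 W quoted, and the Halo gets less than a third of a percent of extra performance for each extra percent of power at the top of its range.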

Bottom line: Apple has a huge advantage in terms of performance/Watt AND performance/Area. That is perhaps not hugely surprising, since the amount of power used should be proportional to the amount of silicon die area turned on, but AMD still needs far bigger CPUs to compete with Apple. This probably translates to increased profit margins for Apple (with the caveat that Apple probably spends a lot more per die being first on the most advanced node), especially since they package their chips themselves without an OEM middle man. This ability to build smaller but still performant CPUs is also probably one factor that allows them to more cost effectively build something like the M4 Max (and Ultra or whatever Hidra ends up being), whereas the Strix Halo, for all its size, is closer to an M4 Pro competitor. It isn't clear whether there is an appetite in the PC space for larger SOCs beyond consoles, and indeed the Strix Halo itself is as yet unproven in the marketplace.
 
Blender Results posted:

Screenshot 2025-02-28 at 10.46.22 AM.png


Not great, but not unexpected given RDNA 3 and 3.5's previous Blender results. RDNA 4 should improve things, though it isn't clear when RDNA 4 will come to mobile.
 