# Nuvia: don’t hold your breath



## Cmaier

> **Qualcomm takes on Apple Silicon with next-gen PC chips designed by former Apple engineers** (9to5mac.com)
>
> Qualcomm is ramping up its efforts to compete with Apple’s new M-series chips for the Mac. The chip maker today held its 2021 investor day event, focused on unveiling its “next-generation” technology, including new CPU hardware designed by Nuvia. For those keeping track, Nuvia is the chip...




Nothing until 2023?


----------



## Citysnaps

Threats of more Apple legal action?


----------



## Cmaier

citypix said:


> Threats of more Apple legal action?



Well I doubt it beyond the existing lawsuit.


----------



## Citysnaps

How about...

2023 is the year when computers using Qualcomm CPUs would be released to the public, presumably in volume. That leaves 2022 and a portion of 2021 for chip development, initial/trial fabrication (presumably from Intel), and characterization by Qualcomm, with samples delivered to its customers. And then, if everything went well, fabrication in production quantities for Qualcomm customers.

Is that amount of time unreasonable?  That was basically how our process played out, but we were a very tiny company.


----------



## thekev

> The lawsuit from Apple alleges that Williams exploited Apple technology and poached other Apple employees to join him at Nuvia.
> 
> Williams then fired back with his own lawsuit, saying that Apple illegally monitored his text messages and that his so-called “breach of contract” is unenforceable. There has been no resolution in this lawsuit yet.




If he exploited trade secrets, that's probably something. Non-compete and similar agreements are a scourge and should really be unenforceable everywhere, not just California. They are usually unenforceable here.


----------



## Cmaier

thekev said:


> If he exploited trade secrets, that's probably something. Non-compete and similar agreements are a scourge and should really be unenforceable everywhere, not just California. They are usually unenforceable here.




I read the complaint, and I believe he is alleged to have used company resources and started the new company on Apple’s dime? I may be misremembering. I think he was recruiting folks from Apple while working at Apple (perhaps through an alleged straw man, my former colleague Manu). 

The issue is not breach of any non-compete, though.


----------



## Nycturne

citypix said:


> How about...
> 
> 2023 is the year when computers using Qualcomm CPUs would be released to the public, presumably in volume. That leaves 2022 and a portion of 2021 for chip development, initial/trial fabrication (presumably from Intel), and characterization by Qualcomm, with samples delivered to its customers. And then, if everything went well, fabrication in production quantities for Qualcomm customers.
> 
> Is that amount of time unreasonable?  That was basically how our process played out, but we were a very tiny company.




Not unreasonable in terms of development, but may pose issues in keeping up with the competition. It also telegraphs your moves to those competitors which can now move to try to block you out before you even get started.

Here’s the thing, it seems a bit weird to me to use this acquisition to go after the PC space, while ignoring the mobile space where the best Android devices still can’t seemingly keep up with Apple in terms of performance. And to spend time comparing to Apple is weird since it’s really Intel/AMD that will be their competition. This move all up seems more to act as a lever to get their SoCs into the PC space where they are effectively a non-presence today (i.e. larger growth potential). Considering their tendency to ask for royalties on top of the SoC price, I’m not sure how well that will fly in the PC space, but certainly seems like a juicy target for Qualcomm. 

That said, I’ve never been super great at the whole “keep shareholders happy” aspect of business. Maybe I’m missing something.


----------



## Cmaier

citypix said:


> How about...
> 
> 2023 is the year when computers using Qualcomm CPUs would be released to the public, presumably in volume. That leaves 2022 and a portion of 2021 for chip development, initial/trial fabrication (presumably from Intel), and characterization by Qualcomm, with samples delivered to its customers. And then, if everything went well, fabrication in production quantities for Qualcomm customers.
> 
> Is that amount of time unreasonable?  That was basically how our process played out, but we were a very tiny company.




They seem to be claiming the chips were already “redesigned,” which can’t be right; otherwise they would be available much sooner. It used to take us 2+ years to design chips (from scratch, not spins) with only 100 million transistors at AMD. Presumably QC’s chip will have tens of billions of transistors. The good news is that Apple has shown you can do it in about a year and a half. (Presumably they leave some performance on the table doing it that way. I think I’ll start a thread about how CPU design works and the trade-offs between ASIC and custom methodology.)


----------



## Cmaier

Nycturne said:


> Not unreasonable in terms of development, but may pose issues in keeping up with the competition. It also telegraphs your moves to those competitors which can now move to try to block you out before you even get started.
> 
> Here’s the thing, it seems a bit weird to me to use this acquisition to go after the PC space, while ignoring the mobile space where the best Android devices still can’t seemingly keep up with Apple in terms of performance. And to spend time comparing to Apple is weird since it’s really Intel/AMD that will be their competition. This move all up seems more to act as a lever to get their SoCs into the PC space where they are effectively a non-presence today (i.e. larger growth potential). Considering their tendency to ask for royalties on top of the SoC price, I’m not sure how well that will fly in the PC space, but certainly seems like a juicy target for Qualcomm.
> 
> That said, I’ve never been super great at the whole “keep shareholders happy” aspect of business. Maybe I’m missing something.




The telegraphing is a very interesting point. If they had something, they’d just shut up and surprise everyone when it’s ready.  They are trying to freeze the market precisely because they don’t have confidence.


----------



## Yoused

If the Snapdragon 920 is ARMv9, they can put it out as AArch64 only, which would save them a big wad of cruft, I suspect. Supporting AArch32+Thumb has got to cost something.


----------



## Cmaier

Yoused said:


> If the Snapdragon 920 is ARMv9, they can put it out as AArch64 only, which would save them a big wad of cruft, I suspect. Supporting AArch32+Thumb has got to cost something.




I think with thumb it would be difficult for them to match the wide issue of M1, unless they limit it to some cores, or add a pipe stage for pre-decode. Hard to know for sure without sitting down to sketch out the decode logic. Not nearly as bad as x86, of course, because at least you’re talking about just a couple integer multiple widths. Probably some other implications throughout the pipeline as well.
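To make the decode problem concrete: in Thumb-2, whether a halfword begins a 32-bit instruction is signaled by its top five bits, so finding instruction boundaries is inherently serial, while fixed-width AArch64 lets a wide decoder attack every 4-byte slot in parallel. A toy sketch (illustrative Python, nothing like real decode hardware):

```python
# Thumb-2 rule: a halfword whose bits [15:11] are 0b11101, 0b11110,
# or 0b11111 begins a 32-bit encoding; anything else is a 16-bit one.
def thumb_boundaries(halfwords):
    """Walk a Thumb-2 stream and return the start index (in halfwords)
    of each instruction. Each decision depends on the previous one --
    this is the serial chain a wide decoder has to break somehow."""
    starts, i = [], 0
    while i < len(halfwords):
        starts.append(i)
        top5 = (halfwords[i] >> 11) & 0x1F
        i += 2 if top5 in (0b11101, 0b11110, 0b11111) else 1
    return starts

def aarch64_boundaries(num_bytes):
    """Fixed 4-byte instructions: every boundary is known up front,
    so N decoders can all start in parallel."""
    return list(range(0, num_bytes, 4))

# Mixed Thumb stream: 16-bit, 32-bit (two halfwords), 16-bit
stream = [0x2001, 0xF04F, 0x0000, 0xBD10]
print(thumb_boundaries(stream))   # -> [0, 1, 3]
print(aarch64_boundaries(16))     # -> [0, 4, 8, 12]
```

This is roughly why a pre-decode stage (or restricting Thumb to some cores) comes up: something has to resolve that serial length chain before a wide issue window can be fed.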


----------



## Yoused

I kind of wonder if Apple ever supported Thumb in the A-series CPUs. Seems like it would have been pointless.


----------



## Cmaier

Yoused said:


> I kind of wonder if Apple ever supported Thumb in the A-series CPUs. Seems like it would have been pointless.



Maybe A4 or earlier, since they were using Arm soft IP?


----------



## Yoused

A4 or earlier? A4 was the first one. Before that they were using bought SoCs. Wiki-thingy, FWIW, says that they supported A32/T32 from A6 (the first real in-house design) through A10, but T32 may be an unfounded assumption. Whoever wrote that probably has no solid proof that there really was T32 in there, unless it can be shown that the Xcode/Swift compiler will generate Thumb code.


----------



## Cmaier

Yoused said:


> A4 or earlier? A4 was the first one. Before that they were using bought SoCs. Wiki-thingy, FWIW, says that they supported A32/T32 from A6 (the first real in-house design) through A10, but T32 may be an unfounded assumption. Whoever wrote that probably has no solid proof that there really was T32 in there, unless it can be shown that the Xcode/Swift compiler will generate Thumb code.




Interesting. Pretty sure they never used it for anything, if it existed.


----------



## Nycturne

Cmaier said:


> Interesting. Pretty sure they never used it for anything, if it existed.




Oddly enough, Apple still has their documentation up on ARMv6 and ARMv7 discussing some details on how to avoid getting caught by a gotcha in thumb mode. But it mostly looks like it was meant for folks writing some assembly by hand. I don't remember Xcode ever letting you specify thumb mode all up, but I could be wrong.



Cmaier said:


> The telegraphing is a very interesting point. If they had something, they’d just shut up and surprise everyone when it’s ready.  They are trying to freeze the market precisely because they don’t have confidence.




If that's their play, I'm not sure it'll work to their advantage. Anyone _willing_ to step into the AMD/Intel fight now isn't going to be dissuaded by Qualcomm saying they'll have something in 24 months.


----------



## Cmaier

Nycturne said:


> Oddly enough, Apple still has their documentation up on ARMv6 and ARMv7 discussing some details on how to avoid getting caught by a gotcha in thumb mode. But it mostly looks like it was meant for folks writing some assembly by hand. I don't remember Xcode ever letting you specify thumb mode all up, but I could be wrong.
> 
> 
> 
> If that's their play, I'm not sure it'll work to their advantage. Anyone _willing_ to step into the AMD/Intel fight now isn't going to be dissuaded by Qualcomm saying they'll have something in 24 months.




Who knows.


----------



## thekev

Cmaier said:


> I read the complaint, and I believe he is alleged to have used company resources and started the new company on Apple’s dime? I may be misremembering. I think he was recruiting folks from Apple while working at Apple (perhaps through an alleged straw man, my former colleague Manu).
> 
> The issue is not breach of any non-compete, though.




So yeah, that stuff is actually an issue, albeit a bit Machiavellian, which actually brings me some amusement. I always figured The Prince was just some guy writing about his cat.


----------



## NT1440

What are the odds that Qualcomm’s chips will be “good enough” for the windows world? Any takes on whether it’s feasible these will be near/exceed M1 (which will be old hat by then) performance?

Basically, will this make Windows machines suck or will it just be a new flavor of suckage?


----------



## Cmaier

NT1440 said:


> What are the odds that Qualcomm’s chips will be “good enough” for the windows world? Any takes on whether it’s feasible these will be near/exceed M1 (which will be old hat by then) performance?
> 
> Basically, will this make Windows machines suck or will it just be a new flavor of suckage?




Hard to guess. I would imagine they would be “good enough” for at least some part of the Windows market.  The question is whether “good enough” is compelling enough to get anyone to switch from x86.  Qualcomm has provided little guidance about what they are trying to achieve.  But it seems to me that in order to sell Arm to windows customers you have to provide something pretty compelling, so they should be aiming to be competitive performance-wise, but at a much lower power expenditure.  

I suspect we will see M1-like performance, for what it’s worth, but in 2023 that may not be good enough.


----------



## Andropov

I don't think their current offering is competitive enough for the Windows market to massively switch to ARM. Unless I'm missing something, the Snapdragon 888 (released late 2020) scores just 966 points in Geekbench 5 single core and 3044 in multicore... slower than Apple's A12 or Intel's i3-8200U.

At least they use a lot less power. But it's going to be difficult to pull off a 70% increase in performance in 2-3 years (to match current M1 performance).
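For what it's worth, the ~70% figure roughly checks out if you take an M1 Geekbench 5 single-core score of about 1700 (a round number assumed here, not a figure from this thread):

```python
sd888_sc = 966     # Snapdragon 888 GB5 single-core, as quoted above
m1_sc = 1700       # approximate M1 GB5 single-core (assumed round figure)

gap = m1_sc / sd888_sc - 1
print(f"Required uplift: {gap:.0%}")  # -> Required uplift: 76%
```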


----------



## Cmaier

Andropov said:


> I don't think their current offering is competitive enough for the Windows market to massively switch to ARM. Unless I'm missing something, the Snapdragon 888 (released late 2020) scores just 966 points in Geekbench 5 single core and 3044 in multicore... slower than Apple's A12 or Intel's i3-8200U.
> 
> At least they use a lot less power. But it's going to be difficult to pull off a 70% increase in performance in 2-3 years (to match current M1 performance).



Well, I have to assume they aren’t basing this new thing in any way on their existing designs.  It’s probably going to have a microarchitecture similar to M1, with wide issue.  

Questionable what they do for GPU, of course. And almost certainly not a unified memory architecture.  

From their perspective, if you expect to just be supplying one SoC and there is going to be a separate GPU, RAM, etc., you can crank up the clock speeds well past what M1 does and trade off some power consumption to try and make up for any microarchitecture deficiencies.
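The cost of cranking clocks is steep, though, since dynamic power scales roughly as C·V²·f and higher frequency generally demands higher voltage. A back-of-envelope sketch (the +20% clock / +10% voltage figures are purely illustrative assumptions):

```python
def relative_dynamic_power(freq_scale, volt_scale):
    """Dynamic power ~ C * V^2 * f; return power relative to baseline."""
    return freq_scale * volt_scale ** 2

# Hypothetical: +20% clock, needing +10% more voltage to stay stable
p = relative_dynamic_power(1.20, 1.10)
print(f"{p:.2f}x the power for 1.20x the clock")  # -> 1.45x the power
```

So a modest clock bump can cost disproportionate power, which is exactly the lever a one-SoC vendor might accept that Apple, optimizing for battery life, would not.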


----------



## Yoused

Advanced Micro Devices has built ARM devices in the past, have they not? What is the possibility that they might try playing both sides of the fence?


----------



## Cmaier

Yoused said:


> Advanced Micro Devices has built ARM devices in the past, have they not? What is the possibility that they might try playing both sides of the fence?




I know we had an architecture license, but we never used it when I was there. I think maybe after I left there was something?


----------



## Andropov

Meanwhile, Qualcomm has another problem to worry about: MediaTek announces Dimensity 9000.


----------



## Cmaier

> **Qualcomm’s M1-class laptop chips will be ready for PCs in “late 2023”** (Ars Technica, via apple.news)
>
> Qualcomm's early-2021 Nuvia acquisition is taking time to bear fruit.




I’m not sure why matching M1 when M3 is out would be “impressive,” but whatever.

(“sometime in 2023” now appears to be “late” in 2023.)

Also, from the perspective of a chip designer, that sounds like they are still a year from tape out, which is wild.


----------



## Yoused

I paid a brief visit to GB. Their scoring methodology seems less than ideal. I see the Snapdragon 8 Gen 1 scoring as high as 1161 SC on Android (about the same as a Xeon W), which is way behind M1, and the MC scores are not worth mentioning. The Gen 2 scores under 800, but that is running Windows. It looks like a non-small part of M1's performance lead is the OS itself, along with the features Apple has embedded to optimize it for macOS, which is a bit of a problem for anyone trying to compete on silicon alone.

I imagine that Qualcomm is planning wide pipes with a massive ROB – basically just copying what has worked for Apple. Their claims as of now are based on chalkboard estimates. There is almost certainly some unpublished magic Apple is using that others will struggle to divine.


----------



## Colstan

Yoused said:


> It looks like a non-small part of M1's performance lead is the OS itself, along with the features Apple has embedded to optimize it for macOS, which is a bit of a problem for anyone trying to compete on silicon alone.



@Cmaier has mentioned how his team at AMD "worked closely with" Microsoft to implement x86-64 in Windows, but Apple's integration between macOS and the M-series must be on another level. That's probably a competitive advantage that is impossible to benchmark, not just in terms of raw performance, but the user experience. Since Qualcomm appears to be Microsoft's chosen partner for Windows-on-ARM, they are likely working closely with them on implementing Windows on these new Nuvia chips. However, they did the same with Intel for Alder Lake, and the Windows 11 scheduler has been shown to have little to no improvement when working with Intel's 12th-gen Core series.

Also, there has been a lot of speculation about whether these quick synthetic benchmarks are really taking advantage of everything Apple Silicon has to offer. It takes time to properly benchmark new hardware, particularly if it is outside the Windows hegemony. Craig Hunter has just released an informative review which shows the M1 Ultra putting the Intel Mac Pro to shame using a fluid dynamics benchmark. The Ultra's scaling is practically a straight line upward, while a 28-core Xeon quickly tapers off in efficiency. I particularly like how he describes the Mac Studio Ultra as an 8x8x4" supercomputer, and he points out that we still have the Apple Silicon Mac Pro on the horizon, so this is just the tip of the iceberg.
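The two scaling shapes described there (a near-straight line vs. a taper) are what Amdahl's law predicts for different serial fractions; memory-bandwidth limits produce a similar flattening. A generic illustration, not a model of Hunter's actual data:

```python
def speedup(n_cores, parallel_fraction):
    """Amdahl's law: speedup on n cores given the fraction of the
    workload that parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

for p in (0.99, 0.90):
    print(p, [round(speedup(n, p), 1) for n in (1, 4, 16, 20)])
# A 99%-parallel workload stays close to linear over 20 cores;
# a 90%-parallel one visibly flattens well before that.
```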


----------



## Cmaier

Colstan said:


> @Cmaier has mentioned how his team at AMD "worked closely with" Microsoft to implement x86-64 in Windows, but Apple's integration between macOS and the M-series must be on another level. That's probably a competitive advantage that is impossible to benchmark, not just in terms of raw performance, but the user experience. Since Qualcomm appears to be Microsoft's chosen partner for Windows-on-ARM, they are likely working closely with them on implementing Windows on these new Nuvia chips. However, they did the same with Intel for Alder Lake, and the Windows 11 scheduler has been shown to have little to no improvement when working with Intel's 12th-gen Core series.
> 
> Also, there has been a lot of speculation about whether these quick synthetic benchmarks are really taking advantage of everything Apple Silicon has to offer. It takes time to properly benchmark new hardware, particularly if it is outside the Windows hegemony. Craig Hunter has just released an informative review which shows the M1 Ultra putting the Intel Mac Pro to shame using a fluid dynamics benchmark. The Ultra's scaling is practically a straight line upward, while a 28-core Xeon quickly tapers off in efficiency. I particularly like how he describes the Mac Studio Ultra as an 8x8x4" supercomputer, and he points out that we still have the Apple Silicon Mac Pro on the horizon, so this is just the tip of the iceberg.



Hunter's graphs are shocking. As he notes, imagine what the Mac Pro chip will do.


----------



## SuperMatt

Colstan said:


> @Cmaier has mentioned how his team at AMD "worked closely with" Microsoft to implement x86-64 in Windows, but Apple's integration between macOS and the M-series must be on another level. That's probably a competitive advantage that is impossible to benchmark, not just in terms of raw performance, but the user experience. Since Qualcomm appears to be Microsoft's chosen partner for Windows-on-ARM, they are likely working closely with them on implementing Windows on these new Nuvia chips. However, they did the same with Intel for Alder Lake, and the Windows 11 scheduler has been shown to have little to no improvement when working with Intel's 12th-gen Core series.
> 
> Also, there has been a lot of speculation about whether these quick synthetic benchmarks are really taking advantage of everything Apple Silicon has to offer. It takes time to properly benchmark new hardware, particularly if it is outside the Windows hegemony. Craig Hunter has just released an informative review which shows the M1 Ultra putting the Intel Mac Pro to shame using a fluid dynamics benchmark. The Ultra's scaling is practically a straight line upward, while a 28-core Xeon quickly tapers off in efficiency. I particularly like how he describes the Mac Studio Ultra as an 8x8x4" supercomputer, and he points out that we still have the Apple Silicon Mac Pro on the horizon, so this is just the tip of the iceberg.






Wow. Intel hasn’t just been passed. They have been lapped.


----------



## Andropov

Colstan said:


> Also, there has been a lot of speculation about whether these quick synthetic benchmarks are really taking advantage of everything Apple Silicon has to offer. It takes time to properly benchmark new hardware, particularly if it is outside the Windows hegemony. Craig Hunter has just released an informative review which shows the M1 Ultra putting the Intel Mac Pro to shame using a fluid dynamics benchmark. The Ultra's scaling is practically a straight line upward, while a 28-core Xeon quickly tapers off in efficiency. I particularly like how he describes the Mac Studio Ultra as an 8x8x4" supercomputer, and he points out that we still have the Apple Silicon Mac Pro on the horizon, so this is just the tip of the iceberg.



Whoa! That was a nice read. Much more substantial improvement than most other benchmarks I've seen, IIRC. I wonder how many real life tasks would get the same benefits. Just memory bound ones?
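One way to frame the "just memory-bound ones?" question is the roofline model: attainable throughput is the lesser of peak compute and bandwidth times arithmetic intensity, so a huge memory system only helps kernels sitting on the bandwidth-limited side of the roofline. A sketch with round hypothetical numbers (not real M1 Ultra specs):

```python
def roofline_gflops(peak_gflops, bw_gb_s, flops_per_byte):
    """Roofline model: attainable GFLOP/s for a kernel with the given
    arithmetic intensity (FLOPs per byte of memory traffic)."""
    return min(peak_gflops, bw_gb_s * flops_per_byte)

# Hypothetical machine: 1000 GFLOP/s peak compute, 800 GB/s bandwidth
for intensity in (0.25, 1.0, 4.0):
    print(intensity, roofline_gflops(1000, 800, intensity))
# Low-intensity (memory-bound) kernels are capped by bandwidth and gain
# the most from a wide memory system; high-intensity ones saturate
# compute first and barely notice the extra bandwidth.
```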


----------



## Colstan

Cmaier said:


> Hunter's graphs are shocking. As he notes, imagine what the Mac Pro chip will do.



Please correct me if I am wrong, but I believe your latest prediction for the Apple Silicon Mac Pro is for an M2 "Extreme" with up to 1.0TB of unified memory? Basically, four M2 Max dies linked using next generation UltraFusion interconnects? While I'm no CPU architect, that sounds like a reasonable assumption and would make the Mac Pro a powerful, capable, scalable machine. If it follows the same trend as the M1 series, then it should easily be the most efficient workstation in existence, assuming it is released within the next year or so.

Now, that hasn't stopped some people from fantasy designing their own SoC. Evidently, everyone is a CPU architect these days. This isn't just from MR, but folks who should know better, such as over at the Ars Technica forums. One common solution that I have heard, for matching the 1.5TB maximum system memory of the current Intel version, is that Apple will start using external DIMMs, just for the Mac Pro, no other Macs. I've even heard some insist that Apple will resort to implementing HBM2 in such a solution. I've also seen many insisting that, since AMD will be announcing the RX 7000 series with RDNA3 later this year, the Mac Pro will feature the return of discrete graphics. (That would also be in direct opposition to another @Cmaier prediction that the M2 would perhaps feature ray tracing.) That theory is that Apple will use MPX modules to support upgradeability for such features. There's also a persistent rumor of one last Intel version with an Ice Lake Xeon, secretly wandering Cupertino's hallways like a forlorn x86 revenant.

Of course, at the top of the wish list is always the return of Boot Camp support for Windows, assuming Microsoft doesn't renew its ARM exclusivity contract with Qualcomm. This is despite Craig Federighi* specifically stating that Apple won't support direct booting of other operating systems, and that their solution is virtual machines. Sure, Apple has made a few unofficial accommodations for Asahi Linux during the boot process, but those were minor tweaks to make it easier for that project. Apple's Rosetta engineers have helped CodeWeavers support 32-bit programs with CrossOver on Apple Silicon Macs. However, these are implementations that don't involve a shift in strategy or substantial engineering resources. Every indication suggests that VMs and WINE are considered satisfactory solutions, from Apple's standpoint.

(*Craig said in an interview with Gruber that native Windows ain't happening. I timestamped the exact quote because I constantly hear about the inevitable return of Boot Camp. Even then, some folks refuse to believe, despite hearing it straight from Apple's senior vice president of software engineering, who is literally the decision maker for such things.)

None of that matches Apple's strategy thus far, in fact all public indications appear to be the opposite, but I suppose hope springs eternal. I think the Apple Silicon Mac Pro is the last hope for the return of these features, so a lot of people are projecting their personal desires onto it, which would then theoretically spread to the other models. The Mac Pro is the pinnacle of Apple's Mac line, so it is the ultimate symbol for a personal wish list. I'm not a CPU designer, but @Cmaier is, so I'm wondering if he sees any logical reason for Apple drastically altering its designs to accommodate any of these features, which appear entirely regressive, from my perspective? Perhaps there is something that I am missing in this debate, and the Apple Silicon Mac Pro will be more exotic than I am picturing?

I realize that, five years from now, people will still be asking for eGPU support, Boot Camp, easy internal upgrades, and a free pony, but it's best to dispel such notions whenever possible. What these people desire already exists. It's called a PC.


----------



## Cmaier

Colstan said:


> Please correct me if I am wrong, but I believe your latest prediction for the Apple Silicon Mac Pro is for an M2 "Extreme" with up to 1.0TB of unified memory? Basically, four M2 Max dies linked using next generation UltraFusion interconnects? While I'm no CPU architect, that sounds like a reasonable assumption and would make the Mac Pro a powerful, capable, scalable machine. If it follows the same trend as the M1 series, then it should easily be the most efficient workstation in existence, assuming it is released within the next year or so.
> 
> Now, that hasn't stopped some people from fantasy designing their own SoC. Evidently, everyone is a CPU architect these days. This isn't just from MR, but folks who should know better, such as over at the Ars Technica forums. One common solution that I have heard, for matching the 1.5TB maximum system memory of the current Intel version, is that Apple will start using external DIMMs, just for the Mac Pro, no other Macs. I've even heard some insist that Apple will resort to implementing HBM2 in such a solution. I've also seen many insisting that, since AMD will be announcing the RX 7000 series with RDNA3 later this year, the Mac Pro will feature the return of discrete graphics. (That would also be in direct opposition to another @Cmaier prediction that the M2 would perhaps feature ray tracing.) That theory is that Apple will use MPX modules to support upgradeability for such features. There's also a persistent rumor of one last Intel version with an Ice Lake Xeon, secretly wandering Cupertino's hallways like a forlorn x86 revenant.
> 
> Of course, at the top of the wish list is always the return of Boot Camp support for Windows, assuming Microsoft doesn't renew its ARM exclusivity contract with Qualcomm. This is despite Craig Federighi* specifically stating that Apple won't support direct booting of other operating systems, and that their solution is virtual machines. Sure, Apple has made a few unofficial accommodations for Asahi Linux during the boot process, but those were minor tweaks to make it easier for that project. Apple's Rosetta engineers have helped CodeWeavers support 32-bit programs with CrossOver on Apple Silicon Macs. However, these are implementations that don't involve a shift in strategy or substantial engineering resources. Every indication suggests that VMs and WINE are considered satisfactory solutions, from Apple's standpoint.
> 
> (*Craig said in an interview with Gruber that native Windows ain't happening. I timestamped the exact quote because I constantly hear about the inevitable return of Boot Camp. Even then, some folks refuse to believe, despite hearing it straight from Apple's senior vice president of software engineering, who is literally the decision maker for such things.)
> 
> None of that matches Apple's strategy thus far, in fact all public indications appear to be the opposite, but I suppose hope springs eternal. I think the Apple Silicon Mac Pro is the last hope for the return of these features, so a lot of people are projecting their personal desires onto it, which would then theoretically spread to the other models. The Mac Pro is the pinnacle of Apple's Mac line, so it is the ultimate symbol for a personal wish list. I'm not a CPU designer, but @Cmaier is, so I'm wondering if he sees any logical reason for Apple drastically altering its designs to accommodate any of these features, which appear entirely regressive, from my perspective? Perhaps there is something that I am missing in this debate, and the Apple Silicon Mac Pro will be more exotic than I am picturing?
> 
> I realize that, five years from now, people will still be asking for eGPU support, Boot Camp, easy internal upgrades, and a free pony, but it's best to dispel such notions whenever possible. What these people desire already exists. It's called a PC.



I agree with all of this. 

It’s possible that Apple allows slotted RAM and puts its own GPU on a separate die, sure. But if it does that it will still be a shared memory architecture. I would say there’s a 1 percent chance of slotted RAM. An independent GPU is more likely; the technical issues with that are not very big, but the economics don’t make much sense given Apple’s strategy of leveraging its silicon across all products. Still, I’d give that a 33 percent chance. And it wouldn’t be a plug-in card or anything - just a separate GPU die in the package using something like fusion interconnect. Maybe for the iMac Pro, Mac Studio, and Mac Pro.


----------



## Colstan

Cmaier said:


> An independent GPU is more likely; the technical issues with that are not very big, but the economics don’t make much sense given Apple’s strategy of leveraging its silicon across all products. Still, I’d give that a 33 percent chance.



Thanks for the answer. In terms of a GPU, back in 2020 there was a report from the China Times about Apple making a GPU codenamed "Lifuka". We really haven't heard anything about it since then. Whether that was referring to the internal GPU, or a discrete design, still isn't clear. If Apple does implement an independent GPU, then would it be more likely for them to use their own design over a third-party, such as AMD? Assuming it was true to begin with, because the rumor claimed it was being designed for an iMac.


----------



## Cmaier

Colstan said:


> Thanks for the answer. In terms of a GPU, back in 2020 there was a report from the China Times about Apple making a GPU codenamed "Lifuka". We really haven't heard anything about it since then. Whether that was referring to the internal GPU, or a discrete design, still isn't clear. If Apple does implement an independent GPU, then would it be more likely for them to use their own design over a third-party, such as AMD? Assuming it was true to begin with, because the rumor claimed it was being designed for an iMac.



Yeah, definitely their own design. I’m quite convinced they like their architecture, and that they have been working on ray tracing. Given how parallelizable GPU stuff is, it’s quite possible that they simply put together a die that is just made up of a ton of the same GPU cores they have on their SoCs. You could imagine that, for modular high-end machines, instead of partitioning die like: [CPU cores+GPU cores][CPU cores+GPU cores]… it may make more economic sense to do [CPU cores][CPU cores]…[GPU cores][GPU cores]…. (Or, even, [CPU cores+GPU cores][CPU cores+GPU cores]…[GPU cores]….)


It may also make more engineering sense, in terms of latencies, power supply, and cooling, too.   Of course, Apple wouldn’t do that if it was only for Mac Pro (probably) because the economies of scale wouldn’t work (plus, now, supply chains are fragile).  They might do it if it made sense to use this type of partitioning for iMacs, iMac Pros, Studios,  Mac Pros, and maybe high end MacBook Pros, while using the current partitioning for iPads, iPhone Pros (maybe), Mac Minis, MacBook Pros, MacBooks, and maybe low end iMacs. 

Not saying they will, but at least I give it a chance. More of a chance than RAM slots or third-party GPUs.


----------



## Renzatic

Andropov said:


> Meanwhile, Qualcomm has another problem to worry about: MediaTek announces Dimensity 9000.




That's a Reverend Horton Heat song, isn't it?


----------



## thekev

Renzatic said:


> That's a Reverend Horton Heat song, isn't it?



Horton?


----------



## Andropov

Cmaier said:


> Yeah, definitely their own design. I’m quite convinced they like their architecture, and that they have been working on ray tracing.



Since Apple is designing their GPU/CPU cores to be 'mobile first' I wonder how raytracing is going to fit into that perspective. Sure there's a lot you can do with 16 cores' worth of raytracing resources (like, say, a 'Max' level GPU), but would a 4-core GPU (like in an iPhone configuration) be enough to do anything useful with raytracing? I'm under the (possibly wrong) impression that unless you have a 'critical mass' of raytracing performance available, you may as well have zero raytracing capabilities. Especially on mobile, where using a marginal raytracing capability to render some realistic reflections here and there is not going to make much of a difference.

A critical point (for what I use Metal for) would be having enough raytracing power to dump all the shadow mapping and ambient occlusion kernels and just use raytraced lighting instead. I understand it's easier to start small, but I wonder if going any less than all-in on raytracing would just take valuable die area from the GPU cores that could be more effectively dedicated to improving traditional shading capabilities. But I don't know how many additional transistors would the hardware-accelerated raytracing stuff require, maybe starting small doesn't take that much die area.

I guess Apple could also have different GPU cores for M-series and A-series, but that'd go against what they're currently doing.


----------



## Cmaier

Andropov said:


> Since Apple is designing their GPU/CPU cores to be 'mobile first' I wonder how's raytracing going to fit into that perspective. Sure there's a lot you can do with 16-cores worth of raytracing resources (like, say, a 'Max' level GPU), but would a 4-core GPU (like in an iPhone configuration) be enough to do anything useful with raytracing? I'm under the (possibly wrong) impression that unless you have a 'critical mass' of raytracing performance available, you may as well have zero raytracing capabilities. Especially on mobile, where using a marginal raytracing capability to render some realistic reflections here and there is not going to make much of a difference.
> 
> A critical point (for what I use Metal for) would be having enough raytracing power to dump all the shadow mapping and ambient occlusion kernels and just use raytraced lighting instead. I understand it's easier to start small, but I wonder if going any less than all-in on raytracing would just take valuable die area from the GPU cores that could be more effectively dedicated to improving traditional shading capabilities. But I don't know how many additional transistors would the hardware-accelerated raytracing stuff require, maybe starting small doesn't take that much die area.
> 
> I guess Apple could also have different GPU cores for M-series and A-series, but that'd go against what they're currently doing.




 Yep. There are a lot of reasons that it makes sense to have some sort of split between “low end” and “high end.”  Where you draw that split is a decision that needs to take into account both economics and physical practicalities.  I could imagine a world where anything below a MBP doesn’t get ray tracing and anything above does. But I imagine Apple will put it in iPad.  I also imagine Apple is working on making it work even in its VR/AR goggles, so they probably have found a way to get it done without requiring too much in the way of silicon resources. 
But does an iPhone need ray tracing? Probably not any time soon. Would they love to include it and harp on how revolutionary it is? Yep.  Can you do it in a die that meets the power and thermal requirements of an iPhone? I would wager you can.

To me, really, the wildcard is their VR goggle architecture.  If they think you need it for that, and if they do the rendering on the device itself or on a coupled iPhone, then that will drive what they choose to do.


----------



## Colstan

One more question @Cmaier, if I may. We've seen Intel follow Apple's lead with Alder Lake implementing big.LITTLE aka heterogeneous computing. The latest rumors claim that AMD is going the same route with Zen 5. What do you think the chances are of Apple taking a page from the x86 guys and implementing SMT? Does it make sense for their design, and if so, do you think we'd see it in both the performance and efficiency cores?


----------



## Cmaier

Colstan said:


> One more question @Cmaier, if I may. We've seen Intel follow Apple's lead with Alder Lake implementing big.LITTLE aka heterogeneous computing. The latest rumors claim that AMD is going the same route with Zen 5. What do you think the chances are of Apple taking a page from the x86 guys and implementing SMT? Does it make sense for their design, and if so, do you think we'd see it in both the performance and efficiency cores?




I don’t think it makes a lot of sense for Apple, given that Apple seems to have no trouble keeping every ALU busy at all times as it is.  I’ve always felt that SMT was a crutch for designs where you can’t keep the ALUs busy because you have too many interruptions of the pipeline.  x86 benefits from SMT because instruction decoding is so hard that you end up creating bubbles in the front end of the instruction stream.  You fetch X bytes from the instruction cache and you never know how many instructions that corresponds to.

SMT on Arm, or at least on Apple’s processors so far, would just mean you are stopping one thread that was perfectly capable of continuing to run, in order to substitute in another.  And paying the overhead for that swap.  I think it would be a net negative.

That said, one could imagine doing SMT on the efficiency cores if the calculations show that you save power by reducing the complexity of the decode/dispatch hardware (thus creating bubbles) but can get back some performance without using up that power savings by doing SMT.  However, SMT also has other issues that need to be considered, including the likelihood that any SMT implementation will be susceptible to side channel attacks (and that mitigating against such attacks may require taking steps that mean the benefit is even less).


----------



## Yoused

AIUI, Intel's motivation for going with HT was largely based on their object code design. In order to ramp up the clock, they needed a long, skinny pipe. Of course, bubbles and stalls are a serious problem for such a pipe. Hence, by feeding two streams into the pipe side-by-side, one stream could grab the gaps in the other stream and make use of that lost time.

ARMv8/9 uses out-of-order flow, which means we do what we can and stuff that takes longer, we get around to that or do other stuff while we are waiting for that to finish. OoOE is simply the main alternative to SMT, and it generally gets the job done more efficiently. Of course, having a code structure that is conducive to a very wide pipe makes OoOE easier to implement. Going out-of-order with x86 would be a major and costly effort.

The biggest problem with relying on SMT to fill your gaps is that it does not seem to be a performance enhancer. Heavy load work tends to have less of the bubble-generating type code, so you end up with some extra logic sucking some extra juice in order to feed two streams into one pipe and keep the output coherent. The heavy jobs seem to have a net performance gain of zero at best, often negative (at least as compared to having 2 discrete cores).

It might make sense to use it on the E-cores, where the loads are more likely to favor it, but E-cores are already tiny, and P-cores are probably not going to show gains from it, so why bother? The thing I could see them doing is a sort of _Asymmetric_ Multi-Threading design.

Apple put in GCD a decade ago, which was designed to allow programs to make use of an arbitrary number of cores in the most agnostic way possible. Workloads can be parcelled into a queue, each job or job fragment dispatched to an available core at an appropriate time. Hence, they could, in theory, design a P-core with a second context frame that would allow an incoming job to flow seamlessly into the tail of the ending job that was running on the core, relying on release/acquire memory semantics (which would have to be in the code) to eliminate even the need for hard memory barriers.

GCD is a brilliant OS feature that should be used to the fullest in apps that do a lot of heavy work. On the other hand, the heaviest work is handled outside the CPU cores anyway, so gains in CPU performance are becoming less dramatic in the real world than gains in GPU and ANE performance.
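The dispatch model described above is easy to sketch. Here's a minimal, purely illustrative example (the numbers and chunking are hypothetical) of GCD fanning a workload out across whatever cores are available and blocking until every chunk finishes:

```swift
import Dispatch
import Foundation

// Split a workload into independent chunks; GCD runs the chunks in
// parallel across available cores and returns when all have finished.
let chunkCount = 4
let chunkSize = 1000
let lock = NSLock()
var total = 0

DispatchQueue.concurrentPerform(iterations: chunkCount) { i in
    // Each iteration sums its own slice of 0..<4000 in parallel.
    let partial = (i * chunkSize ..< (i + 1) * chunkSize).reduce(0, +)
    lock.lock()
    total += partial // serialize only the tiny combine step
    lock.unlock()
}

print(total) // 7998000, i.e. the sum of 0..<4000
```

`concurrentPerform` is the simplest "agnostic" primitive here: the code never says how many cores exist, it just describes independent work.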


----------



## Nycturne

Yoused said:


> OoOE is simply the main alternative to SMT, and it generally gets the job done more efficiently. Of course, having a code structure that is conducive to a very wide pipe makes OoOE easier to implement. Going out-of-order with x86 would be a major and costly effort.




I could have sworn modern x86 was already OoO and has been for some time, dating back to the P6 architecture in the 90s. SMT can still complement OoOE, especially in designs when the ALUs aren’t being fully utilized for one reason or another.

Is my memory tricking me here? Because I see multiple CPU designs that use SMT and OoOE, IBM’s POWER8 having 8 thread SMT as an example. 



Yoused said:


> It might make sense to use it on the E-cores, where the loads are more likely to favor it, but E-cores are already tiny, and P-cores are probably not going to show gains from it, so why bother? The thing I could see them doing is a sort of _Asymmetric_ Multi-Threading design.



What’s your thinking on what AMT is in this case?



Yoused said:


> Apple put in GCD a decade ago, which was designed to allow programs to make use of an abitrary number of cores in the most agnostic way possible. Workloads can be parcelled into a queue, each job or job fragment dispatched to an available core at an appropriate time. Hence, they could, in theory, design a P-core with a second context frame that would allow an incoming job to flow seamlessly into the tail of the ending job that was running on the core, relying on release/acquire memory semantics (which would have to be in the code) to eliminate even the need for hard memory barriers.
> 
> GCD is a brilliant OS feature that should be used to the fullest in apps that do a lot of heavy work. On the other hand, the heaviest work is handled outside the CPU cores anyway, so gains in CPU performance are becoming less dramatic in the real world than gains in GPU and ANE performance.



On this note, I think GCD is about to be handed its hat. Swift concurrency, while a different conceptual model, could very well be considered the spiritual successor to GCD. Improvements such as addressing the problem of thread explosions, less thread blocking and the use of actors to control simultaneous data access. And with Swift 5.6 they are making clear steps towards being able to catch concurrency issues at compile time, such as catching sending of data between tasks in an unsafe way. It’s different, but it does close some of the gaps left by GCD, and bakes it right into the language and runtime.

The biggest hurdle for me getting used to it is moving out of the mindset of using a serial queue for data access (something you still do for CoreData, unfortunately), and getting used to actors.


----------



## Andropov

Nycturne said:


> I could have sworn modern x86 was already OoO and has been for some time, dating back to the P6 architecture in the 90s. SMT can still complement OoOE, especially in designs when the ALUs aren’t being fully utilized for one reason or another.



Yup, that was my understanding as well. I remember reading somewhere that OoOE is still more difficult on x86 though, since when x86 instructions are 'broken down' to simpler instructions those resulting instructions are more likely to be tightly coupled to each other (compared to, say, ARM assembly) and thus requiring stalls of the pipeline to prevent race conditions.



Nycturne said:


> On this note, I think GCD is about to be handed its hat. Swift concurrency, while a different conceptual model, could very well be considered the spiritual successor to GCD. Improvements such as addressing the problem of thread explosions, less thread blocking and the use of actors to control simultaneous data access. And with Swift 5.6 they are making clear steps towards being able to catch concurrency issues at compile time, such as catching sending of data between tasks in an unsafe way. It’s different, but it does close some of the gaps left by GCD, and bakes it right into the language and runtime.
> 
> The biggest hurdle for me getting used to it is moving out of the mindset of using a serial queue for data access (something you still do for CoreData, unfortunately), and getting used to actors.



I still don't know how to do something akin to a serial queue in Swift concurrency when I want something to be FIFO. AFAIK Swift actors prevent data corruption (reading while writing, for example), but don't guarantee FIFO, so you can still have other kinds of race conditions that are easily solved with a plain ol' serial queue (which guarantees FIFO).
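For reference, the FIFO guarantee being described looks like this with a serial `DispatchQueue` (the queue label and the `Log` helper are illustrative):

```swift
import Dispatch

// A serial queue (the GCD default) executes submitted blocks one at a
// time, strictly in submission order — the FIFO guarantee actors lack.
final class Log: @unchecked Sendable {
    var entries: [Int] = [] // only ever touched from the serial queue
}

let serialQueue = DispatchQueue(label: "com.example.fifo") // label is illustrative
let log = Log()

for i in 0..<5 {
    serialQueue.async {
        log.entries.append(i) // blocks never overlap; order is preserved
    }
}

serialQueue.sync { } // drain: waits for everything queued before it
print(log.entries) // [0, 1, 2, 3, 4]
```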


----------



## Cmaier

Andropov said:


> Yup, that was my understanding as well. I remember reading somewhere that OoOE is still more difficult on x86 though, since when x86 instructions are 'broken down' to simpler instructions those resulting instructions are more likely to be tightly coupled to each other (compared to, say, ARM assembly) and thus requiring stalls of the pipeline to prevent race conditions.
> 
> 
> I still don't know how to do something akin to a serial queue in Swift concurrency when I want something to be FIFO. AFAIK Swift actors prevent data corruption (reading while writing, for example), but don't guarantee FIFO, so you can still have other kinds of race conditions that are easily solved with a plain ol' serial queue (which guarantees FIFO).



Yes, I designed the reorder hardware for x86 chips, so they definitely do out-of-order execution. And it is a pain. The issue is that if you have a context switch or interruption you don’t want to have executed a partial architectural instruction (i.e., you completed three out of six micro-ops corresponding to the architectural instruction). You just need to keep track of stuff in a more complicated way.

I also designed equivalent hardware on Sparc and it was a lot easier for risc.  

SMT is a layer on top of that.


----------



## Colstan

Since this seems to be "ask @Cmaier questions" day, I've got another bugbear that I've been curious about. I've been asking Dr. Howard Oakley about where he expects Apple to take security in future versions of macOS, either on the hardware or software side. While obviously he doesn't have a crystal ball, he's written extensively about the kextpocalypse and further clamping down in that area. So, since we are about five weeks away from WWDC, I'm wondering if @Cmaier or anyone else here has any thoughts about how Apple is going to evolve security with the Mac and macOS?


----------



## Cmaier

Colstan said:


> Since this seems to be "ask @Cmaier questions" day, I've got another bugbear that I've been curious about. I've been asking Dr. Howard Oakley about where he expects Apple to take security in future versions of macOS, either on the hardware or software side. While obviously he doesn't have a crystal ball, he's written extensively about the kextpocalypse and further clamping down in that area. So, since we are about five weeks away from WWDC, I'm wondering if @Cmaier or anyone else here has any thoughts about how Apple is going to evolve security with the Mac and macOS?




This is outside my area of expertise. I design hardware and write software, but I don’t write OS’s  

I do expect them to continue to try and move things into the user layer and provide safe interfaces to protected layers, but I think they recognize that mac occupies a different role than iOS, and that macos will never be locked down to that degree.  

That said, governments and courts around the world are doing things that may break the ios security model.  If Apple is forced to make ios “locked down by default, but the user has some way to turn that off,” then it may be that macos and ios *do* end up with very similar security models.


----------



## Andropov

Cmaier said:


> Yep. There are a lot of reasons that it makes sense to have some sort of split between “low end” and “high end.”  Where you draw that split is a decision that needs to take into account both economics and physical practicalities.  I could imagine a world where anything below a MBP doesn’t get ray tracing and anything above does. But I imagine Apple will put it in iPad.  I also imagine Apple is working on making it work even in its VR/AR goggles, so they probably have found a way to get it done without requiring too much in the way of silicon resources.
> But does an IPhone need ray tracing? Probably not any time soon. Would they love to include it an harp on how revolutionary it is? Yep.  Can you do it in a die that meets the power and thermal requirements of an iPhone? I would wager you can.
> 
> To me, really, the wildcard is their VR goggle architecture.  If they think you need it for that, and if they do the rendering on the device itself or on a coupled iPhone, then that will drive what they choose to do.



Software development costs for 3rd parties likely also play a big role here. I can't imagine many developers maintaining separate rendering algorithms for raytracing and non-raytracing devices, if there's enough power there to switch a substantial part of the rendering to be ray-based.

I think the move to raytracing renderers is going to be a transition on its own (for software that uses that), and most companies will probably go the least common denominator route. The sooner all supported devices have meaningful raytracing support, the sooner a lot of old and complex code can be phased out completely. Many things are easier or even ‘free’ using rays, where the traditional shading approach is extremely hard to get right. I’m thinking dynamic shadows, ambient occlusion, soft shadows… But as it is now, developers would need to write the shaders for non-raytracing-capable devices anyway, fine-tuning it so it looks right and artifacts are minimised, and then write a totally different shader for raytracing-capable devices. Devices which could run the traditional shader just as well. I can’t see many product managers choosing that path, even if the raytracing shaders are easier to implement.

For other parts of the rendering pipeline like realistic reflections on reflective/mirror-like surfaces, sure, raytracing will probably come sooner in any case. You can just have those surfaces not reflect light realistically on non-raytracing devices, and write a single shader pass (that is simply not executed on older devices). It’s easier to justify devoting developer resources to an extra render pass for devices that support it than a rewrite of a render pass that already works to get an improvement on some devices. But IMHO those things are generally less impactful on the perceived realism of the image.

So even if iPhone does not need raytracing *now*, it’d be nice if 5 years from now all supported iOS and macOS devices had raytracing. It’d make supporting older devices less of a pain in the future.


----------



## Citysnaps

Andropov said:


> So even if iPhone does not need raytracing *now*, i




I could see that being interesting now, especially if Apple's upcoming AR/VR device (I'm betting/hoping it'll be glasses) offloads all of the processing to a user's iPhone.

Since the majority of users will already have an iPhone, best to take advantage of the A-series CPU processing and decent battery capacity in the phone.  And keep AR glasses svelte, with a much smaller battery, and just enough silicon to handle a couple of bidirectional video streams - possibly through UWB, which Apple is already familiar with. And an iPhone already has internet connectivity, necessary for accessing information/data for AR uses.

As an aside, I'm much more excited about AR potential than VR. But that's just me.


----------



## Colstan

Cmaier said:


> This is outside my area of expertise. I design hardware and write software, but I don’t write OS’s



I know. I just got the inspiration to ask because of the SMT side-channel attacks that you briefly mentioned.


Cmaier said:


> That said, governments and courts around the world are doing things that may break the ios security model.  If Apple is forced to make ios “locked down by default, but the user has some way to turn that off,” then it may be that macos and ios *do* end up with very similar security models.



At this point, I wouldn't be surprised if the never ending Dutch dating app investigation is going to be Apple's Archduke Franz Ferdinand moment.


----------



## Nycturne

Andropov said:


> I still don't know how to do something akin to a serial queue in Swift concurrency when I want something to be FIFO. AFAIK Swift actors prevent data corruption (reading while writing, for example), but don't guarantee FIFO, so you can still have other kinds of race conditions that are easily solved with a plain ol' serial queue (which guarantees FIFO).




This is true. The runtime doesn't give you a FIFO queue, but I bet you could build one. But I haven't had time to think much about a good signaling mechanism to wake up just the suspended task that's now ready to execute without building _that_ on top of GCD in some way and losing some of the benefits of actors.

But generally my approach to serial queues has been to not _depend_ on the FIFO nature, so actors are generally a pretty good replacement for the majority of my needs. 
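One possible sketch of such a queue (all names hypothetical): jobs are yielded into an `AsyncStream` and consumed by a single long-lived task, so they run one at a time in submission order, and the stream itself is the wake-up mechanism — no GCD underneath.

```swift
// FIFO job queue built on AsyncStream: a single worker task pulls jobs
// off the stream in yield order, so execution is serial and FIFO.
final class FIFOQueue: @unchecked Sendable {
    private let continuation: AsyncStream<@Sendable () async -> Void>.Continuation
    private let worker: Task<Void, Never>

    init() {
        var cont: AsyncStream<@Sendable () async -> Void>.Continuation!
        let stream = AsyncStream<@Sendable () async -> Void> { cont = $0 }
        continuation = cont
        worker = Task {
            // for-await delivers jobs in the order they were yielded: FIFO.
            for await job in stream { await job() }
        }
    }

    func enqueue(_ job: @escaping @Sendable () async -> Void) {
        continuation.yield(job)
    }

    deinit { continuation.finish() }
}
```

The trade-off versus an actor: you get ordering, but you lose the compiler-checked isolation of actor state, so the jobs themselves still need to be careful about shared data.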



Cmaier said:


> I do expect them to continue to try and move things into the user layer and provide safe interfaces to protected layers, but I think they recognize that mac occupies a different role than iOS, and that macos will never be locked down to that degree.




And it's not like Apple's the only one doing this. Linux, Windows and macOS are all going in this direction. Apple's just being aggressive about it to get 3rd parties out of the kernel.


----------



## Andropov

Colstan said:


> At this point, I wouldn't be surprised if the never ending Dutch dating app investigation is going to be Apple's Archduke Franz Ferdinand moment.



It's a bit sad, for all the good things the EU has brought us, how in matters of technology they sometimes seem to make laws based on wrong assumptions about how the underlying technology works and what consequences the proposed laws will have. Some laws have had bizarre consequences. A very recent example: a few months ago the requirements on banking authentication got stricter. So now, when I send my $250 rent payment from the mobile app at the start of the month I need to:
 - Unlock my iPhone [FaceID or device password].
 - Sign in with my government ID + _personal mobile app password _or _balance visualization password._
 - Order the money transfer.
 - Input my _digital signature password._
 - Open a link received via SMS, which points to a website where I must input my _balance visualization password_ before a 5-minute timer expires.
 - Return to the app to see if the transaction succeeded.
This absurd process, which involves four distinct passwords and SMS 2FA, is now required for every banking operation. Even consulting transactions older than 6 months requires going through all of that. One could argue that making money transfers safer is worth the inconvenience. But at the same time I can go to Amazon and 1-click buy a $3000 TV with my already signed-in account. And requiring so many different and unique passwords is asking for people to either reuse passwords or jot them down on a post-it in front of the computer. Not to mention how the whole system of receiving a random link from a random phone number where you must input one of your passwords is essentially training people to fall for phishing scams. And this is just one example of the many, many tech-related things the EU has screwed up with misguided laws.


----------



## Cmaier

Colstan said:


> I know. I just got the inspiration to ask because of the SMT side-channel attacks that you briefly mentioned.




I got involved in side channel stuff after retiring from CPU design, so it’s a pet issue of mine.  Work I did was cited against some patent applications involved in a lawsuit. (E.g. page 7 of this thing: https://patentimages.storage.googleapis.com/3b/17/c1/e74b53fb110c5c/US9419790.pdf). 

One of the guys involved in SMT attacks earlier filed suit on ways to mitigate against power rail side channel attacks, and had a patent on a type of circuit I had published about years earlier.  Anyway…


----------



## Cmaier

Speaking of which:









						Apple Silicon chip vulnerability ‘Augury’ surfaces, but researchers aren’t worried yet
					

Researchers have discovered a new flaw in M1 and A14 chips. The Augury Apple Silicon vulnerability has the ability to leak data at rest.




					9to5mac.com


----------



## Andropov

Nycturne said:


> This is true. The runtime doesn't give you a FIFO queue, but I bet you could build one. But I haven't had time to think much on a good signaling mechanism to wake up just the suspended task that's now ready to execute without building _that_ on top of GCD in some way and losing some of the benefits of actors.
> 
> But generally my approach to serial queues has been to not _depend_ on the FIFO nature, so actors are generally a pretty good replacement for the majority of my needs.



When I last encountered this problem I ended up throwing an NSLock in to make it serial without GCD. Obviously, throwing a lock in defeated the purpose of using actors in the first place.

My problem was that I needed to check whether a parameter had changed after a long-running operation had finished, and store the result if it hadn't or discard it if it had. But while the actor prevented me from reading the parameter itself while another thread was writing it, nothing prevented another thread from changing the parameter *between* the check and the result store, which would result in the actor having a wrong value for the parameter (only the read/write to actor properties is serialized, but many instances of an actor method can be running simultaneously from different threads at the same time).

A serial queue would have trivially solved this, since the store would be guaranteed to run after the check, with no other thread able to write to it (just like the lock did). But I don't know what the swifty way would be when using actors.


----------



## Colstan

Andropov said:


> It's a bit sad, for all the good thing the EU has brought us, how in matters of technology they sometimes seem to be making laws based in wrong assumptions of how the underlying technology works and what consequences the proposed laws will have.



We've all seen that the EU has voted to make USB-C a continental standard. This is a terrible idea, because they already tried that with micro-USB, but it failed to gain traction. Imagine Apple goes wireless with the iPhone, but users in the EU still have a useless USB-C port on their smartphones because some genius government bureaucrat decided that USB-C shall be the eternal connection standard. Microsoft has Windows N just for the European market, the one with less functionality. Now there's a good chance that smartphones will have special European editions.


Cmaier said:


> Speaking of which:
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Apple Silicon chip vulnerability ‘Augury’ surfaces, but researchers aren’t worried yet
> 
> 
> Researchers have discovered a new flaw in M1 and A14 chips. The Augury Apple Silicon vulnerability has the ability to leak data at rest.
> 
> 
> 
> 
> 9to5mac.com



As one researcher said, it's the "weakest DMP an attacker can get". Hopefully, it stays that way, and Apple finds a way to mitigate it, anyway.

My favorite useless security research is using hardware LEDs to steal sensitive information, because that's definitely the most efficient way to hack a system.


----------



## Andropov

Cmaier said:


> Speaking of which:
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Apple Silicon chip vulnerability ‘Augury’ surfaces, but researchers aren’t worried yet
> 
> 
> Researchers have discovered a new flaw in M1 and A14 chips. The Augury Apple Silicon vulnerability has the ability to leak data at rest.
> 
> 
> 
> 
> 9to5mac.com



Ooh. That's interesting. Here's the paper, too: https://www.prefetchers.info/augury.pdf

I'll give it a read tonight. Skimming through it, this stood out:


> _C. What function of memory values is transmitted?
> The M1 AoP DMP makes prefetches based on memory content as if it were pointer values. This, naively, places a major restriction on the function of values transmitted. Only the top 57 bits of the address/value (i.e., L2 cacheline granularity) is transmitted, and only if they are a valid virtual address. As pointers must be placed at 8-byte alignments (Section VI-A), we cannot read partial values. (Section VI-A4)._



That seems to imply that it'd be impossible to retrieve the bottom 8 bits of the target value, right? I imagine that makes the retrieved value much less useful, as you'd have 256 possibilities for every leaked byte you try to reconstruct.


----------



## Cmaier

Andropov said:


> Ooh. That's interesting. Here's the paper, too: https://www.prefetchers.info/augury.pdf
> 
> I'll give it a read tonight. Skimming through it, this stood out:
> 
> That seems to imply that it'd be impossible to retrieve the bottom 8 bits of the target value, right? I imagine that makes the retrieved value much less useful, as you'd have 256 possibilities for every leaked byte you try to reconstruct.




That’s the way I read it, but I haven’t read the full paper yet.


----------



## Nycturne

Andropov said:


> (only the read/write to actor properties is serialized, but many instances of an actor method can be running simultaneously from different threads at the same time).




I don't believe this is entirely accurate (but also not entirely wrong). Functions that access state also pick up on the isolation and become isolated themselves. However, they are also _reentrant_, meaning that if an isolated function suspends during execution, then yes, the actor's state can be mutated underneath it during that suspension point by another access to the actor's isolated context.



Cmaier said:


> That’s the way i read that, but I haven’t read the full paper yet.




One statement by a researcher seemed to suggest that this could potentially still help break ASLR, but it didn't seem certain.


----------



## Colstan

Hey @Cmaier, didn't you once say that a semiconductor startup that includes ex-Apple employees was stalking you on LinkedIn? Looks like Apple isn't happy about it, and is taking them to court over it. (Poaching employees and stealing trade secrets, not stalking you online, although that would be awesome to see in court.)


----------



## Cmaier

Colstan said:


> Hey @Cmaier, didn't you once say that a semiconductor startup that includes ex-Apple employees was stalking you on LinkedIn? Looks like Apple isn't happy about it, and is taking them to court over it. (Poaching employees and stealing trade secrets, not stalking you online, although that would be awesome to see in court.)




LOL. Saw that. It was nuvia employees who kept looking me up on linkedin, though.


----------



## Andropov

Nycturne said:


> I don't believe this is entirely accurate (but also not entirely wrong). Functions that access state also pick up on the isolation and become isolated themselves. However, they are also _reentrant_, meaning that if an isolated function suspends during execution, then yes, the actor's state can be mutated underneath it during that suspension point by another access to the actor's isolated context.



Oh, is that so? I wasn't sure if that was the case, that's why I threw in the NSLock. If an actor only executes one 'block of code' at a time (between suspension points, I mean) then my lock was unnecessary (there was no suspension point between the check and the change in value). Hmm. Makes more sense if it works that way.

Although if that's the case, then actors are not as good at preventing data contention as some articles made me think they were. If an actor method has a long running code block with no suspension points in-between, it'd block any changes to the state from other threads for a while. It's true that those other threads would be suspended themselves, since actor method calls require async, but I see a potential for trouble here that I have not seen mentioned anywhere. Interesting.


----------



## Nycturne

Andropov said:


> Oh, is that so? I wasn't sure if that was the case, that's why I threw in the NSLock. If an actor only executes one 'block of code' at a time (between suspension points, I mean) then my lock was unnecessary (there was no suspension point between the check and the change in value). Hmm. Makes more sense if it works that way.



Correct, and this behavior is outlined in the Swift evolution proposal. There's a second proposal that also outlines some ways to fine tune the isolation behaviors, such as telling the compiler when it's safe to remove isolation on a computed property that doesn't depend on any isolated state, or to pull another actor into a function's isolation context to reduce suspension points within an isolated function that's orchestrating data between two actors. Both of these are part of Swift 5.5. 



Andropov said:


> Although if that's the case, then actors are not as good at preventing data contention as some articles made me think they were. If an actor method has a long running code block with no suspension points in-between, it'd block any changes to the state from other threads for a while. It's true that those other threads would be suspended themselves, since actor method calls require async, but I see a potential for trouble here that I have not seen mentioned anywhere. Interesting.



This doesn't seem any worse than async calls into a serial queue in terms of contention. You execute a long-running block on a serial queue, that queue will back up. You execute a long-running function on an actor, waiting tasks will back up. Note that I say _waiting tasks_, not threads. The difference is important in that waiting tasks do not need to block a thread to wait, and those threads are free to execute other tasks (i.e. no thread explosion).

Maybe I'm missing something?


----------



## Andropov

Nycturne said:


> Maybe I'm missing something?



Nope, you're right. Even that worst-case-scenario (long running actor method with no suspension points) would be no worse than a serial dispatch queue. In fact, it'd be potentially much better, since it's mandatory to be async so callers are not blocked. And realistically in a lot of cases the long running part would be offloaded to an async function, or at least broken down at some points, where other calls would take over. It might even be possible to prioritise high QoS calls when the current executing task suspends, something that's impossible with a serial queue. So there are actually many ways in which the actor model is better. Thinking about it, it's a very clever solution.

My only concern (what motivated my previous post) is that it could be more error prone. It seems to me that it's easier to mistakenly write this:


		Swift:
	

actor FooActor {
    var fooState: Int = 0
  
    func fooMethod() {
        longRunningThing()
        fooState += 1
    }
}


And forget that the method is going to run serially and block all other calls to the actor during its execution, than to mistakenly do this:


		Swift:
	

let dispatchQueueForIO = DispatchQueue(label: "Serial Queue for I/O")

dispatchQueueForIO.async {
    longRunningThing()
    fooSeriallyAccessedOnlyProperty += 1
}


And forget that it's running on the serial queue. Ideally, both cases should move the longRunningThing() outside the method/async block. But in the latter it's extremely clear that the context is a serial queue, so the error is more evident. More verbose, in a way. That may be because I'm less used to working with actors, admittedly.


----------



## Nycturne

Andropov said:


> Ideally, both cases should move the longRunningThing() outside the method/async block. But in the latter it's extremely clear that the context is a serial queue, so the error is more evident. More verbose, in a way. That may be because I'm less used to working with actors, admittedly.




Agreed, there's a lot more implicit in the Swift concurrency model.


----------



## Cmaier

Andropov said:


> Nope, you're right. Even that worst-case-scenario (long running actor method with no suspension points) would be no worse than a serial dispatch queue. In fact, it'd be potentially much better, since it's mandatory to be async so callers are not blocked. And realistically in a lot of cases the long running part would be offloaded to an async function, or at least broken down at some points, where other calls would take over. It might even be possible to prioritise high QoS calls when the current executing task suspends, something that's impossible with a serial queue. So there are actually many ways in which the actor model is better. Thinking about it, it's a very clever solution.
> 
> My only concern (what motivated my previous post) is that it could be more error prone. It seems to me that's it's easier to mistakenly write this:
> 
> 
> Swift:
> 
> 
> actor FooActor {
>     var fooState: Int = 0
> 
>     func fooMethod() {
>         longRunningThing()
>         fooState += 1
>     }
> }
> 
> 
> And forget that the method is going to run serially and block all other calls to the actor during its execution, than to mistakenly do this:
> 
> 
> Swift:
> 
> 
> let dispatchQueueForIO = DispatchQueue(label: "Serial Queue for I/O")
> 
> dispatchQueueForIO.async {
>     longRunningThing()
>     fooSeriallyAccessedOnlyProperty += 1
> }
> 
> 
> And forget that it's running on the serial queue. Ideally, both cases should move the longRunningThing() outside the method/async block. But in the latter it's extremely clear that the context is a serial queue, so the error is more evident. More verbose, in a way. That may be because I'm less used to working with actors, admittedly.




I don’t get actors. I should learn actors. I get queues. I can picture the dispatcher in my mind. I am old.

Implicit stuff also hurts me.


----------



## Yoused

Andropov said:


> Ooh. That's interesting. Here's the paper, too: https://www.prefetchers.info/augury.pdf



I got partway through it (not too hard to skim) when this kind of jumped out at me



> A major complication for reverse engineering is reports that the M1 DRAM controller performs frequency scaling [34]. This matches our observations that a cache miss to DRAM can return in a wide range of times. We find that increasing the pressure on DRAM can reduce the average access time more than amortizing measurement costs would anticipate. The net effect is that we observe otherwise inexplicable decreases in memory access times for longer experiments.




It almost sounds like just adding the right amount of noise to timing cycles might be enough to mitigate some/many of these arcane vulnerabilities.


----------



## Cmaier

Yoused said:


> I got partway through it (not too hard to skim) when this kind of jumped out at me
> 
> 
> 
> It almost sounds like just adding the right amount of noise to timing cycles might be enough to mitigate some/many of these arcane vulnerabilities.




Any kind of random process can be an effective countermeasure to most side channel attacks.  For differential power analysis, you can inject random noise on the power rails, for example.
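The same idea has been applied in software, e.g. the coarsened, jittered timers browsers shipped after Spectre. A toy sketch (the `resolutionNs` knob is hypothetical, not any real API):

```swift
import Foundation

// Sketch: quantize a high-resolution timestamp and add random jitter,
// so timing differences smaller than `resolutionNs` disappear into noise.
func jitteredTimestamp(resolutionNs: UInt64 = 1_000) -> UInt64 {
    let raw = DispatchTime.now().uptimeNanoseconds
    let coarse = (raw / resolutionNs) * resolutionNs   // drop the fine bits
    return coarse &+ UInt64.random(in: 0..<resolutionNs) // re-add as noise
}

// An attacker trying to resolve a ~100 ns microarchitectural difference
// with this clock sees only the ~1 µs noise floor.
let t = jitteredTimestamp()
print(t)
```

The catch, as with power-rail noise, is that an attacker can average over many repeated measurements, so this raises the cost of an attack rather than eliminating it.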


----------



## Colstan

I didn't think this deserved its own topic, but Intel is allegedly farming out partial Meteor Lake production to TSMC on their 5nm process. Keep in mind this is Digitimes we are talking about, but we've heard rumblings about this before, and Intel is using 6nm for GPUs. Assuming this is correct, Apple will already be on 3nm, so even Intel can't bribe TSMC enough. It does make me wonder if this is a good idea long-term for TSMC, since Intel is gunning to be a direct competitor in the foundry business, but competitors do business all the time, such as Apple does with Samsung.

Thus far, this doesn't seem like a move of strength from Gelsinger. Sure, he was dealt a bad hand, but I don't see this inspiring a lot of confidence inside Intel. Anyone have any thoughts on his leadership, thus far? I remember back when Mike Magee would write about him in The Register. He called him "Kicking" Pat Gelsinger. Magee claimed that he met with Intel executives during the Itanium launch. Gelsinger said that every major OEM had Itanium systems lined up. When Magee pointed out that Gateway didn't have an Itanium offering, Pat Gelsinger got upset and proceeded to kick Magee, legally committing assault, hence the nickname. Maybe he should save the kicks for Intel's engineers. (Figuratively, not literally.)


----------



## Cmaier

Colstan said:


> I didn't think this deserved its own topic, but Intel is allegedly farming out partial Meteor Lake production to TSMC on their 5nm process. Keep in mind this is Digitimes we are talking about, but we've heard rumblings about this before, and Intel is using 6nm for GPUs. Assuming this is correct, Apple will already be on 3nm, so even Intel can't bribe TSMC enough. It does make me wonder if this is a good idea long-term for TSMC, since Intel is gunning to be a direct competitor in the foundry business, but competitors do business all the time, such as Apple does with Samsung.
> 
> Thus far, this doesn't seem like a move of strength from Gelsinger. Sure, he was dealt a bad hand, but I don't see this inspiring a lot of confidence inside Intel. Anyone have any thoughts on his leadership, thus far? I remember back when Mike Magee would write about him in The Register. He called him "Kicking" Pat Gelsinger. Magee claimed that he met with Intel executives during the Itanium launch. Gelsinger said that every major OEM had Itanium systems lined up. When Magee pointed out that Gateway didn't have an Itanium offering, Pat Gelsinger got upset and proceeded to kick Magee, legally committing assault, hence the nickname. Maybe he should save the kicks for Intel's engineers. (Figuratively, not literally.)



 All I know is Gelsinger was busy the last bunch of years trying to convert people in Silicon Valley to Christianity.  Tells me all I need to know about him.

As for TSMC, I’ll note that from my experience they must be being careful.  Obviously they will have to provide Intel with design rules and a design kit (Calibre decks, cell libraries, etc.), which would be of at least some small use to Intel if it wanted to leverage them to figure out how to make its own process better.  But I’m sure TSMC is contractually making Intel set up a wall between its fab folks and design folks with respect to all this information, including audit rights, etc.

TSMC is very paranoid (rightfully so) - I once had to see their detailed process flows, and I had to do it in a glass room, where all I was allowed to bring in was pencil and paper, and I was observed the whole time.  All I was allowed to write down was a list of things I wanted more information about, and I had to give the list to them when I left.  And at the time I was indirectly working on their behalf.   Of course Intel will never get to see anything so detailed.

Which brings me to the second point - practically speaking, there‘s nothing that Intel will gain access to that they wouldn’t already know.  They doubtless have analyzed the heck out of apple’s chips, analyzed doping concentrations, looked at things with STEMs, etc.  They can almost certainly figure out exactly what TSMC is doing, down to each process step.  So the risk to TSMC is low in that respect.

And TSMC may be figuring that it can win *all* of Intel’s business in the distant future, at least for the high end stuff. 

On the other hand, TSMC, by doing this, is allowing Intel to sell more product which brings it revenue it can use to eventually compete with TSMC, as well as provides Intel’s fab team with motivation to get their act together. 

I’m sure TSMC is thinking very carefully about what to do here.


----------



## Yoused

I was browsing around on the subject of Power10, which seems to look substantially better than x86 for some server implementations and skimmed an old Register piece that included,

*An easier-to-sell feature: so-called "transparent" memory encryption. "What is great about this is it is encrypting information transparently without any performance overhead of the system," claimed Boday. "It's done through the hardware. And so we can actually scale this encryption to very large memory databases.*

*"As the information is encrypted, you [can] continue to do computational workload on it and not unencrypting it, with fully homomorphic encryption. This is all achieved through our 2.5x faster [AES cryptography] performance per core."*
This is interesting. Being able to work on encrypted data without knowing what it actually is. It has a lot of appeal. I wonder if Apple could develop that.


----------



## Cmaier

Yoused said:


> I was browsing around on the subject of Power10, which seems to look substantially better than x86 for some server implementations and skimmed an old Register piece that included,
> 
> *An easier-to-sell feature: so-called "transparent" memory encryption. "What is great about this is it is encrypting information transparently without any performance overhead of the system," claimed Boday. "It's done through the hardware. And so we can actually scale this encryption to very large memory databases.*
> 
> *"As the information is encrypted, you [can] continue to do computational workload on it and not unencrypting it, with fully homomorphic encryption. This is all achieved through our 2.5x faster [AES cryptography] performance per core."*
> This is interesting. Being able to work on encrypted data without knowing what it actually is. It has a lot of appeal. I wonder if Apple could develop that.




I’m not aware of any homomorphic encryption that allows arbitrary operations at practical speed.  There’s always some subset of operations that can be done.  This is probably not that useful except in certain narrow use cases.
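As a concrete example of such a subset: textbook (unpadded) RSA is multiplicatively homomorphic — multiplying two ciphertexts yields a ciphertext of the product of the plaintexts — but supports nothing else. A toy sketch using the classic small textbook parameters (illustrative only, absolutely not real cryptography):

```swift
// Square-and-multiply modular exponentiation; values stay well
// within UInt64 for these tiny parameters.
func powMod(_ base: UInt64, _ exp: UInt64, _ mod: UInt64) -> UInt64 {
    var result: UInt64 = 1
    var b = base % mod
    var e = exp
    while e > 0 {
        if e & 1 == 1 { result = (result * b) % mod }
        b = (b * b) % mod
        e >>= 1
    }
    return result
}

let n: UInt64 = 3233   // 61 * 53
let e: UInt64 = 17     // public exponent
let d: UInt64 = 2753   // private exponent

let a: UInt64 = 42, b: UInt64 = 7

// E(a) * E(b) mod n == E(a * b mod n), since (a^e)(b^e) = (ab)^e mod n.
let productOfCiphertexts = (powMod(a, e, n) * powMod(b, e, n)) % n
let ciphertextOfProduct = powMod((a * b) % n, e, n)
print(productOfCiphertexts == ciphertextOfProduct) // true

// Decrypting the combined ciphertext recovers a*b without ever
// decrypting the operands individually.
print(powMod(productOfCiphertexts, d, n) == (a * b) % n) // true
```

Additions, comparisons, etc. on the plaintexts are not possible this way, which is exactly the "subset of operations" limitation.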


----------



## Cmaier

It gets so boring being omniscient.









						Qualcomm CEO Admits Nuvia Chip OEM Sampling is Delayed (Update)
					

Sampling target shifts from August 2022 to sometime in 2023.




					www.tomshardware.com


----------



## Colstan

Cmaier said:


> It gets so boring being omniscient.



From what I've heard, assuming the leaks are accurate, Zen 4 was delayed by a year. It was supposed to be out Q4 of last year. Raptor Lake, Meteor Lake, and Arrow Lake are also behind schedule. In fact, Arrow Lake was supposed to compete with Zen 4, and instead will have to contend with Zen 5. Intel was expecting a rout, but they're behind again, as is tradition. Same goes for their Arc series of graphics cards, which nobody will give a damn about by the time they are released. Looks like Qualcomm is in on the fun, too.

Please correct me if I am wrong, but Apple appears to be on schedule, regarding the M-series cadence. They've delayed some Mac releases, but those are due to shortages in commodity products, and the primary silicon involved has been delivered as expected.


----------



## Cmaier

Colstan said:


> From what I've heard, assuming the leaks are accurate, Zen 4 was delayed by a year. It was supposed to be out Q4 of last year. Raptor Lake, Meteor Lake, and Arrow Lake are also behind schedule. In fact, Arrow Lake was supposed to compete with Zen 4, and instead will have to contend with Zen 5. Intel was expecting a rout, but they're behind again, as is tradition. Same goes for their Arc series of graphics cards, which nobody will give a damn about by the time they are released. Looks like Qualcomm is in on the fun, too.
> 
> Please correct me if I am wrong, but Apple appears to be on schedule, regarding the M-series cadence. They've delayed some Mac releases, but those are due to shortages in commodity products, and the primary silicon involved has been delivered as expected.



The chip cadence appears to be on track. The only question mark is the chip for the Mac Pro, which may or may not be behind.


----------



## jbailey

Cmaier said:


> It gets so boring being omniscient.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Qualcomm CEO Admits Nuvia Chip OEM Sampling is Delayed (Update)
> 
> 
> Sampling target shifts from August 2022 to sometime in 2023.
> 
> 
> 
> 
> www.tomshardware.com



Now the Nuvia Snapdragon will be competitive with the M2? In 2024? Sounds great.


----------



## Yoused

jbailey said:


> Now the Nuvia Snapdragon will be competitive with the M2? In 2024? Sounds great.



The Snapdragon 888 is showing GB5 SC scores comparable to an A12X, a 3-year-old SoC, with MC somewhat lower. Granted, the A12X is listed at "_up to_ 2.49 GHz" while the 888 score is listed at 1.8 GHz, so there is that. If you truly believe that Nuvia will be less than a year behind AS, well, think again.


----------



## Cmaier

Yoused said:


> The Snapdragon 888 is showing GB5 SC scores comparable to an A12X, a 3-year-old SoC, with MC somewhat lower. Granted, the A12X is listed at "_up to_ 2.49 GHz" while the 888 score is listed at 1.8 GHz, so there is that. If you truly believe that Nuvia will be less than a year behind AS, well, think again.




The internet: Apple is doomed because the Nuvia guys all left, and therefore M2 is bad and Apple’s progress has stalled.
Qualcomm: the nuvia guys who have been here for a year need two more years to get a chip done.


----------



## Colstan

Cmaier said:


> The internet: Apple is doomed because the Nuvia guys all left, and therefore M2 is bad and Apple’s progress has stalled.
> Qualcomm: the nuvia guys who have been here for a year need two more years to get a chip done.



Correct me if I'm wrong, but wasn't Nuvia like three guys? How many engineers, roughly speaking, are working on Apple Silicon? From my purely amateur observations, these are large projects, where any one individual, or handful of individuals, aren't vital to a design implementation. (Except for Jim Keller, because the internet says he is chip god.)


----------



## Citysnaps

Cmaier said:


> The internet: Apple is doomed because the Nuvia guys all left, and therefore M2 is bad and Apple’s progress has stalled.
> Qualcomm: the nuvia guys who have been here for a year need two more years to get a chip done.




I might be wrong, but my sense is Qualcomm's glory days have long passed.  They have a ton of communications-related patents having written the book on modern communications signal processing and error correction, and no doubt still snag decent licensing fees.  But it has been a loooong time since Drs. Viterbi and Jacobs, the real brain trust, ran the company.


----------



## Cmaier

Colstan said:


> Correct me if I'm wrong, but wasn't Nuvia like three guys? How many engineers, roughly speaking, are working on Apple Silicon? From my purely amateur observations, these are large projects, where any one individual, or handful of individuals, aren't vital to a design implementation. (Except for Jim Keller, because the internet says he is chip god.)



There were three main guys, but I’ve seen discussion of a couple dozen or more lower-level people having gone. One of the three I know really well, having worked alongside him for years. I read very interesting things about him in Apple’s legal complaint re: trade secrets.


----------



## Cmaier

citypix said:


> I might be wrong, but my sense is Qualcomm's glory days have long passed.  They have a ton of communications-related patents having written the book on modern communications signal processing and error correction, and no doubt still snag decent licensing fees.  But it has been a loooong time since Drs. Viterbi and Jacobs, the real brain trust, ran the company.




I have no idea. I think that if there’s a market for “Intel, but for Arm” then they are well-situated.  I just don’t think there will be another “Intel” model in the semiconductor industry.  Many of the biggest companies that sell end products (cars, computers, phones) will do their own CPU designs and use TSMC or Samsung or Intel to fab them.  The smaller companies will be customers, but Qualcomm will have to compete with other Arm chip designers (Marvell, Samsung, Intel, maybe AMD someday, nVidia, etc.).   Qualcomm still has a stranglehold on radio chipsets, though there are signs of small cracks in that, too.   

Qualcomm has at least been smart enough to realize that it needs to leverage its radio stranglehold to get into other markets before it’s too late.


----------



## Colstan

Cmaier said:


> there were three main guys, but I’ve seen discussion of a couple dozen or more lower-level people having gone.



So, aside from the underlings, and your trade secrets friend, most/many of the folks still at Apple are top-shelf engineers?


----------



## Citysnaps

Cmaier said:


> I have no idea. I think that if there’s a market for “Intel, but for Arm” then they are well-situated.  I just don’t think there will be another “Intel” model in the semiconductor industry.  Many of the biggest companies that sell end products (cars, computers, phones) will do their own CPU designs and use TSMC or Samsung or Intel to fab them.  The smaller companies will be customers, but Qualcomm will have to compete with other Arm chip designers (Marvell, Samsung, Intel, maybe AMD someday, nVidia, etc.).   Qualcomm still has a stranglehold on radio chipsets, though there are signs of small cracks in that, too.
> 
> Qualcomm has at least been smart enough to realize that it needs to leverage its radio stranglehold to get into other markets before it’s too late.




What I'm wondering, having been out of the digital communications signal processing ASIC field for a while, is who is pioneering developments in that area. Or is that handled within much larger companies like Apple and Samsung (or Ericsson/Nokia), and you just don't hear about interesting breakthroughs because they're embedded in their tech? We had to tread very lightly on some aspects, such as error correction techniques and power amplifier digital pre-distortion and linearization, knowing that it would be easy for a large company to squash a 10-person company.


----------



## Cmaier

Colstan said:


> So, aside from the underlings, and your trade secrets friend, most/many of the folks still at Apple are top-shelf engineers?




I would think so. I see 10 folks on my linkedin connections who I used to work with who are at Apple.  Director of Custom Silicon Management - this guy was an incredibly talented circuit designer who worked at DEC on the Alpha before i worked with him for years.  Senior CAD engineer - became my boss for a few minutes and took over most of my work when I left.   Senior CAD engineer - design verification manager. I worked with this guy at Exponential, when there were two engineers who had to do the verification for the whole chip.  I worked with him to debug a nasty wire coupling bug, and I remember printing out giant schematics and manually working out how to generate a test vector to determine whether our theory was right.  Another engineer who worked for me on several chips.  A power optimization lead. I worked with him all the way back on K6, and then he moved to Austin. Very smart guy, though I recall a fun story where we were at a conference together, and another guy we know was talking to me.  The guy who is now at Apple came up and offered his opinion on a technical matter, and the other guy looks at his badge, doesn’t see a “Dr.” In front of the name, and says ”well, I have a ph.d. to back up MY opinion.”  Fun.

Oh, this is fun! I see the *other* of the two design verification guys from Exponential is *also* at Apple!

Ah, another guy who lists his job title as “CAD monkey.” He took over my EDA tools when I left - so it was left to him to figure out my spaghetti code that performed circuit classification.  I didn’t work too much with him, but if he figured out how to keep my code working he must be talented.

CPU Implementation Lead.  He started at AMD a few years before me, and was one of the first people I saw go to Apple. He’s been there a very long time.

Anyway, these are all very experienced folks who know what they are doing.


----------



## Cmaier

citypix said:


> What I'm wondering, being out of the digital communications signal processing ASIC field for awhile, is who is pioneering developments in that area. Or is that handled within much larger companies like Apple and Samsung (or Ericsson/Nokia) and you just don't hear about interesting breakthroughs in that area as it's embedded in their tech. We had to tread very lightly on some aspects such as error correction techniques and power amplifier digital pre-distortion and linearization knowing that it would be easy for a large company to squash a 10 person company.




Yeah, I think that progress is pretty spread around right now. Convolutional coding and Viterbi decoding were a breakthrough, and then along came some MIMO advancements, and most of the time things are just incremental.  I’m pretty sure Qualcomm still has a huge concentration of the engineers who come up with the stuff that ends up in the standards, but if you look at the standards declarations, lots of companies have engineers who contribute.  I know that Apple has been accelerating in that area rapidly, but it’s still nowhere near Qualcomm.


----------



## Cmaier

Ah, interesting. I did a LinkedIn filter on Nuvia.  So aside from their VP of engineering (at Qualcomm), who is the trade secrets guy I mentioned, another guy who was essentially a peer of mine at AMD works there (as a member of technical staff, which I would imagine is not where he’d want to be at this point in his career).  He was pretty good. At least as good as the VP, by my recollection.  But I’d much rather have the folks I listed from Apple.

Keep in mind, it’s been many years since I worked with these people. Maybe the weaker ones got strong and the stronger ones had head injuries. I don’t know.


----------



## Citysnaps

Cmaier said:


> Yeah, I think that progress is pretty spread around right now. Convolutional coding and Viterbi decoding were a breakthrough, and then along came some MIMO advancements, and most of the time things are just incremental.  I’m pretty sure Qualcomm still has a huge concentration of the engineers who come up with the stuff that ends up in the standards, but if you look at the standards declarations, lots of companies have engineers who contribute.  I know that Apple has been accelerating in that area rapidly, but it’s still nowhere near Qualcomm.




There was a company in San Jose called ArrayComm that pioneered a lot of beamforming and MIMO advancements. Interestingly, it was founded by Martin Cooper, who is credited with coming up with cell-based wireless telephony at Motorola long ago. We talked with them at one time about collaboration possibilities developing digital beamforming ASICs for telecom and other applications. The good news was that it didn't go anywhere, as there was a principal engineer there, whom some of us had worked with at a previous company, who would have been difficult to get along with.

We also had extensive discussions with Altera and Xilinx, who wanted to get into the digital communications processing space for commercial and defense applications.  They were stymied trying to implement digital filters and down-converters/up-converters for radios, and couldn't figure out why their FPGA designs were much more than an order of magnitude lower in performance than our ASICs. And FAR worse on power dissipation.   We passed on working with them, knowing they just wanted to snag our architecture tricks.  

We did collaborate with National, which had some interesting high-performance/high-speed/high-bit-width ADCs and DACs, but didn't have the digital communication/radio signal processing chops, or the customer breadth.

We finally landed at TI, with pros and cons.


----------



## Colstan

Cmaier said:


> Oh, this is fun! I see the *other* of the two design verification guys from Exponential is *also* at Apple!



I'm glad to see that you're enthused about your old colleagues. I find these stories to be quite fascinating. From the outside, there wasn't a lot of information in the general tech press about many of the companies that you've mentioned or worked at. NexGen just appeared out of nowhere, and was bought by AMD. Cyrix was always puttering around the mid-card, Centaur lower than that, until VIA bought both and everyone left. I recall Exponential being a thing that existed, but for the life of me I couldn't tell you what you were working on. Transmeta was a bizarre little endeavor.

Like Grendel from the Anglo-Saxon epic Beowulf, the DEC Alpha was much talked about, but rarely seen.

No offense about your work on the K6, but I didn't pick up my first AMD CPU until the Athlon XP. I fried it with too much voltage over a series of months, then replaced it with a Northwood P4, because you could get ridiculous clocks out of them.

That was back when I was young and stupid and only cared about overclocking and how many FPS I could get out of Quake, even though I didn't play Quake. Now that I've grown to be old and stupid, I've decided to let Apple handle all of the details for me. Last year, I upgraded the system memory inside my Mac mini from 8GB to 64GB, and found it to be a tedious, laborious process. I'd rather just buy the whole widget, not having to worry about BIOS settings, activating Windows, anti-virus, privacy invading bloatware, or any of the other PC shit, and just let Apple do it for me. The Mac is a superior product on most every level, in my opinion, so it's an easy choice.

Still, I appreciate your smoke pit stories, @Cmaier.


----------



## Cmaier

Colstan said:


> I'm glad to see that you're enthused about your old colleagues. I find these stories to be quite fascinating. From the outside, there wasn't a lot of information in the general tech press about many of the companies that you've mentioned or worked at. NexGen just appeared out of nowhere, and was bought by AMD. Cyrix was always puttering around the mid-card, Centaur lower than that, until VIA bought both and everyone left. I recall Exponential being a thing that existed, but for the life of me I couldn't tell you what you were working on. Transmeta was a bizarre little endeavor.
> 
> Like Grendel from the Anglo-Saxon epic Beowulf, the DEC Alpha was much talked about, but rarely seen.
> 
> No offense about your work on the K6, but I didn't pick up my first AMD CPU until the Athlon XP. I fried it with too much voltage over a series of months, then replaced it with a Northwood P4, because you could get ridiculous clocks out of them.
> 
> That was back when I was young and stupid and only cared about overclocking and how many FPS I could get out of Quake, even though I didn't play Quake. Now that I've grown to be old and stupid, I've decided to let Apple handle all of the details for me. Last year, I upgraded the system memory inside my Mac mini from 8GB to 64GB, and found it to be a tedious, laborious process. I'd rather just buy the whole widget, not having to worry about BIOS settings, activating Windows, anti-virus, privacy invading bloatware, or any of the other PC shit, and just let Apple do it for me. The Mac is a superior product on most every level, in my opinion, so it's an easy choice.
> 
> Still, I appreciate your smoke pit stories, @Cmaier.



This is what we did at Exponential. In the photo in the bio section I appear to be 12 years old.


----------



## Colstan

Cmaier said:


> This is what we did at Exponential. In the photo in the bio section I appear to be 12 years old.



Thanks for the document, taking a gander at it, some of that is coming back to me. And yeah, you're just a youngin' in that photograph.

Aside from the minutiae of semiconductor design, I think these stories are an important reminder that humans are responsible for them. To the general public, computer chips are magic, Wi-Fi is wizardry, the internet is illusory. It's been trendy, in recent times, for futurists and science fiction authors to use this perception to their advantage. One supposed explanation for the Fermi Paradox is the artificial intelligence singularity. In other words, a sufficiently advanced civilization will be undone by its own creations. It's as if the end result will be a series of self-replicating von Neumann probes that will efficiently turn the Earth, and hence us, into an endless supply of paperclips.

What's lost in all of this is that computer chips are designed by humans. AI software is written by humans. It's not going to be like some bizarre intelligence visiting from Proxima Centauri b, but will have the fingerprints of humanity all over it.

Sorry about the apocalyptic tangent, but your tales are valuable, if for no other reason as a reminder that technology is very much a human endeavor.


----------

