Apple might use Google servers to store data for its upgraded AI Siri
The new Siri may rely on Google’s cloud.
Guess we’ll see…
This is a news item I don’t find outlandish. While Apple’s private cloud is an amazing initiative, they simply don’t have the hardware to run large models at scale. The gap to modern Nvidia inference boxes is simply too significant.
How, after reading my multiple well-sourced posts on this matter, do you find it "not outlandish"?
Hey, Tim. First question is on Google partnership again. I wanted to understand how you came to that decision with regard to the AI and Siri in particular and if there’s an opportunity for you guys to share in revenue too with that partnership like you do in search.
Yeah, we basically determined that Google’s AI technology would provide the most capable foundation for AFM (Apple Foundation Models), and we believe that we can unlock a lot of experiences and innovate in a key way due to the collaboration. We’ll continue to run on the device and run in Private Cloud Compute and maintain our industry-leading privacy standards in doing so. In terms of the arrangement with Google, we’re not releasing the details of that.
"We're not changing our privacy rules," Cook's on-air comment read. "We still have the same architecture that we announced before, which is on device plus Private Cloud Compute."
That's the same "report," by the way, which adds to its nonsense.
Contrast with other rumors today that Apple is only using 10% of its AI server capacity.
Second, how can it be both underutilized and underpowered? Something smells there. I expand on this more below in the third point!
1) NVIDIA is cool, but it is not being considered for PCC, so its performance, good or bad, is irrelevant.
Does there have to be a contradiction? To analyze the problem we need to understand a) what the current and projected compute capacity of the system is, b) what the system is currently used for, and c) what the need will be once the new Siri and features launch. We do know quite a lot about the PCC architecture, because Apple has published detailed documentation.
What do we know about a) and b)? Not much, to be honest. We know that some Apple LLM requests are routed to PCC (Xcode/text processing), that the cloud model used is not very large by modern standards, and that these features are not used very actively by users. It is also possible that Siri currently runs on PCC (is it confirmed?), and we also know that current Siri needs less compute than a large LLM.
We can at least estimate something about c): large LLMs require a lot of compute and memory bandwidth, and M2/M3 Ultras (the alleged backbone of PCC) have neither. For example, a GH200 offers a 4x increase in bandwidth and a 10x-40x increase in matrix compute compared to an M3 Ultra. So unless Apple uses modified SoCs that include matrix accelerators (and even if they do), the compute density of PCC is going to be considerably lower than that of an Nvidia solution. Add to this the relatively low production capacity: Apple needs a lot of time to stockpile the chips and build servers due to manufacturing constraints and costs. No wonder they started the PCC project way before they needed the compute.
Adding all these factors together, I can totally see how the system would be designed for future need (hence "10% current usage") and yet might fail to achieve the needed capacity at scale, especially now that Apple is pivoting to using Google's foundational models. After all, when the PCC project was initiated, they might have been working with different projections, and these are not things one can change overnight.
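To put rough numbers on the "underutilized yet underpowered" argument, here is a back-of-the-envelope sketch in Python. The only figures taken from this thread are the rumored 10% utilization and the 4x bandwidth / 10x-40x matrix-compute ratios quoted above for a GH200 versus an M3 Ultra. The demand growth factor is purely hypothetical, and the node-count helper assumes perfect linear scaling, which real clusters will not achieve.

```python
import math


def utilization_after_growth(current_utilization: float, demand_growth: float) -> float:
    """Projected utilization of a fixed fleet if demand multiplies by demand_growth.

    A result above 1.0 means the existing fleet can no longer serve the load.
    """
    return current_utilization * demand_growth


def nodes_to_match(ratio: float, accelerators: int = 1) -> int:
    """Baseline nodes needed to match `accelerators` faster units on one metric.

    Assumes perfect linear scaling (an optimistic simplification).
    """
    return math.ceil(ratio * accelerators)


# Figures quoted in the thread.
CURRENT_UTILIZATION = 0.10           # rumored share of PCC capacity in use
BANDWIDTH_RATIO = 4.0                # GH200 vs M3 Ultra, memory bandwidth
COMPUTE_RATIO_LOW = 10.0             # GH200 vs M3 Ultra, matrix compute (low end)
COMPUTE_RATIO_HIGH = 40.0            # GH200 vs M3 Ultra, matrix compute (high end)

# Hypothetical assumption, NOT from any source: how much total demand grows
# once a large-model Siri launches.
HYPOTHETICAL_DEMAND_GROWTH = 25.0

projected = utilization_after_growth(CURRENT_UTILIZATION, HYPOTHETICAL_DEMAND_GROWTH)
print(f"Projected utilization: {projected:.0%}")
print(f"M3 Ultra nodes per GH200 (bandwidth-bound): {nodes_to_match(BANDWIDTH_RATIO)}")
print(
    "M3 Ultra nodes per GH200 (compute-bound): "
    f"{nodes_to_match(COMPUTE_RATIO_LOW)} to {nodes_to_match(COMPUTE_RATIO_HIGH)}"
)
```

With any growth factor above 10, a fleet running at 10% today ends up oversubscribed, so both rumors can be true at once. Swap in your own growth guess to see where the break-even lands.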
I don't think anyone except employees has an answer, and executives at Apple more so than individual engineers. Tim Cook said they're shipping ahead of schedule. I agree they need to build the capacity, but it seems like they're doing that. They made a cool video showing off PCC servers for the first time, including a bit of the assembly.
@RockRock8 The question is whether Apple can build enough of these chips to satisfy PCC demands. I don't think that anyone except insiders has an answer.
I think, if I read things correctly, they'll have an on-device TM for some stuff, with other tasks offloaded to PCC. But I do believe if you want to know how they coordinate across Ensembles, it's in the PCC documentation! It's super cool in my opinion.
Even an iPhone can do a lot of the Siri back-end without having to pester the server, and the M5 Max seems to be a capable local LM processor. I can imagine Apple might be offloading a lot of this workload to local devices, to minimize its server requirements. I would be curious about how models could exchange information without having to transfer large arrays of tokens.
Yes, this is cool, but have you checked out their PCC documentation yet? I feel it's more grounded in what they're doing concretely, and still explains in depth. What do you think?
Luckily, Apple has provided quite a lot of info in this patent: https://patentscope.wipo.int/search/en/detail.jsf?docId=US469223044&_cid=P22-MMMP4H-82720-1
Honored to be included in this list, but if I'm being honest, @mr_roboto and @leman are both way out of my league (as are many others here). Sadly, the full depth of my expertise these days just feels like plugging numbers others have generated into spreadsheets and counting pixels from die shots other people have annotated. The extent of my thoughts on PCC is that I think it's the right approach while useful models can't fit locally, i.e. Apple's approach is the right one here. There's a lot I'd like to learn about that side of things, but I lack the time, or, more accurately, the mental bandwidth to do so.
@leman @dada_dave @mr_roboto and anyone else too! I want to hear more about your thoughts about the PCC stuff.
I've compiled the links. If you have the time I'd love to read.
https://www.youtube.com/watch?v=ktFlaBhpMu8
(Has some video of PCC)
https://www.apple.com/newsroom/2026/02/apple-accelerates-us-manufacturing-with-mac-mini-production/
(Has photos and videos of PCC)
security.apple.com
(The documentation for PCC)