macOS 26.2 adds Infiniband over Thunderbolt support

Which brings me to my point: Siri is being set up to fly for users. Apple can basically pay whoever it thinks can produce a great model to its specific requirements, then load it on servers that sip a fraction of the energy (500 watts for this 4-Mac supercomputer vs. 2.4 kilowatts for 4 H100s). That means not only do they not need to resort to funding nuclear energy companies, they can run Siri at low cost, which means they can keep offering Siri for free.

That Apple can offer a personalized pocket assistant for free, whose promised feature set includes asking it to perform actions and recall personal information about you and the friends and family you want it to remember, is a significant, actual competitive advantage.

Too bad that doesn't play into people's made-up narrative of Apple being behind in "AI."

I don't know, but I would be surprised if Apple didn't request (if they chose to go third party) that the model be encoded with ASTC in mind, which I believe would be an industry first for widely deployed consumer transformer models. Encoding weights with ASTC means they can shrink the bits down, and it's only possible because Apple's GPUs were designed with ASTC in mind. So while I don't know for sure if they will, Apple has the ability to get a far larger, more capable model into memory without needing more actual memory. This is significant given the looming memory price increases for the industry, and for energy costs. It means Apple will be able to provide Siri to all users with Apple Intelligence enabled devices for free.
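To give a sense of why compressed weight encoding matters at this scale, here is a rough footprint calculation for a 1-trillion-parameter model at a few bit widths. The bit widths are assumptions for illustration only; the actual compressed rate Apple would use (if the ASTC speculation above is even right) is unknown.

```python
# Illustrative only: memory footprint of a 1T-parameter model at assumed
# bit widths. These widths are NOT confirmed rates for any Apple model.
PARAMS = 1_000_000_000_000  # 1 trillion parameters

footprints = {bits: PARAMS * bits / 8 / 2**30 for bits in (16, 8, 4, 3)}
for bits, gib in footprints.items():
    print(f"{bits}-bit: {gib:,.0f} GiB")
```

The takeaway is just the ratio: halving the bits per weight halves the memory a given model needs, which is the same thing as fitting a model twice as large into the same unified memory.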

Apple's PCC model already does this with ASTC, and from recent reports on social media, the PCC model appears to have been upgraded significantly in 26.2, producing far faster, more accurate responses.

Apple is behind in AI my ass lol.
 
Another interesting highlight from a YouTube comment:


Using Apple's RDMA over Thunderbolt, the cluster is able to achieve 28 tokens per second on a 1-trillion-parameter model. It drew an average of 500 watts during inference, which works out to about $0.000021 per second of electricity at an average rate of $0.15 per kWh.
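The per-second electricity figure follows directly from the quoted 500 W draw and $0.15/kWh rate (the rate is the comment's assumption, not a measured value):

```python
# Electricity cost per second of inference, from the figures quoted above.
POWER_W = 500          # average draw of the 4-Mac cluster during inference
PRICE_PER_KWH = 0.15   # assumed average electricity rate, $/kWh

cost_per_second = (POWER_W / 1000) * PRICE_PER_KWH / 3600
print(f"${cost_per_second:.6f} per second")  # ≈ $0.000021
```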

That's pretty incredible given that you'd pay $21 per million input tokens and $168 per million output tokens for other models in the cloud.

Simplistically, for a 2-million-token set of interactions (1 million input, 1 million output), you're spending $189 in the cloud.
At that same volume, you're paying about $1.50 in electricity for the Macs. If you factor in the hardware, it's around $10 with 24/7 usage over 3 years of ownership.
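Recomputing the comparison from the rates quoted above (cloud per-token pricing vs. local electricity only, hardware amortization ignored):

```python
# Cloud vs. local electricity cost for a 2M-token workload (1M in, 1M out),
# using the rates assumed in the comment above.
TOKENS_IN = 1_000_000
TOKENS_OUT = 1_000_000
CLOUD_IN_PER_M = 21.0     # $ per 1M input tokens (quoted cloud rate)
CLOUD_OUT_PER_M = 168.0   # $ per 1M output tokens
TOK_PER_S = 28            # local throughput on the 4-Mac cluster
COST_PER_S = 0.5 * 0.15 / 3600  # 500 W at $0.15/kWh

cloud_cost = CLOUD_IN_PER_M + CLOUD_OUT_PER_M
local_cost = (TOKENS_IN + TOKENS_OUT) / TOK_PER_S * COST_PER_S
print(f"cloud: ${cloud_cost:.0f}, local electricity: ${local_cost:.2f}")
```

Running the numbers this way gives roughly $189 vs. $1.49, a gap of over two orders of magnitude before hardware cost is counted.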

Yes, it takes far longer (about 10 hours to generate the 1 million output tokens... actually, apparently not: 5.2 Pro runs at 25 tokens per second, which ironically makes it the slower one...), but the model is completely in your control, you're not locked into a model provider's specific rates, and you get privacy and security.
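The "actually apparently not" aside checks out arithmetically: at the throughput figures the comment cites (the cloud numbers are its claims, not verified benchmarks), generating 1 million output tokens takes:

```python
# Wall-clock time to generate 1M output tokens at the quoted throughputs.
# The 25 and 42 tok/s cloud figures are the comment's claims, taken as given.
throughputs = {"4-Mac cluster": 28, "5.2 Pro": 25, "Opus 4.5": 42}

for name, tok_per_s in throughputs.items():
    hours = 1_000_000 / tok_per_s / 3600
    print(f"{name}: {hours:.1f} h")
```

So the local cluster at 28 tok/s (~9.9 h) really would beat a 25 tok/s cloud model (~11.1 h), while a 42 tok/s model (~6.6 h) remains faster.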

5.2 Pro is slightly ahead in benchmarks, but not nearly by enough to justify costing that much more.


Even against Opus, which comes to $30 for the same scenario, the benchmark lead isn't enough to justify the price.

I'm pretty sure with macOS Tahoe 26.2 Apple just annihilated literally everyone lol. If the throughput figures of 25 tokens per second for 5.2 Pro and 42 tokens per second for Opus 4.5 are accurate, then Apple is providing a much better value with Macs than cloud computing.

You can walk into an Apple Store and buy 4 Macs (or order them). You can't even buy NVIDIA consumer GPUs at this point, let alone H200s as a consumer, LMFAO. To boot, you can finance those 4 Macs on an Apple Card over 12 months.
