macOS 26.2 adds Infiniband over Thunderbolt support

Been away, just catching up.

I suspect that the next Studio, or perhaps the one after that, will have an option for much faster Ethernet using OSFP or some QSFP variant. 200gbps, probably, but 400 and 100 aren't all that unlikely, and 800 isn't impossible. Look at the nVidia DGX Spark - a significant part of its cost is the inclusion of the ConnectX-7.

I wouldn't have expected it, but it's not impossible that Apple is using nVidia's chip for that purpose right now in their PCC servers. That would explain mellanox drivers, and suggest that we might see that in future Studios. And Pros, if they ever make another, sigh.

However, unless you took apart libmlx5 and can see symbols (or text) confirming that it's mellanox-related, I wouldn't yet take that as proven.
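(For anyone who wants to check, a rough sketch of that kind of string scan in Python is below. The path is a guess; on recent macOS the dylib may only exist inside the dyld shared cache rather than as a standalone file, so treat it as illustrative.)

```python
# Rough sketch: scan a dylib for Mellanox-related strings.
# The path below is an assumption, not a confirmed location on macOS 26.x.
from pathlib import Path

CANDIDATE = Path("/usr/lib/libmlx5.dylib")  # hypothetical path
MARKERS = [b"Mellanox", b"ConnectX", b"mlx5", b"InfiniBand", b"RDMA"]

def scan(path: Path) -> None:
    data = path.read_bytes()
    for marker in MARKERS:
        print(f"{marker.decode():10s}: {data.count(marker)} occurrence(s)")

if CANDIDATE.exists():
    scan(CANDIDATE)
else:
    print(f"{CANDIDATE} not found; it may live in the dyld shared cache")
```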
 
1) I highly doubt Apple is going to add what are basically enterprise ports to this consumer product. It's just not something Apple chases. They are focused on Thunderbolt, and for good reason.

2) PCC has already been confirmed to be fully Apple silicon. Internal transformer models are trained primarily via Google TPU hardware, with some NVIDIA.
 
I wouldn't have expected it, but it's not impossible that Apple is using nVidia's chip for that purpose right now in their PCC servers. That would explain mellanox drivers, and suggest that we might see that in future Studios. And Pros, if they ever make another, sigh.

They will most likely use Mellanox switches in their data servers. Thunderbolt is cute, but not really suitable for building a large-scale cloud system.
 
Did you mean to reply to me? The comment you replied to didn't mention Thunderbolt, but I did.

To be clear, I wasn't claiming Apple was connecting servers via Thunderbolt....
 
1) I highly doubt Apple is going to add what are basically enterprise ports to this consumer product. It's just not something Apple chases. They are focused on Thunderbolt, and for good reason.

2) PCC has already been confirmed to be fully Apple silicon. Internal transformer models are trained primarily via Google TPU hardware, with some NVIDIA.

Did you mean to reply to me? The comment you replied to didn't mention Thunderbolt, but I did.

To be clear, I wasn't claiming Apple was connecting servers via Thunderbolt....
Your point #2 above suggests you are confused.

I was not talking at all about nVidia's GPUs. I was talking about the ConnectX series, now made by nVidia, which comes from their Mellanox acquisition.

As for your point #1: Apple faces a stark choice, because Thunderbolt simply doesn't have the bandwidth for connecting hosts as fast as you want to for scaling out AI (much less scaling up). TB is great compared to 10GbE, but not to 100, much less the 200 used in the DGX Spark, or the 400/800 used in bigger servers.
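For rough scale (nominal line rates only, ignoring encoding and protocol overhead; the Thunderbolt figure is the nominal 80Gbps data rate), here's a quick sketch of how long moving 100 GB takes over each link:

```python
# Back-of-the-envelope: seconds to move 100 GB at nominal line rate.
# Real-world throughput is lower, so treat these as best-case comparisons.
PAYLOAD_GB = 100

links_gbps = {
    "10GbE":                10,
    "Thunderbolt 5 (data)": 80,
    "100GbE":               100,
    "200GbE (DGX Spark)":   200,
    "400GbE":               400,
    "800GbE":               800,
}

for name, gbps in links_gbps.items():
    seconds = PAYLOAD_GB * 8 / gbps
    print(f"{name:22s} {seconds:6.1f} s")
```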

Putting Ethernet at that speed in the Studio would be utterly impractical for the base model. But there's an easy solution. After all, they already make 10GbE optional. It would not be a stretch to make an OSFP or QSFP port with a really fast Ethernet chip behind it an option as well. And if they are supporting a mellanox chip already for their PCC servers, it's reasonable to use those for Studios as well, though there are also several other options. Perhaps Tim Cook has decided it's time to bury the hatchet?

Coming back to your second comment... If you think they're not connecting them with TB, then what do you think they're using? And why wouldn't they consider that tech for the studio, at least as an option?
 
I wrote something but it got erased. Suffice it to say, to briefly (don't take it personally) answer your question:

RDMA over Thunderbolt is for consumers to connect multiple Macs together to achieve something that could only previously be done via servers.

Apple is using a custom firmware OS they built derived from iOS, and the rumor is that Broadcom is providing the networking capability for PCC servers, so I highly doubt NVIDIA is involved. Plus, Apple hates NVIDIA for good reasons. I could be wrong, but I really doubt Apple wants to rely on NVIDIA for anything, especially something they deem important to them given everything they've gone through. Apple sparingly uses NVIDIA for ML research from what I've read, which might explain those APIs. That said, I'm willing to be wrong, and I don't really know. It's just that everything I've read mentions Apple, Broadcom, and now Google (for training models).
 
RDMA over Thunderbolt is for consumers to connect multiple Macs together to achieve something that could only previously be done via servers.

Consumers? Lol. Not what I'd call anyone buying $10k-$40k of Macs to do AI. In any case, "servers" is a useless label as it means whatever you want it to. And this isn't enabling something qualitatively new - it's a (huge) performance win, but you could do the same thing (much much more slowly) with gigabit ethernet.

Apple is using a custom firmware OS they built derived from iOS, and the rumor is that Broadcom is providing the networking capability for PCC servers, so I highly doubt NVIDIA is involved. Plus, Apple hates NVIDIA for good reasons. I could be wrong, but I really doubt Apple wants to rely on NVIDIA for anything, especially something they deem important to them given everything they've gone through. Apple sparingly uses NVIDIA for ML research from what I've read, which might explain those APIs. That said, I'm willing to be wrong, and I don't really know. It's just that everything I've read mentions Apple, Broadcom, and now Google (for training models).
You remain confused about the difference between nVidia's networking and AI products, I guess.

The mlx driver is for *mellanox* chips. It is not for doing any kind of AI. Well... it's all about AI at this point - what I mean is, those chips are for networking, and thus are used to link GPUs, but they are not doing any of the actual matrix math and other computations. They're used to move data around. So an mlx driver can *only* mean nVidia networking; it can't tell us anything about what GPUs are being used for training (or inference).

Apple's snit with nVidia was always (maybe after the first year or two) embarrassing and more harmful to Apple than nVidia. It's also ancient ridiculous history from a time when each company was a tiny fraction of its current size. It's long past time they got over it.

It's entirely possible that Apple is using Broadcom for scale-out and Mellanox for scale-up, though that would be a little surprising. Less surprising, if it turns out that they're serious about some of those patents they've filed, and the Mellanox is just standing in for better hardware that's not ready yet, because they need to shave off every ns of latency they can, and Mellanox is still better than Broadcom's gear.

As far as I can tell, there are exactly four possibilities that explain the existence of a mellanox driver:
1) Future inclusion in a Studio (likely as an option, not by default)
2) Intent to support this in a future Mac Pro (sadly unlikely)
3) Support in PCC servers
4) Experimental hardware that never sees the light of day

...but #4 is the least likely, as I don't see them building drivers for such hardware into a release OS.
 
Possibly related to Mellanox (or high-performance networking in general). The 26.3 releases contain something called “Lattice” which may be related to DPDK, the Data Plane Development Kit for high-performance networking.

Take this with a grain of salt as seemingly some of this sleuthing was done with AI, but the person who posted it is usually very reliable (Blacktop on mastodon).

[attached screenshot of the Mastodon post]
 
Wow, that's... surprising. Though perhaps only because I'm not up to date on this.

Last time I looked (a *long* time ago), DPDK was Intel's toolkit for packet processing in userspace. It was intended to improve performance and parallelization of networking tasks, compared to doing it all in the kernel. It was apparently very good at what it does, but I stopped looking at it when I realized it was completely inappropriate for my needs. More recently some competition has emerged, like VPP (and, to some extent, eBPF). But I know even less about that.

So Apple took that code for use in macOS? I would have bet on them revamping their network stack instead, but perhaps that really is too hard a problem to solve when alternatives already exist.

I think I remember that DPDK required specific support from the device and the device drivers. But I don't think that says anything about the presence of a kernel device driver for Mellanox chips.
 
Maybe? As I said, I would treat some of this with caution. This screenshot mentions iOS and I’m not sure why this kind of capability would be needed on iOS. I’m sure we’ll get confirmation soon if it is that. Some others have mentioned an acquisition Apple made in 2017 from a startup called “Lattice Data”.

Edit: looking at the Tahoe 26.3 beta there is no sign of Lattice anywhere. Given that, I would doubt the theory I put forward previously. I can’t imagine this feature would only be present in iOS.
 
Consumers? Lol. Not what I'd call anyone buying $10k-$40k of Macs to do AI.
As I've explained already in multiple comments spanning hundreds of words, Apple has provided what used to take $500K in computing costs, or $200K in server credits, in a form factor that fits on your desk for $40K or less. But the magic is that you can connect multiple Macs together. That's my point. If you and your friend wanted to get together and join your Macs to run a model, you can do that now. Please refer to my earlier analysis explaining just how good this is for consumers.

Well... it's all about AI at this point
I just said Broadcom is providing the networking for Apple’s PCC servers. If you have contrary rumors that suggest Mellanox networking is being used for Apple's PCC, then perhaps they are using NVIDIA's networking chips! And yes, I do know what Mellanox is. Please reread my comment! :)

it's a (huge) performance win, but you could do the same thing (much much more slowly) with gigabit ethernet.

That... is the entire point of RDMA over Thunderbolt: speed. Why the hell would Apple for the first time since 1997 put some esoteric port into the products based solely on the misguided notion that only NVIDIA can innovate in "AI?"
 
As I've explained already in multiple comments spanning hundreds of words, Apple has provided what used to take $500K in computing costs, or $200K in server credits, in a form factor that fits on your desk for $40K or less. But the magic is that you can connect multiple Macs together. That's my point. If you and your friend wanted to get together and join your Macs to run a model, you can do that now. Please refer to my earlier analysis explaining just how good this is for consumers.
What planet are you on? On mine, consumers are not going to do this.

Back in the day enthusiasts lugged their tower computers to LAN parties because that was the only way to enjoy lag-free multiplayer 3D games with friends. Ain't nobody doing that to run LLMs, there's millions of more interesting ways to spend time together. Even if you desire more AI slop in your life, it's far easier and more convenient to ask one of the usual cloud based suspects to deliver it on demand. No need to wait for the Saturday meetup.

Also, as far as I can tell, 128GB RAM per Mac is the minimum to be interesting when clustering Macs to run large models. As of right now, a 128GB Mac costs a minimum of $3500 in Mac Studio form, $4700 in MacBook Pro. These are not consumer Macs! The average Mac owner spent a fraction of those amounts on a base model Mini or Air with at most 16GB RAM. Clustering those together gets you nowhere.
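For a rough sense of that floor, here's some weights-only math (the parameter counts are just illustrative, and the 75% usable-RAM figure is my own guess; KV cache and runtime overhead push the real requirements higher):

```python
# Weights-only memory estimate for clustering 128GB Macs to host a big model.
# Ignores KV cache, activations, and runtime overhead, so real needs are higher.
import math

MAC_RAM_GB = 128
USABLE_FRACTION = 0.75  # assumption: leave headroom for the OS and the runtime

def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    return params_billion * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB

for params, bits in [(70, 8), (70, 4), (405, 4), (671, 4)]:
    gb = weights_gb(params, bits)
    macs = math.ceil(gb / (MAC_RAM_GB * USABLE_FRACTION))
    print(f"{params:>4}B @ {bits}-bit ~= {gb:6.1f} GB of weights -> {macs} x 128GB Mac(s) minimum")
```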

Your "analysis" isn't. The consistent theme I get out of your posts here is that you're starting from a desired conclusion (which is always an extremely fanboyish take), and trying to spin reality to support it. Thunderbolt RDMA is neat, and I'm glad Apple put it in, but it's never going to do anything for 99.99% of Mac owners.
 
Hi, all. Let’s try and avoid discussing each other personally, but feel free to comment on analysis/facts as you’d like. Thanks!
 
That... is the entire point of RDMA over Thunderbolt: speed. Why the hell would Apple for the first time since 1997 put some esoteric port into the products based solely on the misguided notion that only NVIDIA can innovate in "AI?"
I have no idea how to respond to this. At this point it feels like willful ignorance. "Esoteric port"? It's Ethernet. (Well, not just ethernet, maybe, depending on what Apple chooses to support, as you could get infiniband too. And maybe will, if they really are using this hw.) And you keep on bringing up AI and nVidia, when that is entirely irrelevant. At the hardware level, this is about networking, not AI, even though AI code is the (presumably) major beneficiary of this. I did my best to explain this back in post #67.
 
On mine, consumers are not going to do this.
There is literally an entire community of consumers dedicated to running local machine learning models. I've read about a bunch of people doing this. It's a popular topic on social media, believe it or not. And as I said in my analyses before, this is only the beginning. Apple made a major leap forward in democratizing access.

Is every consumer going to? No. Most Mac users don't even use 99% of the Mac's features. But from setup to plug-and-play, Apple's RDMA over Thunderbolt is absolutely consumer usable, and consumers are using it. You can check the plethora of social media posts, since that's all that most people care to "talk" about these days -- "AI."

Also, whether you agree that $40,000 for an equivalent to a $500K setup is consumer pricing or not, you're completely ignoring one major point: you literally cannot buy an H200 as a consumer. You can buy Macs, walk into an Apple Store, and set yourself up with a consumer equivalent to enterprise server farms. So yeah, it's consumer regardless of opinion on pricing. Oh, and that little thing about memory and GPUs and SSDs becoming 4X more expensive for consumer PCs. At the end of the day, $40,000 is a lot more consumer-grade, in the context of literally everything, than you're making it out to be lol

Your "analysis" isn't. The consistent theme I get out of your posts here is that you're starting from a desired conclusion (which is always an extremely fanboyish take), and trying to spin reality to support it

Also, to be clear, the analysis I was referring to is the one I literally spent multiple hours researching, writing, and verifying (on RDMA over Thunderbolt, how it changes the game, and why it's just the beginning of democratizing access for consumers), not my earlier comment spanning 2 sentences. I'm otherwise ignoring this comment.

Apple's snit with nVidia was always (maybe after the first year or two) embarrassing and more harmful to Apple than nVidia. It's also ancient ridiculous history from a time when each company was a tiny fraction of its current size. It's long past time they got over it.
NVIDIA literally does not care at all about the consumer market and never has. They only operated in it because it generated money, until the next thing came along that generated more money: enterprise servers for whatever bullshit cloud companies push.

Apple's "snit" is that NVIDIA puts zero effort for the end consumer beyond producing a product on spec sheet. This is clearly demonstrated from the time NVIDIA didn't give a damn at all about their chips in Macs causing massive issues, and it continues to this day: 1) horrible pricing, 2) NVIDIA refusing to put more than 32 GB in their products and being behind on nodes despite charging $2000, 3) ports literally melting for multiple generations of their top end GPUs, 4) missing entire parts of the fucking chip (ROPs), 5) NVIDIA claiming massive performance increases and using "AI" to cheat their way through it. Like seriously? They're a chip company, and they somehow manage to screw up every part of it. The only factual part of this snippet is that both companies are well bigger than they were at that point in time.

I have no idea how to respond to this. At this point it feels like willful ignorance. "Esoteric port"? It's Ethernet. (Well, not just ethernet, maybe, depending on what Apple chooses to support, as you could get infiniband too. And maybe will, if they really are using this hw.) And you keep on bringing up AI and nVidia, when that is entirely irrelevant. At the hardware level, this is about networking, not AI, even though AI code is the (presumably) major beneficiary of this. I did my best to explain this back in post #67.
Next time, please stick to addressing the original claims someone makes. You wondered if Apple would adopt QSFP in Macs for some reason related to PCC servers. And then wondered if they would adopt Mellanox switches in PCC. And then started making somewhat more definitive claims that they would, simply because they added Mellanox support to the Mac at some point.

I simply replied with a really basic opinion: no, I don't think they're doing either (adopting QSFP on the Mac / adopting Mellanox on PCC), and I'm only going off what I have read, rumor-wise and Apple-interview-wise. As I said, my comment citing multiple pieces of information and interviews explaining why I didn't think so got erased, and I'm genuinely sorry that I didn't spend time rewriting the entire comment. It surely might have helped. There really doesn't need to be a whole debate about it. I'm allowed to state my opinion, and my opinion can be wrong lol.

I never suggested any of the following: that Apple will use Thunderbolt for server connections; that QSFP is proprietary to NVIDIA; or that Mellanox has anything to do with AI computation or GPUs (beyond networking them).

I directly addressed everything you, @leman, and whoever else said with a short comment from the beginning:

1) I highly doubt Apple is going to add what are basically enterprise ports to this consumer product. It's just not something Apple chases. They are focused on Thunderbolt, and for good reason.

2) PCC has already been confirmed to be fully Apple silicon. Internal transformer models are trained primarily via Google TPU hardware, with some NVIDIA.

And I later clarified that Broadcom is doing the networking, according to rumors. I could be wrong! Rumors are stupid!

But for purposes of discussion, we can't ignore the critical detail that Broadcom, not NVIDIA, is the widely rumored partner for networking, no matter how many drivers Apple puts into the Mac. Oh, and the little fact that PCC servers are verified as stateless. The reason it's full Apple silicon and Apple is partnering with Broadcom on networking is that there's no way in hell NVIDIA is going to just give up all internal IP and full access to Apple to make sure PCC claims are verified. It must be full Apple silicon with Broadcom networking.

You and @leman and @mr_roboto can research for yourselves how Apple is partnering with Broadcom, not NVIDIA, for networking; reason that if NVIDIA were even hinted at being involved with PCC, that would be all over Wall Street commentary and news articles; consider that PCC servers make certain, guaranteed privacy claims; and come to a similar conclusion that it's highly unlikely NVIDIA is involved in PCC. I guess I made the cardinal sin of saying "I highly doubt" based on stuff I read.

I am a fan of what I've written comments about. I really like the achievements Apple is making and the progress they are making, and no one else has written about the impact it has. An example being: Apple introduced RDMA over Thunderbolt for users, and I explored what possibilities that opened up. I am allowed to like what they're doing. I can analyze something and demonstrate I'm a fan of it. That doesn't invalidate my points just because I like their work! :)

I'm ending my part of this discussion here. Thanks for reading my points if you made it this far.
 
And if you want even more reasoning on why I characterized QSFP as enterprise and esoteric:

QSFP ports by themselves cost a metric ton of money; require cooling and a lot more space than Thunderbolt; and can't simply be used in the same way as Thunderbolt, because they require enterprise-grade switches that interface with QSFP, which are also extremely expensive, take up even more space, and run extremely noisy and hot. Furthermore, QSFP is not plug and play. There is a lot of setup involved with many different options, and at idle such a switch draws literally 10X the electricity the Mac does, all for a switch that you're only using for RDMA on a hypothetical Mac. You need expertise to understand, pick the right stuff, and operate it. This is the definition of esoteric and enterprise. This isn't consumer use. The potential presence of QSFP to network PCC servers together, plus Apple doing RDMA for LLMs on the Mac, does not mean they will up and abandon their philosophy of consumer products. They seek simplicity and ease of use. Even PCC reflects this by providing high power and high privacy with zero effort from the user, even if they choose to use high-grade Ethernet or whatever other networking.

Throughout all of my comments on this thread, I've been heavily detailing what Apple believes, and from that, entertaining how this might affect consumers. I said repeatedly that Apple is about simplicity and democratizing access to advanced technology. RDMA over Thunderbolt for LLMs is both a beginning and a pure expression of this philosophy. Going from $500K server farms that are literally only accessible to a company, no matter how much money a consumer has, to putting all of that power into something beautiful on your desk that is remarkably easy to use is what Apple sought to do, and achieved.

I understand the potential reasoning, and I was enthusiastic to talk about it, and I sought to make my point clear and short; but my points and I have been repeatedly mischaracterized, which led me to explain my position further and further. And that's sad to me, because I'm pretty sure the conversation was genuine in wanting to discuss whether Apple would use QSFP on a Mac.
But it's simply unfair for anyone to ignore everything I say and misinterpret it, and then criticize my comments and dismiss them as crap analysis and "fanboyism" despite me being respectful in this. I hope this both explains my original points further and also why I wrote so much.

This is not in response to any commenter in particular but generally on this thread.
 
Gah, it's not even worth doing a point-by-point. But in case anyone wants some QSFP (QSFP*what*? 28? 56? 112? DD or not? there's a VERY big difference) gear that's not "enterprise and esoteric", Mikrotik has some switches starting at a list price of $599. Much bigger ones available mail order for under $2k. So, cheaper than 10GbE was until pretty recently.

Oh, and if you're willing to go used, you can do what I did - buy a few of these Dell switches. They're frequently under $500 including shipping for 6x 100Gbps and 48x 25Gbps ports.

At this rate we're soon going to be hearing about how LPDDR6 is coming to the M4. Sigh.
 
Ugh, that clown. And they keep commenting (on other forums) even with everything they predicted ending up wrong.

About 10 years ago some dude bought a bunch of billboards around here predicting a very specific date for armageddon. It came and passed, but he’s still out there collecting donations from suckers.
 
https://www.broadcom.com/company/news/product-releases/63146

https://investors.broadcom.com/news...troduces-industrys-first-800g-ai-ethernet-nic

Extremely interesting, exciting, and relevant given my previous mention of Broadcom being the rumored major partner for networking on Private Cloud Compute.

Apple characteristically doesn't use off-the-shelf parts, but this gives a bit of an expectation of what Broadcom is capable of engineering for networking. Partnering with Broadcom also allows PCC to be stateless, as Apple can create and verify from the ground up how each chip works.

My guess, as it has been from the beginning:

Apple will utilize Thunderbolt 5 (alongside standard 10Gb Ethernet) on the Mac for the foreseeable future, but Apple will work with Broadcom on completely custom networking for PCC servers, with performance in the range of Tomahawk 6 / Thor Ultra; both can use the InfiniBand Verbs API.
Both are just the start, and together they will enable unprecedented simplicity, privacy, power, and democratization of "AI" for consumers. You don't need NVIDIA to be a leader in "AI."

(And yes, Mellanox is networking, not GPUs, as I've literally argued from the beginning. Read what I've said carefully throughout all my comments here.)

For info on PCC:
 