New ML framework from Apple called MLX


Might explain why the hype has died down a bit. Still impressive to see what Apple has accomplished in a relatively short period of time. The question for me is always: “What do they have planned for the *next* release?”

Of course there’ll be the skeptics, and while it’s true, Apple doesn’t have limitless engineering resources (obviously), we don’t know exactly what they leverage and to what extent tbh. On the other hand, we do know how Apple was run under Jobs (generally speaking) and I think it was said best when the iPhone dropped. “You skate to where the puck is going to be. Not to where it’s been” and I believe we’ve been seeing that play out.

There was a time people believed Apple was done. No one predicted a turnaround, let alone the milestones getting them to the trillion-dollar valuation. No one predicted the iPhone or the iPod before that. Years ago folks were laughed at on tech forums for even surmising whether or not Apple could design their own silicon, or whether they should. I’m sure some here might remember those discussions, and yet here we are.

We don’t yet know if there will be an M3 Ultra, but aside from that, what’s going to be even more interesting is what’s next with 2nm M4/M5 or what effort will be put into their GPUs specifically. Time will tell. I still think it’s a fantastic showing for a company relatively new to the space.
 

I’m sorry if I’m not quite following: your previous post made it sound like you were theorizing that Apple was deliberately holding back perfectly ready silicon designs that could otherwise be released right now. That is what I was pushing back against with my “Apple doesn’t have limitless engineering resources” comment.

However, if your argument is simply that Apple plans out their silicon releases and is preparing the field with software first, then yes, that makes sense. With their engineering resources they pursued hardware accelerated ray tracing first, but I do agree that we will see a rapid expansion of hardware accelerated ML capabilities, especially training, in the next couple of releases, and definitely by the expected 2nm chips, which should be the M5 if everything pans out. And indeed the development of Apple’s accelerated ray tracing technology followed the same pattern: software, then hardware.
 
With the Whisper fiasco fresh in my mind, I tentatively venture into the MLX performance comparisons again. As always, take them with a grain of salt.

There is a Medium post: https://towardsdatascience.com/mlx-vs-mps-vs-cuda-a-benchmark-c5737ca6efc9?gi=b169ee90b4c4

It requires a login (free) but I’ll repeat some of the findings here. The test is a Graph Convolutional Network (GCN) model. The writer tested an M1 Pro for the first three and two Nvidia V100s for the last two:

a) CPU
b) GPU using MPS
c) MLX
d) NVIDIA TESLA V100 PCIe
e) NVIDIA TESLA V100 NVLINK

The results are:
[Image: chart of the benchmark results]


MPS: 1.44x faster than CPU, not bad.

MLX: 4.52x faster than MPS, 6.5x faster than CPU, this is some serious stuff!

CUDA V100 PCIe & NVLINK: 2.35x and 2.57x faster than MLX, respectively.

The above is quoted from the article. His summary is:

To recap

Cool things:
  • We can now run deep learning models locally by leveraging the full power of Apple Silicon.
  • The syntax is pretty much similar as torch, with some inspirations from Jax.
  • No more device, everything lives in unified memory!
What’s missing:
  • The framework is very young, many features are missing yet. Especially for Graph ML, all sparse operations and scattering APIs are not available at the moment, making it complicate to build Message Passing GNNs on top of MLX now.
  • As a new project, it’s worth noting that both the documentation and community discussions for MLX are somewhat limited at present.
Worth checking out.
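
As a quick sanity check on the speedup figures quoted above, here is a minimal Python sketch that chains the relative factors. The numbers are the ones quoted from the article; the variable names and the `chain` helper are made up for illustration, and the V100-vs-CPU figures are simply derived by multiplication, not measured.

```python
# Relative speedups as quoted from the article.
mps_vs_cpu = 1.44          # MPS is 1.44x faster than CPU
mlx_vs_mps = 4.52          # MLX is 4.52x faster than MPS
v100_pcie_vs_mlx = 2.35    # V100 PCIe is 2.35x faster than MLX
v100_nvlink_vs_mlx = 2.57  # V100 NVLink is 2.57x faster than MLX


def chain(*factors):
    """Multiply relative speedups to get an end-to-end factor."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result


# Derived (not measured) end-to-end factors relative to the CPU run.
mlx_vs_cpu = chain(mps_vs_cpu, mlx_vs_mps)
v100_pcie_vs_cpu = chain(mlx_vs_cpu, v100_pcie_vs_mlx)
v100_nvlink_vs_cpu = chain(mlx_vs_cpu, v100_nvlink_vs_mlx)

print(f"MLX vs CPU:         {mlx_vs_cpu:.2f}x")
print(f"V100 PCIe vs CPU:   {v100_pcie_vs_cpu:.2f}x")
print(f"V100 NVLink vs CPU: {v100_nvlink_vs_cpu:.2f}x")
```

The chained MLX-vs-CPU factor comes out at roughly 6.5x, which is consistent with the figure quoted from the article.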
 
dada_dave:

You’re referring to this: “I just keep thinking that Apple is ranging the competition, seeing how they react and what they release in response knowing that what they’re keeping under wraps is *already* ahead. YMMV.”

Between the obvious disclaimer at the end and the fact that I was clearly speculating, I assumed that would have sufficed. Nevertheless, it doesn’t take away from the gist of my thoughts.

Clearly Apple skates to where the puck is going to be. I would imagine they’ve been testing the competition quite thoroughly, and know the limitations and advantages of the competing tech quite well. I apologize if I came across as saying Apple already “has working tech” rather than that it is “keeping the [roadmap] under wraps”.
 
Fact of the matter is MacBook Pro is the fastest notebook money can buy for these tasks. Sure, you can get a 4090 in a windows lapt— I mean, a 4090 mobile version lol. Except you can’t even run that at full speed, so for as much pontification there is about how it’s slower, it really isn’t.
Furthermore, unified memory will let you fit models that simply won't fit in the memory of a traditional GPU.

The fact that I can do this specific task in 100 seconds flat no matter where I am, and that I can run this program back-to-back-to-back-to-back at that speed for hours at a time, simply blows me away.
 

That's a bit of an exaggeration though, isn't it? I think we are all very excited about Apple Silicon, but let's not get too carried away. Sure, the M3 Max can get ahead when you are working with very large models, but then again you probably want a big cloud machine for that. In the vast majority of cases that are practically relevant today, models are considerably smaller, and Nvidia does offer much more raw oomph for machine learning.

Good news however is that Apple now has all the infrastructure in place to significantly improve on-GPU machine learning performance over the next few generations. I fully expect them to match at least Ampere in FP16/BF16 in the next 2-3 years, which in combination with unified memory will be sufficient to challenge Nvidia's hegemony in on-device machine learning at least.
 
iOS and macOS are differentiated in markets like, say, gaming. So iOS being massive doesn’t automatically make macOS an attractive target. That means that some developers will literally develop for iOS, Windows, Android, etc … but not macOS. Apple is attempting to change that.
It's crazy that some extremely popular games that work on iOS (e.g. Genshin Impact) don't have a macOS version at all. Not sure what kind of business reasoning went into that, but it should be extremely easy to port to macOS once you have an iOS version.

Sure, you can get a 4090 in a windows lapt— I mean, a 4090 mobile version lol. Except you can’t even run that at full speed, so for as much pontification there is about how it’s slower, it really isn’t.
Slightly off-topic here, and sorry to quote you specifically :p but I don't love how mobile chips are lately referred to as "not reaching their full potential", "not running at full speed" and similar. I see it a lot on YouTube channels that review gaming PCs. I think it's almost universally true that if you take a chip designed with power/thermal constraints in mind and remove those constraints, you may end up with a chip that performs much better. But that's a completely different use case!
 
"That'a bit of an exaggeration though, isn't it? I think we are all very exited about Apple Silicon, but let's not get too carried away. "

I think we're getting a bit off-topic, but the fact of the matter is that there was a time when Apple was written off as dead, and to some degree they still are, depending on the perspective and the market. Like I mentioned earlier, macOS still holds many millions of seats, so let's not get caught up in "percentages". We all know 12% of a very large number is still a very large number.

When I look to target something I focus on "concentration". Windows is huge, no doubt, but the competition, along with the somewhat paralyzing choices that already exist, may mean it isn't the best market to play in. I know it's seductive, and maybe it's the best choice for some things. However, I'd much rather play in an underserved market where less competition exists (you have to start somewhere if you want market demand to grow); it's just plain easier to compete. This will always be the case for Apple right up until iOS and macOS are so blurred that the argument of "Windows because it's bigger" fades. In fact, it's probably already a given that iOS devices exceed the total number of active user-facing Windows devices, so forgive me if I seem skeptical of the same old tired arguments. Still, there's likely more at play.

Many companies probably haven't forgotten what Apple did with music... iPod, iTunes, inexpensive and safe a la carte downloads that you owned and could tote around etc. They don't want Apple to repeat that dominance, so they're skeptical. Witness the recent CarPlay efforts. The OEMs are leery. But what is Apple to do if their users are being underserved? They're now forced to roll their own. Years ago, the OEMs and competition could hand-wave them away and laugh them off. That's no longer the case given Apple's immense size and amount of resources. Size and resources alone aren't enough, but as we can see, Apple is pretty good at executing, and the competition knows it and gets a bit more concerned when Apple branches out into new things.

I think people forget that this isn't the same Apple of the 80s and 90s, when they were small and reliant on others; when they were considered insignificant. Intel, Motorola, IBM, Nvidia... any of the third-party hardware companies... all of them said that the platform wasn't worth the effort, and maybe that's true. But if these companies don't want to offer competent wares for platforms other than Windows, then Apple will be forced to roll their own. The difference now is that Apple cannot be ignored, and these companies are now taking notice. The fact is that people were literally laughed off of tech forums when they suggested Apple might one day roll their own stuff. Yet here we are, with Apple Silicon being but one example.
 
Not to derail the thread, but it's almost a great thing that all these third party hardware and software companies ignored Apple for all these years, because otherwise we likely wouldn't have the Apple devices we have today, and we certainly wouldn't have Apple Silicon. Maybe it's a good thing that Intel, IBM, Nvidia etc didn't offer equally capable products for Apple kit over the years. It certainly seems that way to me. They're taking notice now.
 
I’m not convinced this is what happened. I can’t speak about pre-2000 Apple, but Intel seemed to offer Apple the chips they wanted. They even made an effort to improve iGPUs at Apple’s request. They just weren’t very good at them!
 
Apple first approached Intel to do a chip for the original iPhone but Intel was supposedly unwilling to make a chip* with the required specifications for what they considered a niche device. So Apple went with Samsung’s ARM chips and the rest is history.

*It should be noted that this would’ve still likely been an ARM chip, as Intel were themselves developing them after acquiring StrongARM from DEC, but they sold their StrongARM-derived XScale business off in 2006. So the end result might have been the same. Maybe. Of course, had Intel taken the iPhone proposal seriously and not sold their ARM division, who knows?

Edit: BTW if you want a good laugh, I found a TechCrunch article that details some of this history but also makes some predictions that, in hindsight, may not have been so accurate: that ARM chips will never rival x86 in the near future for raw performance, and that Intel will always have process leadership.


First, there’s simply no way that any ARM CPU vendor, NVIDIA included, will even approach Intel’s desktop and server x86 parts in terms of raw performance any time in the next five years, and probably not in this decade. Intel will retain its process leadership, and Xeon will retain the CPU performance crown. Per-thread performance is a very, very hard problem to solve, and Intel is the hands-down leader here. The ARM enthusiasm on this front among pundits and analysts is way overblown—you don’t just sprinkle magic out-of-order pixie dust on a mobile phone CPU core and turn it into a Core i3, i5, or Xeon competitor. People who expect to see a classic processor performance shoot-out in which some multicore ARM chip spanks a Xeon are going to be disappointed for the foreseeable future.
It’s also the case that as ARM moves up the performance ladder, it will necessarily start to drop in terms of power efficiency. Again, there is no magic pixie dust here, and the impact of the ISA alone on power consumption in processors that draw many tens of watts is negligible. A multicore ARM chip and a multicore Xeon chip that give similar performance on compute-intensive workloads will have similar power profiles; to believe otherwise is to believe in magical little ARM performance elves.

🙃

True, the author originally wrote that in 2011, but even so he quoted it in 2016, and the end of the decade saw the introduction of the M1 and of course a host of ARM-based server systems. And, well, we know what happened to Intel’s process leadership. To his credit, both in the original article he wrote in 2011 for Ars and in the one I linked to, he does lay out why ARM is such a threat to Intel regardless, but still …
 
I don’t doubt all of that is true, but I don’t think that qualifies as “ignored”. Intel etc are free to make business decisions that are in their perceived interest. Intel provided Apple with CPUs and integrated GPUs, often ahead of other customers (money!), where it was in their interest as well as Apple’s. I don’t think their shortsightedness on mobile CPUs is a product of them ignoring Apple, just their belief about the future of the industry. Unless there is an example of them doing that for other companies?
 
I suppose. I guess I would qualify that as ignoring Apple, not out of malice or a belief that Apple as a whole was worth ignoring, but certainly out of shortsightedness. Btw did you read the edit to my post? It’s amusing.
 
Lol yeah I did. I don’t have quotes, but I still regularly see claims on forums and Reddit that the benchmarks showing Apple Silicon beating or matching an Intel/AMD chip must be wrong because logically a “phone” chip can’t beat a desktop one!
 
I will also add that it appears that Intel’s solution to combat ARM is to further change the x86 ISA to be more ARM-like. (In some ways)

So Intel at least feels that ISA matters enough to change their own …

Edit: @Eric posting a raw link to another techboard thread seems to result in a server error. Using the link tool to turn text into a link works.
 
I went looking for a follow up to see if Jon Stokes acknowledged that he was wrong. The only thing I found was a quote from him after the WWDC Apple Silicon announcement on a badly mangled website.

Intel’s delay in introducing 10 nm processors may have contributed to Apple’s decision to go their own way. “Having their own microprocessor architecture is something they’ve wanted to do since the Jobs era, for sure, to not be beholden to an outside partner,” says Jon Stokes, author of Inside the Machine: An Illustrated Introduction to Microprocessors and Computer Architecture, and co-founder of technology site Ars Technica. “I think the tipping point was when ARM started to catch up to Intel in … performance, and Intel stalled in processor leadership.”

So he seems to have come around to the idea that Arm could actually catch up to Intel after all.
 
I’ve been looking for that video I saw some years ago where the presenter boldly stated: “x86 is dead they just don’t know it yet”.

By the time they realize it, it’ll be too late.
 

Thank you for your reply.

Sorry, but no, it’s not an exaggeration. I took that benchmark from what the thread was talking about: 100 seconds on an M3 Max vs. 8 seconds on the highest-end Nvidia 4090. You can’t get unconstrained 4090 performance in a laptop on battery. You just can’t. You can do work with large models on your MacBook without needing the cloud. Everything? No, not right now. But I’m not discounting this, sorry. It’s an impressive accomplishment to be able to do this on a laptop, and your comment ignores the fact that it’s not stagnant. If Apple followed that kind of logic, there would be no improvements whatsoever to this stuff, because it can be “done in the cloud.” I’m not saying you're creating a brand new algorithm with trillions of parameters. But the fact of the matter is you can do transcription on the go in 100 seconds flat wherever you are. You can’t claim the same for Windows, so my comment stands.

Side note:
Also, where is the same logic applied to Nvidia GPUs running this? I could just as easily say what you said about doing it in the cloud with regard to running this on Nvidia GPUs that are technically faster than an M3 Max in a MacBook. That kind of conversation just makes zero sense to me, no offense. If we are going to say that about Apple Silicon, then we can equally say it about running stuff locally on a machine with Nvidia, and then this whole conversation is pointless. There is a benefit to being able to work with models locally, and the fact is you can do stuff on a MacBook that you just can’t on a Windows laptop on battery. I’m willing to be proven wrong, so if you can find me a Windows laptop that runs at full speed on battery doing this stuff with models larger than 24 GB, I’m happy to delete my account! Haha
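
To put rough numbers on the "models larger than 24 GB" point, here is a back-of-the-envelope Python sketch. It assumes the weights dominate memory use (ignoring activations, caches, and framework overhead). The parameter counts and the 128 GB unified-memory budget are hypothetical examples, not measurements; the 24 GB budget is the figure mentioned above.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_gb(n_params: float, dtype: str) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def fits(n_params: float, dtype: str, budget_gb: float) -> bool:
    """True if the weights alone fit within the memory budget."""
    return weight_gb(n_params, dtype) <= budget_gb


GPU_BUDGET_GB = 24       # figure mentioned in the post
UNIFIED_BUDGET_GB = 128  # hypothetical unified-memory configuration

for n_params in (7e9, 13e9, 70e9):  # hypothetical model sizes
    size = weight_gb(n_params, "fp16")
    print(
        f"{n_params / 1e9:.0f}B params @ fp16: {size:.0f} GB, "
        f"fits 24 GB: {fits(n_params, 'fp16', GPU_BUDGET_GB)}, "
        f"fits 128 GB: {fits(n_params, 'fp16', UNIFIED_BUDGET_GB)}"
    )
```

Under these assumptions a 13B-parameter model at fp16 already needs about 26 GB for weights alone, so it would not fit in a 24 GB budget but would fit in a larger unified-memory pool.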
 