Apple open-sources on-device AI models - your thoughts?

tomO2013

Power User
Joined
Nov 8, 2021
Posts
118
I read over at the other place that Apple has released an open, decoder-only transformer language model that can run on device: OpenELM.

The premise behind the project is something I find quite interesting, and it gives us some insight into the strategy Apple wants to take with ML. Not least with respect to the M4: there's increasing evidence that Apple will likely devote any additional available transistor budget to bolstering ML (beyond all the existing marketing/hype/rumors we're hearing about iOS 18 and a smart(er) Siri).



https://arxiv.org/pdf/2404.14619
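For anyone who wants to poke at the models directly, the checkpoints are also published on Hugging Face. Below is a minimal generation sketch in Python; the repo id and the Llama-2 tokenizer fallback are from memory rather than from the paper, so treat them as assumptions and check the model card first.

# Minimal text-generation sketch for a small OpenELM checkpoint.
# Assumptions: the "apple/OpenELM-270M-Instruct" repo id and the Llama-2
# tokenizer (OpenELM reuses an external tokenizer) - verify on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M-Instruct"   # assumed repo id
tokenizer_id = "meta-llama/Llama-2-7b-hf"  # assumed tokenizer; gated, needs license acceptance

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Summarize in one sentence: Apple has released OpenELM, a family of small open language models."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))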





 
These are all fairly small models, too small to be really useful IMO. They can certainly be used for some basic text summarization tasks or similar, but their performance does not even come close to the current state of the art. Probably enough to have Siri generate relatively natural-sounding responses, however.
 
too small to be really useful IMO
Well yes, but it depends what you are looking for.

I could imagine a potentially valuable subset of AI using medium or even small language models (M- or S-LMs) instead of LLMs.
These would be much more limited/focused on a specific application, like a tool in a tool belt.

They would be much more likely to near-completely cover an issue/question with a much smaller data footprint - and therefore lower storage needs and computational demands. They'd also be much less likely to hallucinate due to poor data coverage of the question asked.

Interesting economic side effect: since this wouldn't need mega-installations - the entire "tool" fits on a portable device with no tether to a remote server farm - the creator/vendor would retain full control of the tool, not having to share data and programming with other players...
 
Well yes, but it depends what you are looking for.

I could imagine a potentially valuable subset of AI using medium or even small language models (M- or S-LMs) instead of LLMs.
These would be much more limited/focused on a specific application, like a tool in a tool belt.

They would be much more likely to near-completely cover an issue/question with a much smaller data footprint - and therefore lower storage needs and computational demands. They'd also be much less likely to hallucinate due to poor data coverage of the question asked.

Interesting economic side effect: since this wouldn't need mega-installations - the entire "tool" fits on a portable device with no tether to a remote server farm - the creator/vendor would retain full control of the tool, not having to share data and programming with other players...

If your problem is well defined and small enough, it could work. I wonder how much it costs to train a model like that.
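On the cost question, a common back-of-envelope is roughly 6 × parameters × training tokens FLOPs for a dense transformer. The sketch below plugs in OpenELM-style sizes; the ~1.8T-token count, A100-class throughput, utilization and hourly price are illustrative assumptions, not Apple's numbers.

# Back-of-envelope pretraining cost via the ~6 * N * D FLOPs rule of thumb.
# All constants below are assumptions for illustration only.

def training_cost_usd(params, tokens, gpu_flops=3.1e14, mfu=0.4, usd_per_gpu_hour=2.0):
    """Total FLOPs divided by effective GPU throughput, priced per GPU-hour."""
    total_flops = 6 * params * tokens
    gpu_hours = total_flops / (gpu_flops * mfu) / 3600
    return gpu_hours * usd_per_gpu_hour

for params in (270e6, 1.1e9, 3e9):  # OpenELM-style model sizes
    print(f"{params / 1e9:.2f}B params -> ~${training_cost_usd(params, tokens=1.8e12):,.0f}")

With those assumptions the 270M model comes out around $10-15k of GPU time and the 3B one in the low hundreds of thousands of dollars - cheap by frontier-model standards, but not a hobbyist project.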

A very interesting part of this work is that it was done using Apple's own training framework: https://github.com/apple/corenet
 
Apple’s been building its on-device ML capabilities for some years, and OpenELM seems a logical progression.

Some people, right now, actively use LLMs to condense large documents into short summaries. But by feeding such documents to a server-hosted LLM they’re making a free gift of material that is often under copyright! If and when Apple (or anyone else) succeeds in keeping this on-device, then it’s a return to fair use.

I’m cautiously optimistic about Apple’s approach, but with the usual caveats around LLMs’ uncontrolled tendency to present unwarranted conclusions (‘hallucinate’).
 
Some people, right now, actively use LLMs to condense large documents into short summaries. But by feeding such documents to a server-hosted LLM they’re making a free gift of material that is often under copyright! If and when Apple (or anyone else) succeeds in keeping this on-device, then it’s a return to fair use.

To do these kinds of tasks (and also other things, like programming assistance), you need larger models. I really hope we will be able to run 2-3 trillion parameter models on device; that will still take some time ^^ Although a 4-bit quantized trillion-parameter model comes out to roughly 500GB of weights, which starts to become feasible with some smart caching architecture...
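To put rough numbers on the footprint side: weight memory is roughly parameters × bits-per-weight ÷ 8, ignoring the KV cache and activations (which grow with context length). A quick sketch, with the model sizes picked arbitrarily:

# Rough weight-memory estimate: params * bits-per-weight / 8 bytes.
# Ignores KV cache and activation memory, which matter at long contexts.

def weights_gb(params, bits):
    return params * bits / 8 / 1e9

for params, label in [(3e9, "OpenELM-3B"), (70e9, "70B"), (1e12, "1T")]:
    for bits in (16, 4):
        print(f"{label:>10} @ {bits:>2}-bit: {weights_gb(params, bits):8.1f} GB")

So a 3B model quantized to 4 bits is only ~1.5GB, while even an aggressively quantized trillion-parameter model is still around half a terabyte - hence the need for that smart caching/streaming architecture.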
 
These are all fairly small models, too small to be really useful IMO. It can certainly be used for some basic text summarization tasks or similar, but the performance of these models does not even come close to the current state of the art. Probably enough to have Siri generate relatively naturally sounding responses however.
To be fair, what I actually want out of on-device intelligence is small assistance stuff: work/appointment scheduling, intelligent responses to inbound mail and requests while I'm busy trying to focus, and so on.

Larger content generation workloads that need more processing are better off pushed to a cloud server that isn't dependent on my dinky little mobile device battery for power.
 