Speaking of the M4 Ultra: the Nikkei article is behind a paywall ( https://asia.nikkei.com/Business/Te...xconn-to-produce-servers-in-Taiwan-in-AI-push ), but according to MR's summary ( https://forums.macrumors.com/thread...s-next-year-after-m2-ultra-this-year.2442148/ ), Apple will be replacing the M2 Ultra with M4 chips (which I assume eventually means the M4 Ultra) in its AI servers.
If true, that means their internal AI server development work will continue; the alternative would be Apple giving up on having its own AI servers and farming this out to someone like Google.
I'm wondering what the relative volume of M2 Ultra chips going into the AI servers vs. the Macs has been thus far, and how that will change going forward.
I've read reports that, while Apple is trying to develop an LLM that will enable most requests to be processed on-device (this was probably a key part of their decision to increase the base RAM to 16 GB), cloud connectivity will still be required for more demanding requests, hence the need for the AI servers.
...which leads to another interesting question: Will the decision whether to process requests locally or remotely sometimes depend on device capability? E.g., might some requests that would be sent to the cloud from a base M4 be processed locally on an M4 Ultra?