M4 Mac Announcements

Speaking of the M4 Ultra, this is behind a paywall ( https://asia.nikkei.com/Business/Te...xconn-to-produce-servers-in-Taiwan-in-AI-push ) but, according to MR's summary ( https://forums.macrumors.com/thread...s-next-year-after-m2-ultra-this-year.2442148/ ), Apple will be replacing the M2 Ultra with M4 chips (which I assume eventually means the M4 Ultra) in its AI servers.

If true, that means their internal AI server development work will continue; the alternative would be Apple giving up on having its own AI servers and farming this out to someone like Google.

I'm wondering what the relative volume of M2 Ultra chips going into the AI servers vs. the Macs has been thus far, and how that will change going forward.

I've read reports that, while Apple is trying to develop an LLM that will enable most requests to be processed on-device (this was probably a key part of their decision to increase the base RAM to 16 GB), cloud connectivity will still be required for more demanding requests, hence the need for the AI servers.

...which leads to another interesting question: Will the decision whether to process requests locally or remotely sometimes depend on device capability? E.g., might some requests that would be sent to the cloud from a base M4 be processed locally on an M4 Ultra?
 
Speaking of the M4 Ultra, this is behind a paywall ( https://asia.nikkei.com/Business/Te...xconn-to-produce-servers-in-Taiwan-in-AI-push ) but, according to MR's summary ( https://forums.macrumors.com/thread...s-next-year-after-m2-ultra-this-year.2442148/ ), Apple will be replacing the M2 Ultra with M4 chips (which I assume eventually means the M4 Ultra) in its AI servers.

If true, that means their internal AI server development work will continue; the alternative would be Apple giving up on having its own AI servers and farming this out to someone like Google.

I'm wondering what the relative volume of M2 Ultra chips going into the AI servers vs. the Macs has been thus far, and how that will change going forward.

I've read reports that, while Apple is trying to develop an LLM that will enable most requests to be processed on-device (this was probably a key part of their decision to increase the base RAM to 16 GB), cloud connectivity will still be required for more demanding requests, hence the need for the AI servers.

...which leads to another interesting question: Will the decision whether to process requests locally or remotely sometimes depend on device capability? E.g., might some requests that would be sent to the cloud from a base M4 be processed locally on an M4 Ultra?
I assume the differentiator would be the amount of memory to hold the model, and not the processing capabilities of the chip?
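If that's the case, the check could be as simple as "does the model plus some working headroom fit in free RAM right now?" Here's a purely hypothetical sketch of what such a routing decision might look like; the thresholds, names, and logic are mine, not anything Apple has described:

```python
# Hypothetical sketch: route a request locally or to the cloud based on
# whether an on-device model (plus working headroom) fits in free RAM.
# All numbers and names are illustrative assumptions, not Apple's logic.

ON_DEVICE_MODEL_BYTES = 2 * 1024**3   # assume a ~2 GB resident model
HEADROOM_BYTES = 1 * 1024**3          # KV cache, activations, etc.

def route_request(free_ram_bytes: int, is_demanding: bool) -> str:
    """Return 'local' or 'cloud' for a single request."""
    fits = free_ram_bytes >= ON_DEVICE_MODEL_BYTES + HEADROOM_BYTES
    return "local" if fits and not is_demanding else "cloud"

# A RAM-starved machine sends even simple requests out; a large-RAM
# machine keeps them on-device.
print(route_request(free_ram_bytes=1 * 1024**3, is_demanding=False))   # cloud
print(route_request(free_ram_bytes=16 * 1024**3, is_demanding=False))  # local
```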
 
I've read reports that, while Apple is trying to develop an LLM that will enable most requests to be processed on-device (this was probably a key part of their decision to increase the base RAM to 16 GB), cloud connectivity will still be required for more demanding requests, hence the need for the AI servers.

...which leads to another interesting question: Will the decision whether to process requests locally or remotely sometimes depend on device capability? E.g., might some requests that would be sent to the cloud from a base M4 be processed locally on an M4 Ultra?

A lot of what Apple is doing is with 3B SLMs, which should keep things manageable for the on-device scenarios. While I don’t have much insight on how much RAM these use during inference, I would not be surprised if it is close to 2GB. That depends on how much they can shrink the model using adapters for the different tasks. You don’t really want a feature that needs 25% of your RAM every time you want to summarize a notification or re-tone an email (maybe on iOS you can get away with this), and you likely want the model resident in memory to handle requests on the fly whenever there’s enough memory. That’s more where the RAM bump comes from I think.
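For rough sizing, parameter count times bytes per weight gives the resident footprint, so a 3B model quantized to around 4 bits lands right in that ~2 GB neighborhood before you add the KV cache and adapters. Quick back-of-the-envelope (my arithmetic, not Apple's figures):

```python
# Back-of-the-envelope RAM footprint for a 3B-parameter model at a few
# quantization levels; adapters and the KV cache add on top of this.

def model_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1024**3

for bits in (16, 8, 4):
    print(f"3B params @ {bits}-bit: {model_footprint_gb(3, bits):.2f} GB")
# 16-bit ≈ 5.59 GB, 8-bit ≈ 2.79 GB, 4-bit ≈ 1.40 GB
```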

So no, I don’t think the M4 Ultra will handle more requests locally. That complicates the engineering in ways which seem very unlikely.
 
No need for AGP these days; the GPU is now integrated into the ASi chip...
Hey, the upgradability was nice. I had one upgraded from a 400 MHz PowerPC G4 7400 + Nvidia GeForce 2MX to a 1.8 GHz G4 7447A + Nvidia GeForce 6200.

I would think an all-new Mac Pro Cube would basically be a taller variant of the Mac Studio, to allow for the larger cooling subsystem an Mn Extreme chip would require...
I don't think we'll ever get anything as close to the Cube spiritually as the trashcan Mac Pro, unfortunately.
 
I assume the differentiator would be the amount of memory to hold the model, and not the processing capabilities of the chip?
Yeah, I suspected that as well when I was thinking about the base M4 vs. the Ultra.
A lot of what Apple is doing is with 3B SLMs, which should keep things manageable for the on-device scenarios. While I don’t have much insight on how much RAM these use during inference, I would not be surprised if it is close to 2GB. That depends on how much they can shrink the model using adapters for the different tasks. You don’t really want a feature that needs 25% of your RAM every time you want to summarize a notification or re-tone an email (maybe on iOS you can get away with this), and you likely want the model resident in memory to handle requests on the fly whenever there’s enough memory. That’s more where the RAM bump comes from I think.
In Dec 2023, Apple engineers published a paper proposing a more efficient way to split LLM parameter storage between DRAM and SSD so they could run larger (14 GB) LLMs on-device on hardware with limited RAM [1]. So while Apple's production on-device LLMs may be smaller, 2 GB could be an underestimate. I.e., Apple's way to avoid using too much RAM may be SSD caching rather than simply limiting the model size. Of course, Apple publishes a lot of stuff they don't implement, so it's possible they will not do this.

But if they do, then the difference in LLM operation between a large-RAM and a small-RAM device may not be on-device processing vs. sending to the cloud, but rather being able to keep the model resident in RAM vs. having to split it between RAM and the SSD.

"Currently, the standard approach is to load the entire model into DRAM (Dynamic Random Access Memory) for inference (Rajbhandari et al., 2021; Aminabadi et al., 2022). However, this severely limits the maximum model size that can be run. For example, a 7 billion parameter model requires over 14GB of memory just to load the parameters in half-precision floating point format, exceeding the capabilities of most personal devices such as smartphones."

[1] Alizadeh K, Mirzadeh I, Belenko D, Khatamifard K, Cho M, Del Mundo CC, Rastegari M, Farajtabar M. LLM in a flash: Efficient large language model inference with limited memory. arXiv preprint arXiv:2312.11514. 2023 Dec 12.

Link: https://arxiv.org/pdf/2312.11514
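To make the DRAM/flash split in [1] concrete, here's a toy sketch of the general idea (not the paper's actual windowing or row-column bundling techniques): keep the full weight set memory-mapped on SSD and only copy the handful of layers currently in use into a small RAM cache. Dimensions are deliberately tiny so it actually runs:

```python
# Toy illustration of the DRAM/SSD split: weights live in a memory-mapped
# file on disk, and only a few "hot" layers are copied into RAM at a time.
# This shows the concept only, not the techniques in arXiv:2312.11514.
import numpy as np

HIDDEN, N_LAYERS = 256, 16            # toy sizes, not a real model
MAX_RESIDENT_LAYERS = 4               # DRAM budget, in layers

# Pretend this file holds all layer weights, written once at install time.
weights = np.memmap("model_weights.bin", dtype=np.float32,
                    mode="w+", shape=(N_LAYERS, HIDDEN, HIDDEN))

ram_cache: dict[int, np.ndarray] = {}

def get_layer(i: int) -> np.ndarray:
    """Return layer i, pulling it from SSD into RAM only on demand."""
    if i not in ram_cache:
        if len(ram_cache) >= MAX_RESIDENT_LAYERS:
            ram_cache.pop(next(iter(ram_cache)))   # evict the oldest layer
        ram_cache[i] = np.array(weights[i])        # copy flash -> DRAM
    return ram_cache[i]

x = np.ones(HIDDEN, dtype=np.float32)
for layer in range(N_LAYERS):
    x = get_layer(layer) @ x          # only a few layers resident at once
```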
 
In Dec 2023, Apple engineers published a paper proposing a more efficient way to split LLM parameter storage between DRAM and SSD so they could run larger (14 GB) LLMs on-device on hardware with limited RAM [1]. So while Apple's production on-device LLMs may be smaller, 2 GB could be an underestimate. I.e., Apple's way to avoid using too much RAM may be SSD caching rather than simply limiting the model size. Of course, Apple publishes a lot of stuff they don't implement, so it's possible they will not do this.

Even if we don't see this at the user level, it could be useful in the datacenter, and that may be where Apple is thinking of deploying it (if they haven't already). Think of an Mn Ultra with a handful of these secure VMs running on it. Minimizing RAM usage there means you can host more in parallel on a node and reduce costs.

That said, I'd probably need to see a use case where it makes sense to spend that much RAM and disk space on an LLM on-device for the end user. The SLMs in iOS 18/macOS 15 are larger in terms of parameters than many of the GPT-3 models as it is (outside of the larger curie and davinci models), and the adapters on top should make them more capable than their parameter count alone suggests.
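By "adapters" I mean small low-rank weight deltas layered on a frozen base model, LoRA-style; whether Apple's adapters work exactly this way is an assumption on my part, but the rough shape of the idea looks like this toy sketch:

```python
# Toy LoRA-style adapter: a frozen, shared base weight matrix plus a tiny
# per-task low-rank correction. Many task adapters can share one base model
# because each adapter adds only 2*d*r parameters instead of d*d.
import numpy as np

d, r = 1024, 16                       # hidden size, adapter rank (toy values)
rng = np.random.default_rng(0)

W_base = rng.standard_normal((d, d)).astype(np.float32)    # frozen, shared
A = rng.standard_normal((d, r)).astype(np.float32) * 0.01  # per-task, trained
B = np.zeros((r, d), dtype=np.float32)                     # per-task, trained

def adapted_forward(x: np.ndarray) -> np.ndarray:
    # Base path plus the low-rank correction; only A and B differ per task.
    return x @ W_base + (x @ A) @ B

x = rng.standard_normal((1, d)).astype(np.float32)
print(adapted_forward(x).shape)                            # (1, 1024)
print(f"adapter params: {A.size + B.size:,} vs base: {W_base.size:,}")
```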

I'm actually trying to get some folks on my end to look at SLMs for cost reasons in their thinking about features. Right now a lot of it is "Natural language? Throw an LLM at it," which makes certain features a lot more expensive than they need to be. I'm also working with folks to see if we can reduce how much work the model actually has to do before we pass things off to a more classical algorithm, to improve accuracy in some scenarios.
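As an illustration of what I mean, here's a hypothetical sketch of that "classical algorithm first, model only as a fallback" pattern; call_slm is just a stand-in for whatever model endpoint you'd actually use:

```python
# Hypothetical sketch: handle the easy, well-structured cases with a cheap
# deterministic parser and only send the leftovers to a (more expensive)
# language model. `call_slm` is a stand-in, not a real API.
import re

DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

def call_slm(text: str) -> str:
    raise NotImplementedError("stand-in for a real small-language-model call")

def extract_date(text: str) -> str:
    match = DATE_PATTERN.search(text)
    if match:                        # cheap, deterministic, easy to test
        return match.group(0)
    return call_slm(text)            # model only sees the messy inputs

print(extract_date("Meeting moved to 2024-11-05 at 3pm"))   # classical path
```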
 