Welcome! I’m just another tech nerd speculator here, with no special or professional insight:
1. The bottleneck for large language model token generation is memory bandwidth. For this part of inference, both Ultras are about the same speed and nearly double the speed of each Max. The M1 generation, at least, appears not to reach its theoretical maximum bandwidth; later generations come closer.
The other important factor for running the largest, most useful models is memory capacity, where M2 Ultras also crush all Max variations.
I would hope the datacenter team would arrange to have the M2 Ultras fully populated with faster RAM than the consumer Ultras.
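To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes every generated token has to stream the full set of model weights from memory once, so bandwidth divided by model size gives a ceiling on tokens per second. The bandwidth figures are Apple's published peak numbers; the 70B-parameter model at 4 bits per weight and the function name are just my illustrative assumptions, and real throughput will land below these ceilings.

```python
# Apple's published peak unified memory bandwidth, in GB/s.
PEAK_BANDWIDTH_GBPS = {
    "M1 Max": 400,
    "M2 Max": 400,
    "M1 Ultra": 800,
    "M2 Ultra": 800,
}


def max_tokens_per_second(bandwidth_gbps, model_size_gb, efficiency=1.0):
    """Ceiling on generation speed if each token reads all weights once.

    efficiency is the (assumed) fraction of peak bandwidth actually achieved.
    """
    return bandwidth_gbps * efficiency / model_size_gb


# Hypothetical model: 70 billion parameters at 4 bits (0.5 bytes) per weight.
model_size_gb = 70e9 * 0.5 / 1e9  # 35 GB of weights

for chip, bandwidth in PEAK_BANDWIDTH_GBPS.items():
    ceiling = max_tokens_per_second(bandwidth, model_size_gb)
    print(f"{chip}: at most {ceiling:.1f} tokens/s")
```

The point is only the ratio: with double the bandwidth, an Ultra's ceiling is double a Max's, whatever model size you plug in. Lowering `efficiency` for a chip models the case where it does not reach its theoretical bandwidth.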
2. If these rumors are true, it would make sense for the M4 Max and Ultra to reach consumers later. Serving their user base with datacenter AI is going to require many hundreds of thousands of Ultras. They might need two to three quarters of M4 Ultra production before they can spare any to sell.
According to Semianalysis:
“The other indication that Cupertino is serious about their AI hardware and infrastructure strategy is they made a number of major hires a few months ago. This includes Sumit Gupta who joined to lead cloud infrastructure at Apple in March. He’s an impressive hire. He was at Nvidia from 2007 to 2015, and involved in the beginning of Nvidia's foray into accelerated computing. After working on AI at IBM, he then joined Google’s AI infrastructure team in 2021 and eventually was the product manager for all Google infrastructure including the Google TPU and Arm based datacenter CPUs.
He’s been heavily involved in AI hardware at Nvidia and Google who are both the best in the business and are the only companies that are deploying AI infrastructure at scale today. This is the perfect hire.”
Hopefully he is talking with the chip team about optimizing Apple Silicon's capabilities in this arena.
I mentioned Sumit Gupta in another thread here that you may find interesting. I’m also hoping for significantly increased memory bandwidth, plus a TPU-like accelerator for Mac Pros. Either way, WWDC should be exciting.