Over at the other place, the consensus is that Apple silicon has stagnated. It is quite bizarre; people seem to have already forgotten that the M1 Ultra came out in March. Unreasonable expectations are a thing, I guess.
Just a little more color on this - I had a flashback. When I first went to AMD after my brief stint at Sun, I was working for the NexGen team in Milpitas - they had only very recently been bought by AMD and hadn’t yet moved to AMD headquarters. I was assigned to take over the ALUs for a K6 variant we were doing. There was a problem with the fab, so the expected node improvements weren’t going to materialize, yet the company had apparently promised the market we’d hit our speed goal anyway, somehow. As is the case in most CPUs, the clock frequency is pretty much constrained by the speed of the ALU. In particular, we had to be able to do, essentially, a three-input 32-bit addition within one clock cycle (along with various multiplexer delays in the path, potentially inverting one of the inputs, etc.).
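For anyone wondering how a three-input add fits in a cycle at all: the standard trick is a carry-save adder (a 3:2 compressor) that reduces the three operands to two with no carry propagation, so only a single carry-propagate add sits on the critical path. A minimal C model of the idea (just a sketch of the general technique, not the actual K6 gates):

```c
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Carry-save step (3:2 compressor): reduces three operands to a sum
 * word and a carry word with NO carry propagation -- each bit is
 * computed independently, so the delay is only one full-adder deep. */
static void csa(uint32_t a, uint32_t b, uint32_t c,
                uint32_t *sum, uint32_t *carry)
{
    *sum   = a ^ b ^ c;                           /* bitwise sum       */
    *carry = ((a & b) | (a & c) | (b & c)) << 1;  /* majority, shifted */
}

int main(void)
{
    uint32_t a = 123456789u, b = 987654321u, c = 42u;
    uint32_t s, k;
    csa(a, b, c, &s, &k);
    /* Only this final add has a carry chain -- it is the critical path. */
    uint32_t result = s + k;
    printf("%" PRIu32 " == %" PRIu32 "\n", result, a + b + c);
    return 0;
}
```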
I had just started and was asked to evaluate what I could do. The design was essentially a giant text file of spaghetti code, listing logic gates and their connections. The only “documentation” was hand-scrawled schematics of bits and pieces of it in my predecessor’s notebooks. I spent a week or so trying to make heads or tails of it and couldn’t understand any of the logic, though my predecessor gave me some vague suggestions about how it could be made faster.
My manager (one of two who shared responsibility, and who I really didn’t know) came to me and asked, “OK, you’ve looked at it. We need to speed it up by 10%. Can you do that?”
I paused for just a second and said, “Well, I don’t have any idea how, but even if I have to start from scratch and redo the design, I will get you your 10%.”
He smiled and said - and I will never forget this because it was so weird - “You are exactly my idea of the perfect microprocessor engineer.”
The punchline: I have no recollection at all as to whether I got it done or not, but since I kept my job, I assume so. I probably did something terrible like clock borrowing across latch boundaries to get the last 1% (I recall we used latches instead of flip-flops on that project - the last project we ever did that on, I think).
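For the unfamiliar: because a transparent latch stays open for a whole clock phase, a signal arriving late in one phase can flow through and finish in the next, as long as everything settles before the downstream latch closes - a flip-flop’s hard edge allows no such slack. A toy budget check, all numbers hypothetical, just to show the arithmetic:

```c
#include <stdio.h>

/* Toy model of latch-based time borrowing (numbers are made up).
 * With two-phase transparent latches, a path may overrun its own
 * phase budget by "borrowing" from the next phase, provided the
 * downstream logic still fits in what remains of that phase. */
int main(void)
{
    double phase_ns = 2.0;  /* each phase of a 4 ns (250 MHz) cycle */
    double path1_ns = 2.3;  /* phase-1 logic runs long...           */
    double path2_ns = 1.6;  /* ...but phase-2 logic is short        */

    double borrowed = path1_ns - phase_ns;  /* 0.3 ns borrowed */
    int ok = (borrowed <= 0.0) || (path2_ns + borrowed <= phase_ns);

    printf("borrowed %.1f ns; design %s timing\n",
           borrowed, ok ? "meets" : "fails");
    return 0;
}
```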
But my first 4 or 5 years at AMD (and my time at Exponential) were largely spent on that sort of optimization: given a chip on a process node, find a way to make it 10-20% better without changing the node. Each time we did that we’d also make small microarchitecture optimizations - adding support for different memory standards, a new instruction here or there, making the multiplier take 4 cycles instead of 5, etc.