Apple only “major” device maker on 3nm in 2023?

Cmaier

Site Master
Staff Member
Site Donor
Posts
5,326
Reaction score
8,512

This article seems fishy… MediaTek and Qualcomm haven’t decided yet whether to ship 3nm in 2023? It takes quite some time to design chips, and you need to know what node you’re designing for. Nobody is waiting until the last minute to make these sorts of decisions. Unless they want to do all the design work and then not ship anything. Companies love paying for all that work for nothing.
 

Colstan

Site Champ
Posts
822
Reaction score
1,124
This is Digitimes, and they have a storied history with shady reporting, making Uncle Gurman look like Svengali. They tend to be good at reporting information about actual production, once it has commenced, but their predictions are questionable, at best.

This brings me back to earlier times, when Digitimes claimed that the Athlon XP would eventually ship with a 400MHz FSB, when nobody else was reporting such rumors. They weren't getting any real leaks, just guessing based upon AMD's plans. Much like Gurman and Kuo, we tend to remember their successes, while forgetting their failures, always distracted by the next shiny rumor.
 

theorist9

Site Champ
Posts
613
Reaction score
563
I don't know how reliable sammobile.com is, but in this Jan 2, 2023 article they say that Qualcomm will be sourcing 3 nm chips from both TSMC (FinFET) and Samsung (GAAFET) ( https://www.sammobile.com/news/samsung-foundry-manufacture-snapdragon-8-gen-3-chips/ ).

Samsung has already shipped its 3 nm chips to Chinese bitcoin miner PanSemi ( https://finbold.com/samsung-to-prod...ps-secures-chinese-asic-firm-as-1st-customer/ ).

They also say Google may be using the Samsung 3 nm process for the Tensor G3 in the upcoming Pixel 8 (https://www.sammobile.com/news/google-tensor-g3-custom-version-exynos-2300-samsung-4nm/).
 

dada_dave

Elite Member
Posts
2,162
Reaction score
2,145

Cmaier said:
This article seems fishy… MediaTek and Qualcomm haven’t decided yet whether to ship 3nm in 2023? It takes quite some time to design chips, and you need to know what node you’re designing for. Nobody is waiting until the last minute to make these sorts of decisions. Unless they want to do all the design work and then not ship anything. Companies love paying for all that work for nothing.

Yeah, plus I understand that most of the contracts are worked out in advance too, and reportedly TSMC recently included penalties if a customer doesn’t fulfill its commitment. It may be that Apple will constitute the majority or totality of 3nm and 3nm+ node space depending on what’s available and what Apple reserved, but it does seem unlikely that these other companies are only making these decisions now.
 

Yoused

up
Posts
5,620
Reaction score
8,937
Location
knee deep in the road apples of the 4 horsemen
theorist9 said:
I don't know how reliable sammobile.com is, but in this Jan 2, 2023 article they say that Qualcomm will be sourcing 3 nm chips from both TSMC (FinFET) and Samsung (GAAFET) ( https://www.sammobile.com/news/samsung-foundry-manufacture-snapdragon-8-gen-3-chips/ ).

This article says that Samsung 3nm has a 10% yield rate while TSMC turns out 80%. Seems FinFlex is easier to accomplish than GAAFET. If Qualcomm plans to use both, that must be a challenge, to adjust the masks between the two.
 

Cmaier

Yoused said:
This article says that Samsung 3nm has a 10% yield rate while TSMC turns out 80%. Seems FinFlex is easier to accomplish than GAAFET. If Qualcomm plans to use both, that must be a challenge, to adjust the masks between the two.

I’ve lost track of all the marketing names, but if FinFlex is a vertical fin then it’s tons easier than GAAFET.

For GAAFET you need multiple steps for each horizontal fin. I can only guess at the process steps from looking at the structure, but it seems to me you’d need to pattern each individually, deposit oxide between each, etc. You may need two patterning steps per fin, at least.
 

theorist9

Yoused said:
This article says that Samsung 3nm has a 10% yield rate while TSMC turns out 80%. Seems FinFlex is easier to accomplish than GAAFET. If Qualcomm plans to use both, that must be a challenge, to adjust the masks between the two.

The article I originally linked also mentioned the difference in yield rates, but said Samsung was able to raise its yield from 20% to 60-70% through a collaboration with Silicon Frontline Technology:

"According to a report from BNext, Samsung Foundry will manufacture some Qualcomm Snapdragon 8 Gen 3 chipsets on its 3nm GAA (GAAFET) node. The majority of the chips will be manufactured by Taiwanese firm TSMC, though, using its 3nm FinFET process. This is reportedly due to TSMC’s higher yield of around 75-80% per wafer. Reports indicate that Samsung’s yield for its 3nm chips is around 60-70%. It is rumored that the South Korean firm’s yield was a measly 20% per wafer before it partnered with the US-based semiconductor firm Silicon Frontline Technology."

If you follow the last link it gets more specific, saying:

"The American firm offers chip qualification evaluation and ESD (Electrostatic Discharge) prevention technology. ESD is one of the leading causes of defects in semiconductor chips, and it is caused by friction between equipment and metal during the manufacturing process. Samsung has reportedly been working with Silicon Frontline for a long time in chip design and production processes and has achieved satisfactory results. The company will now use the firm’s technology in the chip verification process."
 

dysamoria

Member
Posts
10
Reaction score
9
Location
Coplay, The Corporate States of America
Instagram
Main Camera
Canon
I almost wish that we would, as a civilization, just finally reach the absolute maximum shrinkage of chips possible via physics, so that the bulk of the effort goes into optimization and efficiency of engineering, manufacturing, and software... because software sucks and constantly replacing hardware is incredibly wasteful.
 

Yoused

I wonder if FinFlex will give Apple the opportunity to build processors using predominantly 2-1 and 2-2 gates while switching the performance (Max/Ultra) models to predominantly the faster 3-2 gates without having to make significant changes to the floor plan (use basically the same reticle with only small changes in gating patterns).
 

Cmaier

Yoused said:
I wonder if FinFlex will give Apple the opportunity to build processors using predominantly 2-1 and 2-2 gates while switching the performance (Max/Ultra) models to predominantly the faster 3-2 gates without having to make significant changes to the floor plan (use basically the same reticle with only small changes in gating patterns).

Doesn’t work like that, and I don’t know why TSMC is confusingly marketing it as if you have an entire design on 2-1 and another design on 3-2.

When you design the CPU you use whatever transistor you need in order to achieve your timing/power/IR drop/electromigration goals. A given logic structure will mix and match different kinds of transistors depending on whether they are on a timing-critical path, whether they need to drive long wires, etc.

With planar MOSFETs, you’d adjust the gate width-to-length ratio to adjust the current-driving capability of the transistor (and adjust the input capacitance as well). You could arbitrarily size the gate polygon (and even play with its shape). With FinFETs you are stuck with quantized size choices, and the number of fins is a second factor that you can adjust (equivalent to changing the area of a MOSFET gate). [In this paragraph, when I say “gate” I mean transistor gate, not logic gate.]
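
To put rough numbers on the quantization point, here is a minimal Python sketch. The fin dimensions and the 3-fin cap are made-up illustrative values, not TSMC's, and `finfet_width_for` is a hypothetical helper. The only idea taken from the post is that a planar gate can be drawn at any width, while a FinFET gets a whole number of fins.

```python
import math
from typing import Tuple

def finfet_width_for(target_um: float, w_per_fin_um: float, max_fins: int) -> Tuple[int, float]:
    """Pick the smallest whole number of fins that meets the target width.

    Returns (fin_count, effective_width_um). The result is clamped to
    1..max_fins because you cannot draw a fraction of a fin.
    """
    fins = min(max_fins, max(1, math.ceil(target_um / w_per_fin_um)))
    return fins, fins * w_per_fin_um

# ASSUMED geometry, for illustration only: the effective width of one fin is
# roughly two sidewalls plus the top of the fin.
FIN_HEIGHT_UM = 0.050
FIN_THICKNESS_UM = 0.006
W_PER_FIN_UM = 2 * FIN_HEIGHT_UM + FIN_THICKNESS_UM

for target_um in (0.08, 0.15, 0.25):
    fins, got_um = finfet_width_for(target_um, W_PER_FIN_UM, max_fins=3)
    overshoot = 100 * (got_um / target_um - 1)
    # A planar device could be drawn at exactly target_um; the FinFET cannot.
    print(f"want {target_um:.3f} um -> {fins} fin(s), {got_um:.3f} um ({overshoot:+.0f}%)")
```

With these placeholder numbers, every target lands on one of only three available widths, which is the "quantized size choices" constraint in miniature.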

It would be a terrible design decision to have an entire design made of 3-2 gates, or an entire design made of 2-1 gates.

At least for high end CPUs.

Don’t know what low-end customers who take TSMC’s cell libraries as they are and don’t try very hard to optimize might do.
 

Yoused

Cmaier said:
It would be a terrible design decision to have an entire design made of 3-2 gates, or an entire design made of 2-1 gates.

The way I was reading it, on previous nodes, you could use one type of gate across a given region but that N3 gives you the flexibility to select the type gate-by-gate everywhere. I would imagine that some of the gates that need to be most responsive might be laid out 2-2 in a mobile type device with a bit of extra room to build them 3-2 for a non-mobile processor that can gain from more juice. The tapeouts would be nearly identical but the mobile processor would have gaps in areas where the desktop one would lay in the extra wires and fins for the faster gates.

Of course, you are the expert here, if you say no, then that is not going to be their plan.
 

Cmaier

Yoused said:
The way I was reading it, on previous nodes, you could use one type of gate across a given region but that N3 gives you the flexibility to select the type gate-by-gate everywhere. I would imagine that some of the gates that need to be most responsive might be laid out 2-2 in a mobile type device with a bit of extra room to build them 3-2 for a non-mobile processor that can gain from more juice. The tapeouts would be nearly identical but the mobile processor would have gaps in areas where the desktop one would lay in the extra wires and fins for the faster gates.

Of course, you are the expert here, if you say no, then that is not going to be their plan.

You can’t just swap to lower power gates and expect anything to work. Consider a typical situation:

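[Attached image 1673831471743.png: two flip-flops with a bunch of logic paths between them, including gates [A] and (B)]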


We have two flip-flops with a bunch of logic in between them. A bunch of different paths are shown. If you look at gate [A], it has to drive the second flip flop, but also a bunch of other gates. Each of those gates looks to [A] as if it is a capacitor.

(B) only drives a single gate, but there are a bunch more gates in sequence after that and you need to propagate the logic all the way through them to the input of the flip flop by the time the clock toggles again.

So [A] and (B) each are constrained in different ways. If you decide to make this a laptop part by reducing fin count, you reduce the ability of [A] to create a ramp fast enough to get the job done:

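[Attached image 1673831771420.png: the last-arriving input to A (red) and the output of A with adequate (black) and inadequate (blue) drive strength]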


The red line is the last-arriving input to A. Some time after it begins to switch, the output of A may switch. The black line indicates how it might switch if the transistors in A have the appropriate number of fins for the load capacitance it needs to drive. The blue line is what happens if it does NOT have enough drive strength (because fins are too small or there aren’t enough of them).
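
The black-versus-blue difference can be sketched with a first-order RC model. This is a rough illustration, not how a real timing tool works: the driver is treated as a resistor whose value drops as fins are added, the load is a lumped capacitor, and every number below is an assumed placeholder.

```python
import math

def t50_ps(r_per_fin_kohm: float, n_fins: int, load_ff: float) -> float:
    """Time for an RC-modelled output to cross 50% of the supply, in ps.

    Fins are treated as resistors in parallel, so drive resistance is
    r_per_fin / n_fins. kOhm times fF conveniently comes out in ps.
    """
    r_drive_kohm = r_per_fin_kohm / n_fins
    return math.log(2) * r_drive_kohm * load_ff

# ASSUMED values for a gate like [A]: it drives several gate inputs plus wire.
R_PER_FIN_KOHM = 12.0   # on-resistance of a single fin (made up)
FANOUT = 6              # number of gate inputs hanging off the output
C_IN_FF = 0.4           # input capacitance of each driven gate (made up)
C_WIRE_FF = 2.0         # parasitic wire capacitance (made up)
load_ff = FANOUT * C_IN_FF + C_WIRE_FF

for fins in (3, 2, 1):
    print(f"{fins} fin(s): 50% crossing after {t50_ps(R_PER_FIN_KOHM, fins, load_ff):.1f} ps")
```

In this model the load is held fixed, so going from 3 fins to 2 stretches the ramp by exactly 1.5x. Whether that still meets timing depends on how much slack the path had to begin with.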

But it gets worse. When the drive strength is too weak, your signal becomes susceptible to noise caused by coupling to neighboring wires. So what you REALLY get is something like this:

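[Attached image 1673832068333.png: the weakly driven output with coupling noise on it; red shows how downstream gates would interpret their inputs]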


The red shows how downstream gates would interpret their inputs - this would cause downstream gates to flip back and forth, burning lots of needless power, and likely cause them not to reach their final values in time to be captured by the flip flop when the clock toggles.
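
A simplified way to see why a weak driver is more exposed to coupling: model the victim wire as held by its driver's resistance while a neighbor ramps from 0 to Vdd. Solving that single-node RC circuit gives the peak bump computed below. This is a textbook-style approximation with assumed values, and it treats the victim as quiet rather than mid-transition, so read it as a trend rather than a prediction.

```python
import math

def peak_noise_v(vdd: float, r_victim_kohm: float, c_couple_ff: float,
                 c_ground_ff: float, t_rise_ps: float) -> float:
    """Peak voltage bump on a quiet victim wire while a neighbor ramps 0 -> vdd.

    Comes from solving (Cc + Cg) dV/dt + V/R = Cc * vdd / t_rise up to t = t_rise.
    """
    tau_ps = r_victim_kohm * (c_couple_ff + c_ground_ff)
    return (vdd * (r_victim_kohm * c_couple_ff / t_rise_ps)
            * (1.0 - math.exp(-t_rise_ps / tau_ps)))

# ASSUMED values, for illustration only.
VDD = 0.8
C_COUPLE_FF = 1.0   # coupling capacitance to the switching neighbor
C_GROUND_FF = 2.0   # everything else hanging on the victim wire
T_RISE_PS = 10.0    # neighbor's transition time

strong = peak_noise_v(VDD, 4.0, C_COUPLE_FF, C_GROUND_FF, T_RISE_PS)   # e.g. 3 fins
weak = peak_noise_v(VDD, 12.0, C_COUPLE_FF, C_GROUND_FF, T_RISE_PS)    # e.g. 1 fin
print(f"strong driver: {strong:.3f} V bump, weak driver: {weak:.3f} V bump")
```

As the driver gets weaker, the bump grows toward the pure capacitive-divider limit `VDD * Cc / (Cc + Cg)`, which is the point at which the driver is no longer doing anything to hold the wire.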

Well, you might say, if you use fewer fins, the load capacitance seen by the logic gates reduces. True. But the *wire* capacitances don’t. The wires don’t scale just because you changed to a transistor with fewer fins. You still have to drive the wire’s parasitic capacitance, and you still have to get the signal to the next gate in time even though the RC network formed by the wire connections hasn’t changed (it acts both as a transmission line so you need to worry about time-of-flight, and as a capacitive load).
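
The wire argument can be checked with the same kind of first-order model. In this sketch every gate on the net is shrunk together, so both the drive strength and the gate load scale with fin count, but the wire capacitance stays put. The component values are assumptions chosen only to show the trend.

```python
def stage_delay_ps(n_fins: int, fanout: int, c_wire_ff: float,
                   r_per_fin_kohm: float = 12.0,
                   c_in_per_fin_ff: float = 0.15) -> float:
    """First-order delay of one stage when driver AND loads use n_fins each."""
    r_drive_kohm = r_per_fin_kohm / n_fins            # weaker with fewer fins
    c_gates_ff = fanout * c_in_per_fin_ff * n_fins    # lighter with fewer fins
    return 0.69 * r_drive_kohm * (c_gates_ff + c_wire_ff)  # wire load unchanged

FANOUT = 4
for c_wire_ff in (0.0, 1.0, 3.0):   # ASSUMED wire capacitances
    d3 = stage_delay_ps(3, FANOUT, c_wire_ff)
    d2 = stage_delay_ps(2, FANOUT, c_wire_ff)
    print(f"wire {c_wire_ff:.1f} fF: 3-fin {d3:.2f} ps, 2-fin {d2:.2f} ps, ratio {d2 / d3:.2f}")
```

With zero wire capacitance the two effects cancel and the delay does not change. As soon as the wire contributes, the shrunk version is slower, and the gap widens on longer wires.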

In short, you have to do all the physical design all over again, you will likely move cells around, add repeaters, re-design logic, etc.

At Exponential we had a tool that could supposedly reduce gate sizes on non-critical paths to save power. The end result was a huge loss in performance, because they didn’t think the coupling between wires would be a big deal - they were wrong.
 

Cmaier

I have nothing insightful, thought-provoking, or in any way useful to add. I just wanted to say that I appreciate how @Cmaier illustrates, presents and explains highly complex subjects using what are essentially napkin doodles.
LOL. A surprising amount of my design work amounted to, essentially, napkin doodles.
 

dada_dave

There is a scurrilous leak-rumor of an A-series processor on N3 having been tested, clocking a SC score at around 130% of Raptor Lake, MC being in the 40% range – if it is typical 4E+2P, that would be pretty darn competitive.

Yeah I saw that … I suppose theoretically possible, but that’d be an … incredibly incredible increase in performance for a single generation. It’d be amazing.
 

theorist9

There is a scurrilous leak-rumor of an A-series processor on N3 having been tested, clocking a SC score at around 130% of Raptor Lake, MC being in the 40% range – if it is typical 4E+2P, that would be pretty darn competitive.
That comes from a MaxTech video. I watched it (on double speed, so I wouldn't have to waste too much time :D ). MaxTech's source claimed Apple got that score by experimenting with high clocks, resulting in too much heat to be usable in an iPhone (but which would perhaps be OK for a Mac). MaxTech claims the source is credible but, of course, who knows....

dada_dave said:
Yeah I saw that … I suppose theoretically possible, but that’d be an … incredibly incredible increase in performance for a single generation. It’d be amazing.

For SC, it's a 60% increase over the A16, which I guess you could get from a combination of process improvement, architecture (IPC) improvement, and higher clock speeds.
 

dada_dave

theorist9 said:
That comes from a MaxTech video. I watched it (on double speed, so I wouldn't have to waste too much time :D ). MaxTech's source claimed Apple got that score by experimenting with high clocks, resulting in too much heat to be usable in an iPhone (but which would perhaps be OK for a Mac). MaxTech claims the source is credible but, of course, who knows....

For SC, it's a 60% increase over the A16, which I guess you could get from a combination of process improvement, architecture (IPC) improvement, and higher clock speeds.

Sure you can get that … but that is also unlikely to happen (even for Mac level power/cooling). Who knows? Maybe Apple will pull a rabbit out of their hat, but that bunny would be one of these:


Okay maybe a Mac Pro but even then

N3 is a 15% performance increase at the same power. To make a 60% overall ST improvement reasonable with a reasonable “oh darn it’s too hot for an iPhone” power draw, the IPC gains would have to be huge. Possible? Yes. Likely? No.
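
The arithmetic behind "huge" can be sketched as follows, under the loud assumption that the gains simply multiply and that the quoted 15% process figure can be separated cleanly from extra clock speed bought with more power. The 60% and 15% figures are the ones quoted in this thread. The clock uplifts are hypothetical.

```python
def remaining_gain(total: float, process: float) -> float:
    """Factor that IPC and extra clock must supply once the process gain is removed."""
    return total / process

TOTAL_ST_GAIN = 1.60   # rumored single-core uplift over the A16
PROCESS_GAIN = 1.15    # N3 performance gain at the same power, as quoted

design_gain = remaining_gain(TOTAL_ST_GAIN, PROCESS_GAIN)
print(f"IPC x extra clock must supply about {100 * (design_gain - 1):.0f}%")

# HYPOTHETICAL splits: how much IPC is still needed for a given extra clock uplift.
for extra_clock in (1.10, 1.20, 1.30):
    ipc_needed = design_gain / extra_clock
    print(f"  extra clock +{100 * (extra_clock - 1):.0f}% -> IPC +{100 * (ipc_needed - 1):.0f}%")
```

Even after crediting the process, roughly 39% still has to come from architecture and higher clocks combined, which is why the rumor looks so far outside a normal generational step.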

I mean I am hoping for and expecting big gains - Apple is probably going to be cramming in quite a few ideas they’ve been working on while N3 was delayed, but that would still be amazing and way outside Apple’s per-generation performance cadence, which has been pretty steady. I’m expecting it to be on the high end given the circumstances, but that’s way beyond even my bullish expectations.
 

Cmaier

dada_dave said:
Sure you can get that … but that is also unlikely to happen (even for Mac level power/cooling). Who knows? Maybe Apple will pull a rabbit out of their hat, but that bunny would be one of these:


Okay maybe a Mac Pro but even then

N3 is a 15% performance increase at the same power. To make a 60% overall ST improvement reasonable with a reasonable “oh darn it’s too hot for an iPhone” power draw, the IPC gains would have to be huge. Possible? Yes. Likely? No.

I mean I am hoping for and expecting big gains - Apple is probably going to be cramming in quite a few ideas they’ve been working on while N3 was delayed, but that would still be amazing and way outside Apple’s per-generation performance cadence, which has been pretty steady.

When they say “N3 is 15% performance increase at the same power” what is that even supposed to mean, though? On N3 I redesign my circuit completely, to take advantage of the N3 design rules. Are they taking that into account? And, if so, what kind of circuit are they talking about?

Until I entered the outside world, I never heard of such things when comparing nodes. When I was designing CPUs, the only metrics we cared about were “x% reduction in minimum spacing on poly, y% on M1…, x% pitch reduction on layer __, wire heights decrease by z…”. We determined how much faster the next CPU would be, not the fab. You can’t predict these things from just one or two data points.
 