Intel Raptor Lake

Are there ways other than clock speed that the i9-13900KS could improve in ST performance over the i9-13900K? Based on the difference in clock speed alone (6 GHz vs. 5.8 GHz), the i9-13900KS should be only about 3% faster, with an increase in CB ST from 2.2k to 2.3k. But they're claiming 5% and 2.4k, respectively.
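For anyone who wants to check the arithmetic, here is a minimal sketch of the clock-only estimate. It assumes the score scales linearly with frequency, which is an idealisation, and the function name is just illustrative.

```python
def clock_scaled_score(base_score, base_clock_ghz, new_clock_ghz):
    """Estimate a score at a new clock, assuming perfectly linear scaling."""
    return base_score * new_clock_ghz / base_clock_ghz

# Figures from the post: 5.8 GHz vs. 6 GHz, CB ST of roughly 2.2k.
speedup = 6.0 / 5.8 - 1.0
expected = clock_scaled_score(2200, 5.8, 6.0)

print(f"clock-only speedup: {speedup:.1%}")   # 3.4%
print(f"clock-only score:   {expected:.0f}")  # 2276
```

Anything beyond that figure would have to come from something other than the peak clock.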

Bigger caches?
 
The i9-13900KS would need to have bigger L1 and/or L2 caches than the i9-13900K to have an effect on ST performance, right? [I assume the L3 caches, being shared, should be large enough not to create limitations even on the i9-13900K during ST tests.] [I'm further assuming it's still the case that L1 and L2 are per-core, while L3 is shared: https://cvw.cac.cornell.edu/codeopt/multicore ]
 

Yeah, bigger L1/L2 would improve ST performance (assuming the test is big enough to stress the cache).
 
Are there ways other than clock speed that the i9-13900KS could improve in ST performance over the i9-13900K? Based on the difference in clock speed alone (6 GHz vs. 5.8 GHz), the i9-13900KS should be only about 3% faster, with an increase in CB ST from 2.2k to 2.3k. But they're claiming 5% and 2.4k, respectively.

Could also be clock stability. For example, the i9-13900KS could maybe reach and actually maintain 6 GHz for a non-trivial amount of time, while the i9-13900K can maybe only reach 5.8 GHz for brief periods?
 
Interesting, but given how easy it is to check clock speed (with Intel's own Power Gadget), I'd be surprised if Intel marketed a processor as having a higher SC turbo speed than it can maintain (given adequate power and cooling). Seems like doing so would create a marketing nightmare for them. Has it ever been the case that an Intel processor couldn't maintain its rated SC turbo clock speed when only one core was being used (not including background system tasks)?
 
The way I understand it, the GB test is fairly short, so does the SC test run at base clock or "turbo"? The M2 shows a SC/GHz of around 540, while the i9-13900K shows around 740 if it is running at base clock (~380 if at 5.8 GHz).
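A quick sketch of the score-per-GHz comparison, for reference. The post does not give the raw score, so the SC score of about 2200 is an assumption back-derived from the ratios quoted above, and the 3.0 GHz base clock is assumed from Intel's listed P-core base frequency.

```python
def score_per_ghz(score, clock_ghz):
    """Normalise a benchmark score by the clock it is assumed to have run at."""
    return score / clock_ghz

ASSUMED_SC_SCORE = 2200   # assumption: implied by the ~740 and ~380 figures
BASE_CLOCK_GHZ = 3.0      # assumption: i9-13900K P-core base clock
TURBO_CLOCK_GHZ = 5.8     # from the thread

print(round(score_per_ghz(ASSUMED_SC_SCORE, BASE_CLOCK_GHZ)))   # 733
print(round(score_per_ghz(ASSUMED_SC_SCORE, TURBO_CLOCK_GHZ)))  # 379
```

Which of the two is the fair comparison depends entirely on what clock the core actually held during the run.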

Also, the MC score is about 46% of 24 x SC, which is a bit on the low side (35% if that is running 32 threads). M2 manages around 57%.
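The two percentages are the same measurement divided by a different count (24 cores vs. 32 threads). A small sketch of that conversion, with hypothetical helper names:

```python
def scaling_efficiency(mc_score, sc_score, units):
    """MC score as a fraction of `units` times the SC score."""
    return mc_score / (units * sc_score)

def rebase_efficiency(efficiency, old_units, new_units):
    """Re-express an efficiency computed against old_units against new_units."""
    return efficiency * old_units / new_units

# 46% of 24 x SC, re-expressed against 32 threads.
per_thread = rebase_efficiency(0.46, 24, 32)
print(f"{per_thread:.1%}")  # 34.5%
```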
 

Well of course, 16 out of those 24 cores are area-optimised cores which are considerably slower than the P-cores. Looking at SC/MC ratios of hybrid-core CPUs is probably less useful, especially in the case of Apple, where the E-core is maybe only 20% of the performance of a P-core.
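To put a rough number on that, here is a sketch of the best-case MC/(N x SC) ratio for a hybrid chip, counting each E-core as a fraction of a P-core. The 20% figure is the rough guess from this post, not a measurement, and the model ignores power and thermal limits.

```python
def hybrid_ceiling(p_cores, e_cores, e_relative_perf):
    """Upper bound on MC / (N x SC) when E-cores deliver a fraction of P-core speed."""
    p_equivalents = p_cores + e_cores * e_relative_perf
    return p_equivalents / (p_cores + e_cores)

# M2: 4 P-cores + 4 E-cores, with E-cores at roughly 20% of a P-core (a guess).
print(f"{hybrid_ceiling(4, 4, 0.2):.0%}")  # 60%
```

Under that guess the ceiling for the M2 is about 60%, so the roughly 57% quoted earlier in the thread would already be close to ideal scaling.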

As you mention, GB tests are fairly short-running, so the CPU is unlikely to have reached its sustained operation mode. Probably pulling 150 watts or more to produce that result.
 
Not sure what’s up with Intel reusing (?) Alder Lake cores for some of the 13th gen processors (Raptor Lake).


"We are leading with the Raptor Lake architecture on a refined Intel 7 process featuring the Raptor Cove core for higher clocks, more E-cores, larger caches and more, but we have leveraged Alder Lake die to provide an efficient and effective way to ramp the new product stack for a broad market while giving us more capacity for the leading-edge technology at the top of our stack. It is important to know that when we leverage the ADL die it has been qualified and validated so the end user is getting the exact same spec (i.e. frequency, cache) at the same processor level – regardless of which die has been used."
 
Not sure what’s up with Intel reusing (?) Alder Lake cores for some of the 13th gen processors (Raptor Lake).

It was an efficient and effective way to ramp the new product stack, is what’s up :-)
 
Do companies ever burn prototypes on the same wafer alongside production chips?

No. Production wafer space is too valuable. Each reticle is identical, so you can't just sneak one or two of some other thing on there. You do put various test structures on there (which also repeat in each reticle) that you can use for things like debug, characterizing the process, etc. But they are tiny. Ring oscillators, VCOs, that sort of thing.
 
Do they use one chip mask and iterate, or is it a single compound mask (set) with all the chips on it? Or do methods vary?
 

In the old days, you had a mask for the entire wafer and patterned the whole wafer at once. The term “reticle” was used as an alternative to “mask” - the idea is that reticle is a small rectangle that you then repeat, while mask is the whole wafer. The terms eventually got muddied up.

Anyway, in modern times, your reticle is much smaller than the wafer, so you pattern one part of the wafer at a time, and step along. The reticle can have multiple chips in it, of course. Depends on the size of the chips.

The last time I saw a wafer-size mask set was for 5” GaAs wafers, and I manually aligned the masks myself in my university clean room :)

One guy, Pete, was using a high speed turntable to spread spin-on photoresist across a wafer, but he didn’t center it properly, and it spun off and shattered. He got a shard embedded in his hand, right through the glove. Good times.
 
But given that most modern chips have multiple layers of metal wires, the reticle must be more than just one simple mask, right?
 

Right. The reticle is a set of masks, applied one after the other to the same spot on the wafer. (More accurately, you apply a reticle to one location, then move it to the next, until you cover the wafer. Then you start from the original spot with the next reticle layer.)

Some steps (e.g. lithography) happen on a reticle-by-reticle basis, while other steps (e.g. metal deposition, etching, photoresist deposition) happen on the wafer level.
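The step-and-repeat order described above can be sketched roughly as follows. This is only an illustration: the 300 mm wafer, the 26 mm x 33 mm exposure field, the centred grid layout, and the layer names are all assumptions, not details from the posts.

```python
import math

def field_sites(wafer_diameter_mm, field_w_mm, field_h_mm):
    """Lower-left corners of exposure fields that lie fully on the wafer.

    Assumes a plain grid with the wafer centre sitting on a grid corner.
    """
    r = wafer_diameter_mm / 2
    cols = int(r // field_w_mm)
    rows = int(r // field_h_mm)
    sites = []
    for j in range(-rows, rows):
        for i in range(-cols, cols):
            x0, y0 = i * field_w_mm, j * field_h_mm
            corners = [(x0, y0), (x0 + field_w_mm, y0),
                       (x0, y0 + field_h_mm),
                       (x0 + field_w_mm, y0 + field_h_mm)]
            # Keep the field only if all four corners are inside the wafer edge.
            if all(math.hypot(x, y) <= r for x, y in corners):
                sites.append((x0, y0))
    return sites

def exposure_schedule(sites, layers):
    """Yield (layer, site) pairs: one mask layer is exposed at every site
    before the next layer starts again at the first site."""
    for layer in layers:
        # Wafer-level steps (deposition, etch, resist coating) would happen
        # here, between layers, on the whole wafer at once.
        for site in sites:
            yield layer, site

sites = field_sites(300, 26, 33)
layers = ["gate", "contact", "metal1"]  # hypothetical layer names
schedule = list(exposure_schedule(sites, layers))
print(len(sites), "fields per layer,", len(schedule), "exposures in total")
```

A real scanner also exposes partial fields at the wafer edge and optimises the stepping path, which this sketch leaves out.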
 
But given that most modern chips have multiple layers of metal wires, the reticle must be more than just one simple mask, right?
By the way, you mention metal, but these fancy transistors with wrap-around gates require a bunch of mask steps as well. A ton of them, in fact. That, rather than small size per se, is what makes it so hard. Take some transparency film and draw stuff on a bunch of sheets, then try to line them all up perfectly. Now try it again, but do it on top of a frisbee instead of a flat desk. Now draw really really small stuff and try to line it up.

The “frisbee” problem may be less of a problem nowadays because, in theory, within a reticle area there may not be meaningful curvature, so all you need to do is refocus each reticle. Back when the reticle could cover a large portion of the wafer, we had a hard time making the chips at the edge of the wafer work well.
 
Do companies ever burn prototypes on the same wafer alongside production chips?
TSMC does something almost like this. They offer reduced cost prototyping with their "shuttle" service. A shuttle run groups several TSMC customers who all want to build a small run of engineering prototypes, and builds them all on one wafer. Sharing the fixed wafer start costs dramatically reduces cost per device for each customer, which makes prototyping a lot more practical. There are downsides of course. This isn't mass production, you won't be able to order a large number of devices even if you think you could use them for testing. I expect the contract terms are also much riskier (for customers, not TSMC) than production contracts.

Unlike your idea, though, TSMC's shuttle service never places prototype and production devices on the same wafer. In fact, TSMC has different QoS for shuttle and production wafers. A wafer's trip through a fab involves stops at potentially hundreds of different machines. Shuttle wafers get scheduling priority so they can move through the fab faster (reduces calendar latency from wafer start to packaged device, which is important for prototyping), while production wafer scheduling is optimized for throughput instead.
 

I’m curious…do you happen to know if TSMC shuttle service employs e-beam lithography, or, are individual prototype projects combined on a single mask set?
 
For these sorts of things the fab always assembles a reticle by combining sub-reticle masks. As a grad student I used to get discounted fab access by keeping my projects small enough to fit in odd spaces the fab told us were available. I assume it works the same way with TSMC (minus the discount).
 