M1 reveals ancient Linux bug

What exactly does it mean that the M1 is ridiculously out-of-order?
It’s capable of issuing many instructions at once, and of issuing them out of order so that the ALUs always have something to do. It can do this because it looks deep into the instruction stream, finds instructions that do not depend on the results of other instructions, and issues them to fill what would otherwise be bubbles in the pipeline (caused by things like branch mispredictions and cache misses).
 
What exactly does it mean that the M1 is ridiculously out-of-order?
A program is written as
Do this
Do that
Do the other thing
Do more stuff
Some of those things can be done simultaneously. If "this", for instance, takes longer to do, and the core can finish "that" and "more stuff" while "this" is still being worked on, it will, in an effort to get as much done in as short a time as possible. All of this is carefully tagged and tracked so that results still flow to the instructions that need them.
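To make that concrete, here is a rough toy model in Python, not a description of how the M1 actually schedules anything. Everything in it is an assumption for illustration: the `Instr` type, the `run` function, the register names, the latencies, the one-instruction-per-cycle issue limit, and the rule that every instruction writes a distinct register (a real core uses register renaming to get a similar effect).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dst: str        # register written (hypothetical names; each written once)
    srcs: tuple     # registers read
    latency: int    # cycles until the result is available (made-up values)


def run(program, out_of_order, window=8):
    """Toy core that issues at most one instruction per cycle.

    Returns (cycle at which the last result is ready, issue order).
    """
    written = {ins.dst for ins in program}
    # Registers nobody in the program writes are treated as ready from the start.
    ready_at = {r: 0 for ins in program for r in ins.srcs if r not in written}
    pending = list(program)
    order = []
    cycle = 0
    finish = 0
    while pending:
        # In-order: only the oldest instruction may issue.
        # Out-of-order: the oldest *ready* instruction within the window may issue.
        candidates = pending[:window] if out_of_order else pending[:1]
        for ins in candidates:
            if all(r in ready_at and ready_at[r] <= cycle for r in ins.srcs):
                ready_at[ins.dst] = cycle + ins.latency
                finish = max(finish, cycle + ins.latency)
                order.append(ins.name)
                pending.remove(ins)
                break
        cycle += 1
    return finish, order


program = [
    Instr("this", "r1", ("r0",), 10),            # slow, e.g. a load that misses cache
    Instr("that", "r3", ("r2",), 1),             # independent of "this"
    Instr("the other thing", "r4", ("r1",), 1),  # needs the result of "this"
    Instr("more stuff", "r5", ("r3",), 1),       # needs the result of "that"
]

print(run(program, out_of_order=False))
print(run(program, out_of_order=True))
```

In this toy, the in-order run stalls "more stuff" behind "the other thing" and finishes at cycle 12, while the out-of-order run issues "more stuff" during the wait and finishes at cycle 11. The gap grows as more independent work sits behind a slow instruction.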

On an efficiency core, the reorder buffer can hold over 200 instructions; on a performance core, about three times that (and the buffers can retain looped code so that the core only has to fetch it once). There is a large variety of "barrier" instructions that let code selectively enforce instruction and memory-access ordering to make sure things make sense, but those instructions are used sparingly so that the core can work at its best.
 
What exactly does it mean that the M1 is ridiculously out-of-order?

It just means that, given two instructions {I0, I1}, if the core can detect at runtime that

- I0 does not depend on the result of I1
- I1 does not depend on the result of I0

then it doesn't matter which executes first, assuming that, as @Yoused mentioned, they aren't separated by a barrier.
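As a sketch of that check, here is a small Python version. The `Instr` type, the `independent` and `may_reorder` helpers, and the register names are all made up for illustration. It also simplifies heavily: a barrier here blocks every reordering across it, whereas real barrier instructions are more selective, and reused destination registers are ignored on the assumption that the hardware renames them away.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    name: str
    dst: Optional[str]         # register written (None for a barrier)
    srcs: Tuple[str, ...]      # registers read
    is_barrier: bool = False


def independent(i0, i1):
    """Neither instruction reads the register the other one writes."""
    return i0.dst not in i1.srcs and i1.dst not in i0.srcs


def may_reorder(stream, a, b):
    """True if stream[a] and stream[b] could execute in either order."""
    lo, hi = sorted((a, b))
    # Toy rule: any barrier at or between the two positions forbids reordering.
    if any(ins.is_barrier for ins in stream[lo:hi + 1]):
        return False
    return independent(stream[lo], stream[hi])


stream = [
    Instr("I0", "r1", ("r0",)),
    Instr("I1", "r3", ("r2",)),
    Instr("I2", "r4", ("r1",)),              # reads I0's result
    Instr("BARRIER", None, (), is_barrier=True),
    Instr("I3", "r6", ("r5",)),
]

print(may_reorder(stream, 0, 1))  # I0 and I1 share no registers
print(may_reorder(stream, 0, 2))  # I2 depends on I0
print(may_reorder(stream, 1, 4))  # independent, but a barrier sits between them
```

The first call returns True and the other two return False: one because of a real data dependency, the other only because of the barrier.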
 