The biggest difference is the underlying architecture. With a RISC design like ARMv8 or PPC, the compiler backend can be built to arrange the instruction stream to optimize dispatch, so that multiple instructions can execute at the same time because they have no interdependencies. A multicycle op like a multiply or a memory access can be hoisted earlier in the stream so that other ops flow around it, getting work done in parallel.
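The interdependency point can be illustrated even in plain C (my own sketch, not from the comment above): a reduction written as one dependency chain serializes on each add, while splitting it across several independent accumulators gives the compiler and the dispatcher adds that can overlap.

```c
#include <stdint.h>

/* One dependency chain: every add must wait for the previous add's
   result, so the loop runs at roughly one add-latency per element. */
static uint64_t sum_chained(const uint32_t *a, int n) {
    uint64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += a[i];
    return acc;
}

/* Four independent chains: adds into different accumulators have no
   interdependencies, so a superscalar core can issue them in parallel
   and multicycle loads can overlap with the arithmetic. */
static uint64_t sum_unrolled(const uint32_t *a, int n) {
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same sum; the second merely exposes the independence so the hardware has something to dispatch in parallel. An out-of-order core can discover some of this on its own, which is part of why the scheduling question matters to the architecture debate.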
With x86, the smaller register file, along with the layout of the ISA, is an impediment to doing this. It may well have been feasible for Intel to improve performance by reordering μops to extract more parallelism, but it appears to be genuinely hard to accomplish; otherwise I imagine they would have done it. Keeping the instruction flow in strict order is safer and easier for them, so they went with Hyper-Threading (HT) instead.
There is a functional similarity between RISC ops and x86 μops, but with one key difference: the compiler never sees μops, so it cannot schedule them the way a RISC compiler schedules its output. Being able to lay the pieces of the program out so the dispatcher can make the best use of them is an advantage you cannot recover by breaking down complex instructions after the fact. When an M1 does the same work as an i9 at a quarter of the max power draw, that is efficiency that is profoundly hard to argue with.
And, curiously, Apple is not using "turbo". I am not sure why, but they seem to prefer spreading heavy workloads across the SoC, so my guess is that they consider "turbo" to be little more than marketing hype. Maybe "turbo" only pays off for long, narrow pipelines, not so much for wide ones. Or maybe I am wrong and Apple will roll out a similar term in the near future (though I would be a bit disappointed if they did).