As I’ve said around here and at the other place, there’s no particular reason that x86 is more or less suited to hyperthreading than Arm. However, as I’ve also said, when I see hyperthreading in a design, that tells me that there is likely inefficiency in the microarchitecture, and that hyperthreading is a band-aid to get fuller utilization out of the functional units (logic and math units) in the design.
I’ve previously pointed out that Apple’s chips, at least, seem to have a very high utilization for their ALUs. This can be due, in part, to careful choices about the number of pipelines and what each pipeline is capable of doing, and, more importantly, due to very efficient instruction scheduling aided by a very wide instruction issue and a very deep look into the instruction stream to determine what instructions are coming next. Wide issue is easier to do on RISC chips because it is far easier to decode the incoming instruction stream (because all instructions are the same length [or at least integer multiples of that length]).
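To see why fixed-length encodings make wide decode easier, here’s a toy sketch. The encodings are entirely made up (not any real ISA): with a fixed length, every decoder slot can compute its instruction’s start offset up front and work in parallel; with variable lengths, each boundary depends on (at least partially) decoding the instruction before it.

```python
# Toy illustration of finding instruction boundaries in a fetch window.
# The byte encodings here are invented for illustration, not any real ISA.

def fixed_length_boundaries(window: bytes, size: int = 4) -> list[int]:
    # Fixed-length ISA: every start offset is known immediately, so
    # N decoders can attack N instructions in parallel.
    return list(range(0, len(window) - len(window) % size, size))

def variable_length_boundaries(window: bytes) -> list[int]:
    # Variable-length ISA (x86-like): an instruction's length is only
    # known after examining it, so boundaries are found sequentially --
    # or guessed speculatively at many byte positions at once.
    offsets, pos = [], 0
    while pos < len(window):
        offsets.append(pos)
        # Made-up rule standing in for real length decoding:
        # the low 2 bits of the first byte encode a length of 1..4.
        pos += (window[pos] & 0b11) + 1
    return offsets

window = bytes([0x03, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00])
print(fixed_length_boundaries(window))     # [0, 4]
print(variable_length_boundaries(window))  # [0, 4, 5, 7]
```

The sequential loop in the second function is the thing a wide x86 front end has to parallelize with extra hardware, which is where the fetch-stage cost comes from.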
Lately Intel has been going for wider issue, which is possible, but painful, in x86. It requires a lot of resources at the instruction fetch stage, and a lot of speculative buffering and stuff. If you make wrong guesses, you may have to rewind the pipeline. You can compensate for these things with yet more transistors. So the net effect is really that you end up burning more power than a RISC equivalent, but can still get wide issue in x86 (when I say x86, I am always including AMD64 or x86-64 or whatever you want to call it). Intel may have decided that the trade-off makes sense now, especially given that it’s easy to include lots of transistors now.
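A back-of-the-envelope model shows why those wrong guesses hurt a wide machine in particular. All the numbers below are illustrative assumptions, not measurements of any real chip: a flush penalty is paid in cycles, so the wider the machine, the more issue slots each rewind wastes.

```python
def effective_ipc(peak_ipc: float, insts_per_branch: float,
                  mispredict_rate: float, flush_penalty: float) -> float:
    """Effective instructions per cycle once pipeline rewinds are
    accounted for. Simplified model with assumed numbers, for
    illustration only."""
    # Cycles spent issuing one run of instructions between branches:
    issue_cycles = insts_per_branch / peak_ipc
    # Average extra cycles lost rewinding after a wrong guess:
    stall_cycles = mispredict_rate * flush_penalty
    return insts_per_branch / (issue_cycles + stall_cycles)

# Assume a branch every 6 instructions, 5% mispredicts, 15-cycle flush:
print(round(effective_ipc(8, 6, 0.05, 15), 2))  # 8-wide machine: 4.0
print(round(effective_ipc(4, 6, 0.05, 15), 2))  # 4-wide machine: 2.67
```

Under these made-up numbers, doubling the issue width from 4 to 8 buys only about a 1.5x gain, because the fixed rewind cost eats a larger share of a wide machine’s theoretical throughput. That’s the gap you spend “yet more transistors” (better predictors, bigger buffers) trying to close.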
A second factor is that some of what was being accomplished by multithreading is much more efficiently accomplished by just adding more real cores. In the 1990s, the cores took up 80 or 90% of the chip area that wasn’t L2 cache. Now the CPU cores may be a small fraction of the die area. Adding more is relatively cheap, and, all things considered, if you can choose to run your thread on a separate core or run it in a timeslice on a shared core, you would be better off on a separate core most of the time.
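A crude way to put numbers on that choice (the SMT gain below is an assumed ballpark figure, and real results vary a lot by workload): two threads sharing one hyperthreaded core typically get only a modest boost over one thread, while two threads on two real cores each get a whole core.

```python
def smt_pair_throughput(one_thread: float, smt_gain: float = 1.25) -> float:
    # Two threads timesharing one SMT core: assume ~25% more total
    # work than a single thread gets done (assumed figure).
    return one_thread * smt_gain

def two_core_throughput(one_thread: float) -> float:
    # Two threads on two real cores, ignoring shared-cache effects:
    return one_thread * 2.0

# Per-thread share of the machine, in "one full core" units:
print(smt_pair_throughput(1.0) / 2)  # 0.625 of a core each
print(two_core_throughput(1.0) / 2)  # 1.0 full core each
```

Once cores are a small fraction of the die, paying area for the second real core is usually the better deal than squeezing the second thread into the first core’s idle slots.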