> Having said that, the bigger issue is that you don't want to think of hyperthreaded threads that way to start with, because hyperthreading doesn't allow a single core to do more than one process at once. It simply queues up the threads within a core for faster thread switching, so there is less idle time. [I believe it does that by "exposing two logical execution contexts per core"*, where each context has its own thread. Thus when one context would be waiting for more input, it can immediately switch to the other context. But only one context can run at once.]
What you're describing is a type of hardware multithreading support sometimes called Switch-on-Event Multi-Threading, or SoEMT. Usually the event that causes a context switch in a SoEMT core is a memory stall - rather than waiting around for memory to come back with results, the core switches to another thread to stay busy.
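To make the distinction concrete, here's a toy model of SoEMT in Python. Everything in it is invented for illustration (real hardware does this with thread contexts in silicon, not a scheduler loop):

```python
import itertools

# Toy model of Switch-on-Event Multi-Threading (SoEMT). Exactly one thread
# occupies the core at a time; the core only switches when the running
# thread hits a memory stall (the "event"). Thread behavior is made up.

def soemt_run(threads, cycles):
    """threads: list of iterators yielding 'compute' or 'stall' each cycle."""
    active = 0                    # index of the thread currently on the core
    trace = []
    for _ in range(cycles):
        if next(threads[active]) == "stall":
            active = (active + 1) % len(threads)   # context switch on stall
            trace.append(f"switch -> T{active}")
        else:
            trace.append(f"T{active} computes")    # one thread per cycle
    return trace

# T0 stalls every third cycle, T1 every other cycle.
t0 = itertools.cycle(["compute", "compute", "stall"])
t1 = itertools.cycle(["compute", "stall"])
print(soemt_run([t0, t1], 8))
```

The key property: in any given cycle, exactly one thread's instructions are in flight. That is precisely what SMT does *not* do.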
However, Intel "hyperthreading" is true simultaneous multithreading (SMT) - instructions from both hardware threads coexist and make forward progress in the core's execution units at the same time. There is no context switch.
The purpose of SMT is not fast thread switching. It's basically a trick to extract more throughput from an out-of-order superscalar CPU core.
To understand how this works, consider a hypothetical OoO CPU. It has U execution units, each with P pipeline stages, so the core can have N = U * P instructions in progress at the same time. To maximize the number of instructions completed per cycle, and hence the total performance of the core, you ideally want all N of these execution slots occupied by an instruction in every cycle.
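As a quick numeric instance (the values are arbitrary, just to make N concrete):

```python
U = 4          # execution units (hypothetical)
P = 5          # pipeline stages per unit (hypothetical)
N = U * P      # in-flight instruction slots
print(N)       # 20 slots the core would ideally keep full every cycle
```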
That turns out to be hard to accomplish. Say the execution units consist of two integer units, one load/store unit, and one FP unit, fed by a front end capable of decoding and dispatching four instructions per cycle. Unless the running program sticks to a rigid pattern of exactly 2 int, 1 L/S, and 1 FP instruction in each group of 4, there's simply no way to keep all the execution units busy, and the core will issue (and therefore retire) fewer than 4 IPC.
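Here's a rough sketch of that structural limit, using the unit mix from the paragraph above (2 int, 1 load/store, 1 FP, 4-wide front end). It issues in order within each 4-wide group and ignores dependencies, caches, and latencies, so treat it as an illustration rather than a simulator:

```python
from collections import Counter

# Toy issue model for the core described above: 2 integer units, 1 load/store,
# 1 FP, and a 4-wide front end. Issue is in order within each 4-wide group,
# so a unit mismatch blocks the rest of the group for that cycle.

CAPACITY = {"int": 2, "ls": 1, "fp": 1}

def ipc(program):
    """program: list of 'int' / 'ls' / 'fp' opcodes in program order."""
    cycles = issued = i = 0
    while i < len(program):
        free = Counter(CAPACITY)   # per-cycle unit availability
        width = 4                  # front-end dispatch width
        while width and i < len(program) and free[program[i]] > 0:
            free[program[i]] -= 1
            width -= 1
            issued += 1
            i += 1
        cycles += 1
    return issued / cycles

perfect = ["int", "int", "ls", "fp"] * 100   # exactly matches the unit mix
skewed  = ["int"] * 400                      # integer-heavy program
print(ipc(perfect))   # 4.0 - every execution slot filled
print(ipc(skewed))    # 2.0 - capped by the two integer units
```

The integer-heavy program caps at 2 IPC no matter how wide the front end is, because only two integer units exist.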
When you measure this in the real world, it's rare for CPUs to run anywhere close to their theoretical maximum IPC. The usual reason is that different programs lean heavily on different kinds of execution units while barely touching the others, so you end up sizing the core for the peak requirements of each type of program you care about - and that inevitably leaves a lot of resources idle when running anything else.
That's how SMT was born. You just try to fill the empty slots with instructions from one or more additional threads. In very rare cases you might see as much as a doubling of throughput, but the average won't be nearly that good - the threads are competing with each other for all the core's resources, including cache and physical registers. However, you typically do see more throughput than a single thread running in the same set of execution units.
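Continuing the toy model from above (this reuses CAPACITY, ipc, and the Counter import), a second thread can fill the slots the first one leaves empty - but only when the two threads want different units:

```python
def smt_ipc(prog_a, prog_b):
    """Same toy core, but slots left over each cycle are offered to a
    second thread - a crude stand-in for SMT. (Real SMT shares far more:
    caches, physical registers, branch predictors...)"""
    cycles = issued = ia = ib = 0
    while ia < len(prog_a) or ib < len(prog_b):
        free = Counter(CAPACITY)
        width = 4
        # Thread A issues first, in order, as before...
        while width and ia < len(prog_a) and free[prog_a[ia]] > 0:
            free[prog_a[ia]] -= 1
            width -= 1
            issued += 1
            ia += 1
        # ...then thread B fills whatever units and width remain.
        while width and ib < len(prog_b) and free[prog_b[ib]] > 0:
            free[prog_b[ib]] -= 1
            width -= 1
            issued += 1
            ib += 1
        cycles += 1
    return issued / cycles

int_heavy = ["int"] * 400
fp_heavy  = ["fp", "ls"] * 200
print(ipc(int_heavy))                   # 2.0 running alone
print(ipc(fp_heavy))                    # 2.0 running alone
print(smt_ipc(int_heavy, fp_heavy))     # 4.0 - complementary mixes fill the core
print(smt_ipc(int_heavy, ["int"] * 400))# 2.0 - identical mixes fight over units
```

The last line is the resource-contention point: two threads with identical demands gain nothing from sharing a core, which is why real SMT speedups are workload-dependent and usually fall well short of 2x.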