> On Thu, 16 Dec 1999, Jamie Lokier wrote:
>
> > asm volatile("sti ; hlt" : : : "memory");
>                 ^^^^^^^^^
>
> The wakeup can still happen between sti and hlt so it doesn't fix the
> irq race condition.

no. Intel guarantees that the _next_ instruction after a 'sti' is still
executed with interrupts turned off, so no interrupt can be taken between
the 'sti' and the 'hlt'. And now we know why ...

so the correct fix is to check need_resched with IRQs turned off and, if
it is not set, do the 'sti ; hlt' sequence.
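
to make the argument concrete, here is a rough user-space model of it. this
is only a sketch, not kernel code: the struct, the op names and the
"lost wakeup" test are all made up for illustration, and the only hardware
behaviour it assumes is the one-instruction window after 'sti' described
above. it raises the wakeup interrupt at every instruction boundary of one
idle pass and reports whether the CPU ends up halted with need_resched set.

```c
#include <stdbool.h>

/* Toy model of one CPU; all names here are illustrative, not kernel code. */
struct cpu {
    bool irqs_enabled;  /* models the IF flag */
    bool sti_shadow;    /* set by sti, blocks delivery before the next instruction */
    bool irq_pending;   /* wakeup interrupt raised but not yet delivered */
    bool need_resched;  /* set by the (modelled) interrupt handler */
    bool halted;        /* CPU is sitting in hlt */
};

enum op { OP_CLI, OP_CHECK, OP_STI, OP_HLT };

/* Check need_resched with IRQs on, then hlt: the racy idle pass. */
static const enum op racy_idle[]  = { OP_CHECK, OP_HLT };
/* Check need_resched with IRQs off, then 'sti ; hlt'. */
static const enum op fixed_idle[] = { OP_CLI, OP_CHECK, OP_STI, OP_HLT };

/* Deliver the pending interrupt if IRQs are on and we are not in the sti shadow. */
static void maybe_deliver(struct cpu *c)
{
    if (c->irq_pending && c->irqs_enabled && !c->sti_shadow) {
        c->irq_pending = false;
        c->need_resched = true;  /* handler asks for a reschedule */
        c->halted = false;       /* an interrupt ends hlt */
    }
}

/*
 * Run one idle pass with the wakeup interrupt raised just before
 * instruction irq_at (irq_at == n means "after the last instruction").
 * Returns true if the CPU ends up halted with need_resched set.
 */
static bool lost_wakeup(const enum op *prog, int n, int irq_at)
{
    struct cpu c = { .irqs_enabled = true };

    for (int i = 0; i < n; i++) {
        if (i == irq_at)
            c.irq_pending = true;
        maybe_deliver(&c);      /* instruction boundary */
        c.sti_shadow = false;   /* the shadow covers one boundary only */
        switch (prog[i]) {
        case OP_CLI: c.irqs_enabled = false; break;
        case OP_STI: c.irqs_enabled = true; c.sti_shadow = true; break;
        case OP_HLT: c.halted = true; break;
        case OP_CHECK:
            if (c.need_resched)
                return false;   /* caller would reschedule instead of halting */
            break;
        }
    }
    if (irq_at == n)
        c.irq_pending = true;
    maybe_deliver(&c);          /* a pending interrupt wakes the halted CPU */
    return c.halted && c.need_resched;
}

/* Count the arrival points at which the wakeup is lost. */
static int count_lost(const enum op *prog, int n)
{
    int lost = 0;

    for (int irq_at = 0; irq_at <= n; irq_at++)
        lost += lost_wakeup(prog, n, irq_at);
    return lost;
}
```

in this model count_lost(racy_idle, 2) finds exactly one bad arrival point
(between the check and the hlt), and count_lost(fixed_idle, 4) finds none.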

William, does this fix your problems?

> The trivial way to fix that is to loop on the need_resched in the idle
> task (ugh, not nice :( ).

extremely nasty, and definitely the oldest known Linux scheduler bug. i
noticed more than a year ago that doing need_resched polling in the idle
loop makes some of the scheduling irregularities go away, but somehow never
got around to noticing the real bug ...

polling on need_resched will also increase performance on SMP systems
btw., because the cacheline snoop event reaches the other CPU much faster
than the APIC message - and we can skip sending the APIC message altogether
if the other CPU is running an idle thread. OTOH CPUs will run much hotter ...
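
a minimal sketch of the polling idea, using C11 atomics in place of real
per-CPU kernel state. the struct and both function names are hypothetical,
and the "IPI" is only represented by a return value telling the caller
whether it would still have to send one:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for illustration; not the kernel's data structures. */
struct cpu_state {
    atomic_bool need_resched;  /* polled by the idle loop */
    atomic_bool idle_polling;  /* true while the CPU spins in its idle loop */
};

/*
 * Ask the target CPU to reschedule.
 * Returns true if the caller still has to send an APIC message (IPI).
 */
static bool request_resched(struct cpu_state *target)
{
    atomic_store(&target->need_resched, true);
    /* A polling idle CPU sees the store through cache coherency alone. */
    return !atomic_load(&target->idle_polling);
}

/* Idle loop body: spin until some other CPU sets need_resched. */
static void poll_idle(struct cpu_state *self)
{
    atomic_store(&self->idle_polling, true);
    while (!atomic_load(&self->need_resched))
        ;  /* busy-wait: low wakeup latency, but the CPU never halts */
    atomic_store(&self->idle_polling, false);
}
```

the busy-wait in poll_idle() is exactly the "CPUs will run much hotter"
cost: the idle CPU never executes hlt.
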
-- mingo