William Lee Irwin III wrote:
touch_nmi_watchdog() is only "protection" against local interrupt
disablement triggering the NMI oopser because alert_counter[]
increments are not atomic. Yet even supposing they were made so, the
On Sat, Nov 20, 2004 at 05:49:53PM +1100, Nick Piggin wrote:
That would be a bug in touch_nmi_watchdog then, because you're
racy against your own NMI too.
So I'm actually not very wrong at all. I'm technically wrong
because touch_nmi_watchdog has a theoretical 'bug'. In practice,
multiple races with the non-atomic increments to the same counter,
in an unbroken sequence, would be about as likely as hardware
failure.
Anyway, this touch nmi thing is going off topic, sorry list.
No, it's on-topic.
(1) The issue is not theoretical. For example, sysrq-t does trigger
NMI oopses, merely not every time, and not on every system. It is
not associated with hardware failure. It is, however, tolerable
because sysrqs require privilege to trigger and are primarily
used when the box is dying anyway.
(2) NMIs don't nest. There is no possibility of NMIs racing against
themselves while the data is per-cpu.