Re: page fault scalability patch V11 [0/7]: overview

From: Nick Piggin
Date: Sat Nov 20 2004 - 02:31:24 EST


William Lee Irwin III wrote:
William Lee Irwin III wrote:

touch_nmi_watchdog() is only "protection" against local interrupt
disablement triggering the NMI oopser because alert_counter[]
increments are not atomic. Yet even supposing they were made so, the


On Sat, Nov 20, 2004 at 05:49:53PM +1100, Nick Piggin wrote:

That would be a bug in touch_nmi_watchdog then, because you're
racy against your own NMI too.
So I'm actually not very very wrong at all. I'm technically wrong
because touch_nmi_watchdog has a theoretical 'bug'. In practice,
multiple races with the non atomic increments to the same counter,
and in an unbroken sequence would be about as likely as hardware
failure.
Anyway, this touch nmi thing is going off topic, sorry list.


No, it's on-topic.
(1) The issue is not theoretical. e.g. sysrq t does trigger NMI oopses,
merely not every time, and not on every system. It is not
associated with hardware failure. It is, however, tolerable
because sysrq's require privilege to trigger and are primarly
used when the box is dying anyway.

OK then put a touch_nmi_watchdog in there if you must.

(2) NMI's don't nest. There is no possibility of NMI's racing against
themselves while the data is per-cpu.


Your point was that touch_nmi_watchdog() which resets alert_counter,
is racy when resetting the counter of other CPUs. Yes it is racy.
It is also racy against the NMI on the _current_ CPU.

This has nothing whatsoever to do with NMIs racing against themselves,
I don't know how you got that idea when you were the one to bring up
this race anyway.

[ snip back-and-forth that is going nowhere ]

I'll bow out of the argument here. I grant you raise valid concens
WRT the /proc issues, of course.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/