Re: page fault scalability patch V11 [0/7]: overview

From: William Lee Irwin III
Date: Fri Nov 19 2004 - 23:34:30 EST


William Lee Irwin III wrote:
>> /proc/ triggering NMI oopses was a persistent problem even before that
>> code was merged. I've not bothered testing it as it at best aggravates it.

On Sat, Nov 20, 2004 at 03:03:17PM +1100, Nick Piggin wrote:
> It isn't a problem. If it ever became a problem then we can just
> touch the nmi oopser in the loop.

Very, very wrong. The tasklist scans hold the read side of the lock
and aren't even what's running with interrupts off. The contenders
on the write side are what the NMI oopser oopses.

And supposing the arch reenables interrupts in the write side's
spinloop, you just get a box that silently goes out of service for
extended periods of time, breaking cluster membership and more. The
NMI oopser is just the report of the problem, not the problem itself.
It's not a false report. The box is dead for > 5s at a time.


William Lee Irwin III wrote:
>> And thread groups can share mm's. do_for_each_thread() won't suffice.

On Sat, Nov 20, 2004 at 03:03:17PM +1100, Nick Piggin wrote:
> I think it will be just fine.

And that makes it wrong on both counts. The above fails any time
LD_ASSUME_KERNEL=2.4 is used, we well as when actual Linux features
are used directly.


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/