Re: page fault scalability patch V11 [0/7]: overview

From: Nick Piggin
Date: Sat Nov 20 2004 - 01:52:10 EST


William Lee Irwin III wrote:
William Lee Irwin III wrote:

There isn't anything left to explain. So if there's a question, be
specific about it.


On Sat, Nov 20, 2004 at 04:50:25PM +1100, Nick Piggin wrote:

Why am I very very wrong? Why won't touch_nmi_watchdog work from
the read loop?
And let's just be nice and try not to jump at the chance to point
out when people are very very wrong, and keep count of the times
they have been very very wrong. I'm trying to be constructive.


touch_nmi_watchdog() is only "protection" against local interrupt
disablement triggering the NMI oopser because alert_counter[]
increments are not atomic. Yet even supposing they were made so, the

That would be a bug in touch_nmi_watchdog then, because you're
racy against your own NMI too.

So I'm actually not very very wrong at all. I'm technically wrong
because touch_nmi_watchdog has a theoretical 'bug'. In practice,
multiple races with the non atomic increments to the same counter,
and in an unbroken sequence would be about as likely as hardware
failure.

Anyway, this touch nmi thing is going off topic, sorry list.

net effect of "covering up" this gross deficiency is making the
user-observable problems it causes undiagnosable, as noted before.


Well the loops that are in there now aren't covered up, and they
don't seem to be causing problems. Ergo there is no problem (we're
being _practical_ here, right?)


William Lee Irwin III wrote:

This entire line of argument is bogus. A preexisting bug of a similar
nature is not grounds for deliberately introducing any bug.


On Sat, Nov 20, 2004 at 04:50:25PM +1100, Nick Piggin wrote:

Sure, if that is a bug and someone is just about to fix it then
yes you're right, we shouldn't introduce this. I didn't realise
it was a bug. Sounds like it would be causing you lots of problems
though - have you looked at how to fix it?


Kevin Marin was the first to report this issue to lkml. I had seen
instances of it in internal corporate bugreports and it was one of
the motivators for the work I did on pidhashing (one of the causes
of the timeouts was worst cases in pid allocation). Manfred Spraul
and myself wrote patches attempting to reduce read-side hold time
in /proc/ algorithms, Ingo Molnar wrote patches to hierarchically
subdivide the /proc/ iterations, and Dipankar Sarma and Maneesh
Soni wrote patches to carry out the long iterations in /proc/ locklessly.

The last several of these affecting /proc/ have not gained acceptance,
though the work has not been halted in any sense, as this problem
recurs quite regularly. A considerable amount of sustained effort has
gone toward mitigating and resolving rwlock starvation.


That's very nice. But there is no problem _now_, is there?

Aggravating the rwlock starvation destabilizes, not pessimizes,
and performance is secondary to stability.


Well luckily we're not going to be aggravating the rwlock stavation.

If you found a problem with, and fixed do_task_stat: ?time, ???_flt,
et al, then you would apply the same solution to per thread rss to
fix it in the same way.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/