Re: 2.4.20: Proccess stuck in __lock_page ...

From: Con Kolivas (kernel@kolivas.org)
Date: Wed May 28 2003 - 00:36:49 EST


On Wed, 28 May 2003 04:04, Marc-Christian Petersen wrote:
> On Tuesday 27 May 2003 19:50, manish wrote:
>
> Hi Manish,
>
> > It is not a system hang but the processes hang showing the same stack
> > trace. This is certainly not a pause since the bonnie processes that
> > were hung (or deadlocked) never completed after several hrs. The stack
> > trace was the same.
>
> then you are hitting a different bug or a bug related to the issues
> Christian Klose and me and $tons of others were complaining.
>
> The bug you are hitting might be the problem with "process stuck in D
> state" Andrea Arcangeli fixed, let me guess, over half a year ago or so.
>
> In case you have a good mind to try to address your issue, you might want
> to try out the patch you can find here:
>
> http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.21rc2
>aa1/9980_fix-pausing-2
>
> ALL: Anyone who has this kind of pauses/stops/mouse is dead/keyboard is
> dead/: speak _NOW_ please, doesn't matter who you are!

Yo!

I'll throw my babushka into the ring too. I think it's obvious from MCP's
comments that I've been involved in testing this problem. I've spent hours,
possibly days trying to find a way to fix the pauses introduced since
2.4.19pre1. I agree with what MCP describes that the machine can come to a
standstill under any sort of disk i/o and is unusable for a variable length
of time. I've been playing with all sorts of numbers in my patchset to try
and limit it with only mild success. The best results I've had without a
major decrease in throughput was using akpm's read latency 2 patch but by
significantly reducing the nr_requests. It was changing the number of
requests that I discovered dropping them to 4 fixed the problem but destroyed
write throughput. I was pleased to see AA give the problem recognition after
my contest results on his kernel but disappointed that the problem only was
reduced, not fixed.

I have seen it on every piece of hardware I have used a 2.4.19+ kernel on
using the desktop. I have no idea what the real problem is, but I firmly
believe with MCP that it is the biggest flaw in 2.4 on the desktop (no idea
what it does to servers). We've tried over and over again fiddling with the
numbers and patches and only going to less than 2.4.19 fixes it completely.

Con
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/