Re: 2.4.20: Proccess stuck in __lock_page ...

From: Con Kolivas (kernel@kolivas.org)
Date: Wed May 28 2003 - 02:18:47 EST


On Wed, 28 May 2003 16:04, Jens Axboe wrote:
> On Wed, May 28 2003, Con Kolivas wrote:
> > On Wed, 28 May 2003 04:04, Marc-Christian Petersen wrote:
> > > On Tuesday 27 May 2003 19:50, manish wrote:
> > >
> > > Hi Manish,
> > >
> > > > It is not a system hang but the processes hang showing the same stack
> > > > trace. This is certainly not a pause since the bonnie processes that
> > > > were hung (or deadlocked) never completed after several hrs. The
> > > > stack trace was the same.
> > >
> > > then you are hitting a different bug or a bug related to the issues
> > > Christian Klose and me and $tons of others were complaining.
> > >
> > > The bug you are hitting might be the problem with "process stuck in D
> > > state" Andrea Arcangeli fixed, let me guess, over half a year ago or
> > > so.
> > >
> > > In case you have a good mind to try to address your issue, you might
> > > want to try out the patch you can find here:
> > >
> > > http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.2
> > >1rc2 aa1/9980_fix-pausing-2
> > >
> > > ALL: Anyone who has this kind of pauses/stops/mouse is dead/keyboard is
> > > dead/: speak _NOW_ please, doesn't matter who you are!
> >
> > Yo!
> >
> > I'll throw my babushka into the ring too. I think it's obvious from MCP's
> > comments that I've been involved in testing this problem. I've spent
> > hours, possibly days trying to find a way to fix the pauses introduced
> > since 2.4.19pre1. I agree with what MCP describes that the machine can
> > come to a standstill under any sort of disk i/o and is unusable for a
> > variable length of time. I've been playing with all sorts of numbers in
> > my patchset to try and limit it with only mild success. The best results
> > I've had without a major decrease in throughput was using akpm's read
> > latency 2 patch but by significantly reducing the nr_requests. It was
> > changing the number of requests that I discovered dropping them to 4
> > fixed the problem but destroyed write throughput. I was pleased to see AA
> > give the problem recognition after my contest results on his kernel but
> > disappointed that the problem only was reduced, not fixed.
>
> Does the problem change at all if you force batch_requests to 0?

I've tried batch_requests to 1 by itself (without changing the nr_request) and
that didn't fix it, but recall dropping nr_requests to 2 (which would make
batch requests==0) made the machine fail to boot so I haven't tried batch
requests 0 by itself. Should it boot with it == 0?

Con
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/