Re: 2.4 mm trouble [possible lru race]

From: Daniel Phillips (phillips@arcor.de)
Date: Tue Oct 01 2002 - 05:26:42 EST


The theoretical lru race possibly spotted in the wild...

On Tuesday 01 October 2002 11:22, Richard Zidlicky wrote:
> On Sat, Sep 28, 2002 at 04:38:50PM +0200, Roman Zippel wrote:
> > Hi,
> >
> > On Wed, 25 Sep 2002, Richard Zidlicky wrote:
> >
> > > First I suspected a stale TLB entry so I've added quite a few
> > > extra flushes throughout the code. This does very much reduce the
> > > risk of the problem, but the problem still happens if swapping is
> > > increased so it might very well be something else.
> > >
> > > Any ideas?
> >
> > It sounds like a cache problem. I had to fix one in early 2.4, so maybe
> > there is another one.
> > It would help a lot if you could reproduce it within gdb to get some more
> > information about the context of the crash (e.g. invalid data or code, the
> > type of mapping (from /proc/<pid>/maps)).
>
> hm, I am now testing the appended patch, which is a backport to 2.4.19
> of this:
>
<<<<<<<<<<<<
>
> From: Daniel Phillips <phillips@arcor.de>
> To: Marcelo Tosatti <marcelo@conectiva.com.br>,
> "Christian Ehrhardt" <ehrhardt@mathematik.uni-ulm.de>
> Subject: [CFT] [PATCH] LRU race fix
> Date: Tue, 17 Sep 2002 19:03:19 +0200
> X-Mailer: KMail [version 1.3.2]
> Cc: <linux-kernel@vger.kernel.org>
>
> This patch against 2.4.20-pre7 fixes a theoretical race where a page could
> possibly be freed while still on the lru list. The details have been
> discussed at length earlier, see "[RFC] [PATCH] Include LRU in page count".
>
> The race may not even be that theoretical, it's just so rare that when it
> does happen, it might be dismissed as a driver problem or similar...
>
> [...]
>
>>>>>>>>>>>>

> Somehow this does completely fix my problem, I have taken out all
> the tlb related hacks and the testcase that caused the problem 100%
> now runs without any sign of problems :))
>
> Now I am wondering if that is just coincidence or why m68k hit that
> error so reliably.. is it supposed to have any effect at all on
> UP?

Are you running UP+preempt?
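
For anyone following along, the window described in the quoted patch
announcement (a page whose count has dropped to zero while it is still on
the lru list) can be illustrated with a toy user-space model. This is only
a sketch under assumptions: it is single-threaded, it does not reproduce
the race itself, and toy_page, toy_lru_add, toy_drop_ref, toy_lru_del,
toy_exposed and toy_window_exists are hypothetical names that do not
correspond to kernel functions or to the actual patch. It only contrasts
"lru membership not counted" with "lru membership holds a reference", the
idea named in the "[RFC] [PATCH] Include LRU in page count" subject line.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's struct page. */
struct toy_page {
    int  count;   /* reference count */
    bool on_lru;  /* is the page linked on the lru list? */
};

/* Put the page on the lru; optionally let the lru hold a reference. */
static void toy_lru_add(struct toy_page *p, bool counted)
{
    p->on_lru = true;
    if (counted)
        p->count++;
}

/* Drop one reference; returns true if the count reached zero. */
static bool toy_drop_ref(struct toy_page *p)
{
    return --p->count == 0;
}

/* Take the page off the lru; drop the lru's reference if it held one. */
static bool toy_lru_del(struct toy_page *p, bool counted)
{
    p->on_lru = false;
    return counted ? toy_drop_ref(p) : false;
}

/* The state a concurrent (or preempting) lru scanner must never see:
 * the page is still reachable through the lru, yet its count is zero. */
static bool toy_exposed(const struct toy_page *p)
{
    return p->on_lru && p->count == 0;
}

/* Walk through add -> last user put -> del and report whether the
 * exposed state was observable between the put and the del. */
static bool toy_window_exists(bool counted)
{
    struct toy_page p = { .count = 1, .on_lru = false }; /* one user ref */
    bool exposed;

    toy_lru_add(&p, counted);
    toy_drop_ref(&p);            /* the user drops its reference */
    exposed = toy_exposed(&p);   /* what a scanner could observe here */
    toy_lru_del(&p, counted);
    return exposed;
}
```

In this model toy_window_exists(false) reports that the exposed state
occurs, while toy_window_exists(true) reports that it does not, because
the count can only reach zero once the page is already off the lru.
Whether anything can actually run inside that window on a given machine
depends on what may interleave with the freeing path, which is why the
preempt question matters on UP.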

-- 
Daniel