Re: [PATCH] make idr_remove_all() do removal -before- free_layer()

From: Paul E. McKenney
Date: Sun Mar 08 2009 - 15:20:38 EST


On Sun, Mar 08, 2009 at 04:33:36PM +0100, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > The following patch fixes a problem in the IDR system, where
> > an idr_remove_all() hands a data element to call_rcu() (via
> > free_layer()) before making that data element inaccessible to
> > new readers. This is very bad, and results in readers still
> > having a reference to this data element at the end of the
> > grace period. Tests on large machines that concurrently map
> > and unmap user-space memory within the same multithreaded
> > process result in crashes within about five minutes. Applying
> > this patch increases the kernel's longevity to the
> > three-to-eight-hour range.
> >
> > There appear to be other similar problems in
> > idr_get_empty_slot() and sub_remove(), but I fixed the easy
> > one in idr_remove_all() first. It is therefore no surprise
> > that failures still occur.
> >
> > (Yes, and I did look at the relevant patch last year without
> > spotting this one. Goes to show the value of testing as well
> > as code review, I guess...)
> >
> > Nadia, Manfred, any thoughts?
> >
> > Located-by: Milton Miller II <miltonm@xxxxxxxxxxxxxx>
> > Tested-by: Milton Miller II <miltonm@xxxxxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> Hm, looks like something we really want to see fixed in
> 2.6.29-final, right?

This was located in real testing, so I agree that it is pretty high
priority. So this patch should go into 2.6.29.
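
For anyone following along, the ordering problem can be sketched in
user-space C. This is only an illustration and not the kernel's IDR
code: struct layer, published, defer_free(), remove_buggy(),
remove_fixed() and the other names below are hypothetical stand-ins,
and the "grace period" is simulated with a simple list. In the kernel
the unpublish step would use rcu_assign_pointer() and the deferral
would go through call_rcu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one IDR layer (not the kernel's struct). */
struct layer {
    int payload;
    struct layer *next_deferred;    /* link in the deferred-free list */
};

/* The pointer that new readers follow to reach the layer. */
struct layer *published;

/* Layers queued for freeing once the simulated grace period ends. */
struct layer *deferred_list;

/* Allocate a layer and make it reachable by readers. */
void publish_new(int payload)
{
    struct layer *l = malloc(sizeof(*l));

    assert(l != NULL);
    l->payload = payload;
    l->next_deferred = NULL;
    published = l;
}

/* Simulated call_rcu(): queue the layer now, free it later. */
void defer_free(struct layer *l)
{
    l->next_deferred = deferred_list;
    deferred_list = l;
}

/* What a newly arriving reader would pick up. */
struct layer *reader_lookup(void)
{
    return published;
}

/* Returns 1 if the layer is already queued for freeing, else 0. */
int is_deferred(const struct layer *l)
{
    const struct layer *p;

    for (p = deferred_list; p != NULL; p = p->next_deferred)
        if (p == l)
            return 1;
    return 0;
}

/* Simulated end of grace period: free everything that was queued. */
void grace_period_end(void)
{
    while (deferred_list != NULL) {
        struct layer *l = deferred_list;

        deferred_list = l->next_deferred;
        free(l);
    }
}

/*
 * Buggy order: queue for freeing first, unpublish second.
 * Returns what a reader arriving between the two steps would see.
 */
struct layer *remove_buggy(void)
{
    struct layer *l = published;
    struct layer *seen;

    defer_free(l);
    seen = reader_lookup();     /* new reader still finds the layer */
    published = NULL;
    return seen;
}

/*
 * Fixed order: unpublish first, queue for freeing second.
 * A reader arriving between the two steps finds nothing.
 */
struct layer *remove_fixed(void)
{
    struct layer *l = published;
    struct layer *seen;

    published = NULL;
    seen = reader_lookup();     /* new reader sees NULL */
    defer_free(l);
    return seen;
}
```

With the buggy order, the reader in the window gets a pointer to a
layer that is already queued for freeing, so the grace period no
longer protects it. With the fixed order, only readers that started
before the unpublish can hold the pointer, and those are exactly the
readers a grace period waits for.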

The priority of the remaining as-yet-unknown fixes depends on their
complexity and risk.

Thanx, Paul