Re: slub/debugobjects: lockup when freeing memory

From: Paul E. McKenney
Date: Thu Jun 19 2014 - 16:42:32 EST


On Thu, Jun 19, 2014 at 04:32:38PM -0400, Sasha Levin wrote:
> On 06/19/2014 04:29 PM, Paul E. McKenney wrote:
> > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote:
> >> On Thu, 19 Jun 2014, Paul E. McKenney wrote:
> >>
> >>> On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
> >>>> On Thu, 19 Jun 2014, Sasha Levin wrote:
> >>>>
> >>>>> [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> >>>>> [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
> >>>>> [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> >>>>> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> >>>>> [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
> >>>>> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> >>>>> [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
> >>>>> [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> >>>>> [ 690.770137] debug_object_init (lib/debugobjects.c:365)
> >>>>> [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
> >>>>> [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
> >>>>> [ 690.770137] ? discard_slab (mm/slub.c:1486)
> >>>>> [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
> >>>>
> >>>> __call_rcu does a slab allocation? This means __call_rcu can no longer be
> >>>> used in slab allocators? What happened?
> >>>
> >>> My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
> >>> call_rcu_bh(), or call_srcu().
> >>>
> >>> Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
> >>> That would be unfortunate...
> >>
> >> Well, no. Look at the callchain:
> >>
> >> __call_rcu
> >> debug_object_activate
> >> rcuhead_fixup_activate
> >> debug_object_init
> >> kmem_cache_alloc
> >>
> >> So call rcu activates the object, but the object has no reference in
> >> the debug objects code so the fixup code is called which inits the
> >> object and allocates a reference ....
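
For anyone following along, the chain above can be modeled roughly in
userspace. This is only a sketch of the control flow as described, not the
kernel code: every model_* name is hypothetical, the fixed-size array stands
in for the debugobjects tracking state, and the counter stands in for
kmem_cache_alloc() calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the call chain; NOT kernel code.
 * All model_* names are hypothetical stand-ins. */
#define MODEL_MAX_TRACKED 8

static const void *model_tracked[MODEL_MAX_TRACKED];
static size_t model_nr_tracked;
static int model_tracker_allocs;	/* stands in for kmem_cache_alloc() calls */

/* Does the tracker already hold a reference for this object? */
static bool model_is_tracked(const void *obj)
{
	for (size_t i = 0; i < model_nr_tracked; i++)
		if (model_tracked[i] == obj)
			return true;
	return false;
}

/* Stands in for debug_object_init(): allocates a tracking entry. */
static bool model_object_init(const void *obj)
{
	if (model_nr_tracked == MODEL_MAX_TRACKED)
		return false;
	model_tracker_allocs++;
	model_tracked[model_nr_tracked++] = obj;
	return true;
}

/* Stands in for debug_object_activate() plus the activate fixup:
 * an object the tracker has never seen gets initialized on the spot,
 * and that initialization is what allocates. */
static bool model_object_activate(const void *obj)
{
	if (!model_is_tracked(obj))
		return model_object_init(obj);	/* fixup path */
	return true;
}

/* Stands in for __call_rcu(); returns how many tracker allocations
 * this one call triggered. */
static int model_call_rcu(const void *head)
{
	int before = model_tracker_allocs;

	assert(head != NULL);
	model_object_activate(head);
	return model_tracker_allocs - before;
}
```

In this model the first model_call_rcu() on a given head costs one
allocation and later calls on the same head cost none, which is the
allocation showing up under __call_rcu in the trace.
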
> > OK, got it. And you are right, call_rcu() has done this for a very
> > long time, so not sure what changed.
>
> It's probably my fault. I've introduced clone() and unshare() fuzzing.
>
> Those two are full of issues and I've been holding off on enabling them
> until the rest of the kernel could survive trinity for more than an hour.

Well, that might explain why I haven't seen it in my testing. ;-)

Thanx, Paul
