Re: 2.6.24-rc2-mm1

From: Kay Sievers
Date: Thu Nov 15 2007 - 12:16:11 EST


On Thu, 2007-11-15 at 09:06 -0800, Greg KH wrote:
> On Thu, Nov 15, 2007 at 04:14:07PM +0800, Dave Young wrote:
> > On Thu, Nov 15, 2007 at 03:38:13AM +0100, Kay Sievers wrote:
> > > On Thu, 2007-11-15 at 09:01 +0800, Dave Young wrote:
> > > > On Nov 15, 2007 5:27 AM, Kay Sievers <kay.sievers@xxxxxxxx> wrote:
> > > > > On Wed, 2007-11-14 at 20:19 +0100, Jiri Kosina wrote:
> > > > > > On Wed, 14 Nov 2007, Kay Sievers wrote:
> > > > > >
> > > > > > > Could it be an init-order problem, where something tries to use the
> > > > > > > block subsystem? Before it is initialized with:
> > > > > > > block/genhd.c :: subsys_initcall(genhd_device_init);
> > > > > > > If that's the case, we have an old bug that nobody noticed with static
> > > > > > > structures, which are zeroed that time, but definitely not properly
> > > > > > > initialized. I'll try to build loop non-modular now, and see if that
> > > > > > > makes the bug appear here.
> > > > >
> > > > > > my .config with which I reproduc this on 2.6.24-rc2-mm1 reliably can be
> > > > > > obtained from http://www.jikos.cz/jikos/junk/.config
> > > > >
> > > > > Hmm, that config doesn't do anything here, and if I make it boot, it
> > > > > does not show the bug.
> > > > >
> > > > > Could you possibly enable kobject debugging and see if that exposes
> > > > > something, maybe something goes wrong with the kset refcount and it gets
> > > > > released while in use.
> > > > >
> > > > Hi,
> > > > I would do that.
> > >
> > > That would be great.
> > >
> > > > BTW, The bug report as EIP at __list_add with CONFIG_DEBUG_LIST=y
> > >
> > > Yeah, that hints that the kset, which contains the list, is not
> > > allocated at the time it is used, or it is already released (kfree)
> > > again by some buggy logic.
> > Yes, I debugged it, there's some new findings.
> > It is freed by put_disk.
> > The floppy driver alloc_disk and then call put_disk without register_disk.
> > in kobject_cleanup line 551:
> > if(s)
> > kset_put(s);
> > Now the kset is set in alloc_disk after kobject_init, so it is not refereced yet.
> > please try this patch:
> >
> > block/genhd.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff -upr linux/block/genhd.c linux.new/block/genhd.c
> > --- linux/block/genhd.c 2007-11-15 15:59:11.000000000 +0800
> > +++ linux.new/block/genhd.c 2007-11-15 15:59:39.000000000 +0800
> > @@ -718,9 +718,9 @@ struct gendisk *alloc_disk_node(int mino
> > }
> > }
> > disk->minors = minors;
> > - kobject_init(&disk->kobj);
> > disk->kobj.kset = block_kset;
> > disk->kobj.ktype = &ktype_block;
> > + kobject_init(&disk->kobj);
> > rand_initialize_disk(disk);
> > INIT_WORK(&disk->async_notify,
> > media_change_notify_thread);
>
> Ah, yes, that is a bug, and it's my fault, let me go fix that in my
> patch series.

Oh, this is an old bug, that just didn't crash with the static ksets, it
did all the refcounting wrong, but nobody noticed it because the kset
data was still there.

Kay

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/