Re: 2.6.19-rc5-mm2 (and earlier): WARNING at lib/kobject.c:172 kobject_init() on resume from disk

From: Rafael J. Wysocki
Date: Sun Nov 26 2006 - 07:25:24 EST


On Sunday, 26 November 2006 00:43, Greg KH wrote:
> On Sun, Nov 26, 2006 at 12:15:52AM +0100, Rafael J. Wysocki wrote:
> > On Saturday, 25 November 2006 23:20, Rafael J. Wysocki wrote:
> > > On Wednesday, 22 November 2006 22:44, Andrew Morton wrote:
> > > > On Wed, 22 Nov 2006 22:07:06 +0100
> > > > "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I get similar traces on every resume from disk on SMP systems:
> > > > >
> > > > > WARNING at lib/kobject.c:172 kobject_init()
> > > > >
> > > > > Call Trace:
> > > > > [<ffffffff80265559>] dump_trace+0xaa/0x3fd
> > > > > [<ffffffff802658e8>] show_trace+0x3c/0x52
> > > > > [<ffffffff80265913>] dump_stack+0x15/0x17
> > > > > [<ffffffff8031c1ad>] kobject_init+0x3f/0x8a
> > > > > [<ffffffff8031c298>] kobject_register+0x1a/0x3e
> > > > > [<ffffffff8038e5b4>] sysdev_register+0x5f/0xec
> > > > > [<ffffffff8026af39>] mce_create_device+0x79/0x103
> > > > > [<ffffffff8026afed>] mce_cpu_callback+0x2a/0xbd
> > > > > [<ffffffff8026112f>] notifier_call_chain+0x29/0x3e
> > > > > [<ffffffff8028e809>] raw_notifier_call_chain+0x9/0xb
> > > > > [<ffffffff80299f18>] _cpu_up+0xc2/0xd5
> > > > > [<ffffffff80299f56>] cpu_up+0x2b/0x42
> > > > > [<ffffffff80299fbb>] enable_nonboot_cpus+0x4e/0x9b
> > > > > [<ffffffff802a35da>] snapshot_ioctl+0x1a0/0x5d2
> > > > > [<ffffffff8023d9cd>] do_ioctl+0x5e/0x77
> > > > > [<ffffffff8022d785>] vfs_ioctl+0x256/0x273
> > > > > [<ffffffff8024770b>] sys_ioctl+0x5f/0x82
> > > > > [<ffffffff8025811e>] system_call+0x7e/0x83
> > > > > DWARF2 unwinder stuck at system_call+0x7e/0x83
> > > > > Leftover inexact backtrace:
> > > > >
> > > > > False positive?
> > > > >
> > > >
> > > > Don't know. The changelog in
> > > > http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-01-driver/kobject-warn.patch
> > > > is pretty pathetic.
> > > >
> > > > Perhaps mce_remove_device() isn't being called.
> > >
> > > I've added some debugging code into mce_remove_device() which shows that it is
> > > being called when the CPU is removed.
> > >
> > > Investigation continues.
> >
> > Ah, I think the problem is that the last user of a kobject doesn't decrease
> > the refcount in kref_put(), so if the same kobject is registered for the
> > second time, the refcount is still one and the warning triggers.
>
> But the last user of the kobject should cause the kobject to be freed
> and disappear. It should not hang around, right?
>
> Oh yuck, this is a static struct device, one per cpu :(
>
> > So, it seems, this is a false positive and I think we can get rid of it in the
> > following way (tested and works):
> >
> > ---
> > Make mce_remove_device() clean up the kobject in per_cpu(device_mce, cpu)
> > after it has been unregistered.
> >
> > Signed-off-by: Rafael J. Wysocki <rjw@xxxxxxx>
> > ---
> > arch/x86_64/kernel/mce.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > Index: linux-2.6.19-rc6-mm1/arch/x86_64/kernel/mce.c
> > ===================================================================
> > --- linux-2.6.19-rc6-mm1.orig/arch/x86_64/kernel/mce.c 2006-11-25 23:56:08.000000000 +0100
> > +++ linux-2.6.19-rc6-mm1/arch/x86_64/kernel/mce.c 2006-11-26 00:15:34.000000000 +0100
> > @@ -651,6 +651,7 @@ static void mce_remove_device(unsigned i
> >  	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_tolerant);
> >  	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_check_interval);
> >  	sysdev_unregister(&per_cpu(device_mce,cpu));
> > +	per_cpu(device_mce, cpu).kobj = (struct kobject){ 0 };
>
> memset the kobj instead perhaps? Yeah, I guess this copy will work, as
> the compiler turns it into a memset.

Patch with the memset follows.

BTW, it seems to me that the WARN_ON in kref_get will never trigger, will it?

Greetings,
Rafael


---
Make mce_remove_device() clean up the kobject in per_cpu(device_mce, cpu)
after it has been unregistered.

Signed-off-by: Rafael J. Wysocki <rjw@xxxxxxx>
---
arch/x86_64/kernel/mce.c | 1 +
1 file changed, 1 insertion(+)

Index: linux-2.6.19-rc6-mm1/arch/x86_64/kernel/mce.c
===================================================================
--- linux-2.6.19-rc6-mm1.orig/arch/x86_64/kernel/mce.c 2006-11-26 11:31:38.000000000 +0100
+++ linux-2.6.19-rc6-mm1/arch/x86_64/kernel/mce.c 2006-11-26 12:02:10.000000000 +0100
@@ -651,6 +651,7 @@ static void mce_remove_device(unsigned i
 	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_tolerant);
 	sysdev_remove_file(&per_cpu(device_mce,cpu), &attr_check_interval);
 	sysdev_unregister(&per_cpu(device_mce,cpu));
+	memset(&per_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
 }

/* Get notified when a cpu comes on/off. Be hotplug friendly. */
