Re: [2.6.26] kobject_add_internal failed for 2:0 with -EEXIST / unable to handle kernel NULL pointer dereference in sysfs_create_link

From: Kay Sievers
Date: Thu Oct 30 2008 - 19:07:14 EST


On Thu, Oct 30, 2008 at 11:55, Folkert van Heusden
<folkert@xxxxxxxxxxxxxx> wrote:
>> >> >> >> > While running my http://vanheusden.com/pyk/ script (which randomly
>> >> >> >> > inserts and removes modules) I triggered the following oops in a 2.6.26
>> >> >> >> > kernel on an IBM xSeries 260. This oops (in fact no oops at all) did not
>> >> >> >> > get triggered in a 2.6.18 kernel on that system.
>> >> >> >> >
>> >> >> >> > [ 42.507375] FDC 0 is a National Semiconductor PC87306
>> >> >> >> > [ 42.509057] kobject_add_internal failed for 2:0 with -EEXIST, don't try to register things with the same name in the same directory.
>> >> >> >> > [ 42.509291] Pid: 5301, comm: modprobe Not tainted 2.6.26-1-amd64 #1
>> >> >> >> > [ 42.509431]
>> >> >> >> > [ 42.509433] Call Trace:
>> >> >> >> > [ 42.509685] [<ffffffff8031b031>] kobject_add_internal+0x13f/0x17e
>> >> > ...
>> >> >> >> > [ 42.511519] [<ffffffff8027d23b>] bdi_register+0x57/0xb4
>> >> >> >>
>> >> >> >> Looks like bdi sees two devices with the same devnum, or didn't
>> >> >> >> clean up an old entry. What does: ls -l "/sys/class/bdi/" print?
>> >> >> >
>> >> >> > The following:
>> >> >> > folkert@debiantesthw:~$ ls -l /sys/class/bdi/
>> >> >> > drwxr-xr-x 3 root root 0 2008-10-28 18:32 2:0
>> >> >> > drwxr-xr-x 3 root root 0 2008-10-28 18:32 2:1
>> >> >>
>> >> >> Oh, you are running the old sysfs layout without symlinks. Care to
>> >> >> tell where the "device" link in these directories points to?
>> >> >
>> >> > None exist:
>> >> > folkert@debiantesthw:~$ ls -la /sys/class/bdi/*/device
>> >> > ls: cannot access /sys/class/bdi/*/device: No such file or directory
>> >>
>> >> Ah, sorry. Seems the bdi stuff never got to pass the usual parent
>> >> device with the device registration, to let the bdi device show up at
>> >> the right place in the device tree.
>> >>
>> >> Let's see what current devices on your box have the major 2:
>> >> find /sys -name dev | xargs grep '^2:'
>> >
>> > /sys/block/fd0/dev:2:0
>> > /sys/block/fd1/dev:2:1
>> >
>> > As my script does modprobe/rmmod in parallel (4 processes), maybe it is a
>> > conflict of one process doing a modprobe of floppy while the other does
>> > an rmmod? Or both doing a modprobe?
>>
>> Might be, yes. If you just boot up, and don't run your modprobe/rmmod
>> script, does the box have 2 floppy devices in /sys too?
>
> Yes it does. One physical drive.

This seems to happen whenever there are multiple floppies; I can
reproduce it here with qemu. It does not seem to be related to
modprobing. mtd devices suffer from the same problem, as bug reports
show.

It might be a bug in bdi. Floppies appear to share a single queue, and
the bdi structure lives in the queue. We register a bdi device for every
device, but because the queue is shared, each registration overwrites
the dev_t previously recorded in the bdi structure. When the bdi device
is unregistered, none of the earlier devices that used the same queue
are removed.
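
The suspected sequence can be illustrated with a small userspace toy
model. This is only a sketch of the hypothesis above, not kernel code:
the names SysfsDir, Bdi, model_register, model_unregister and
simulate_shared_queue are made up for illustration, and the model
assumes that registration simply overwrites a single recorded device
number and that unregistration removes only the recorded one.

```python
import errno


class SysfsDir:
    """Toy stand-in for one sysfs directory, e.g. /sys/class/bdi/."""

    def __init__(self):
        self.names = set()

    def add(self, name):
        # Duplicate names in the same directory are rejected.
        if name in self.names:
            return -errno.EEXIST
        self.names.add(name)
        return 0

    def remove(self, name):
        self.names.discard(name)


class Bdi:
    """Hypothetical bdi with a single slot for the registered device number."""

    def __init__(self):
        self.dev = None


def model_register(sysfs, bdi, devt):
    ret = sysfs.add(devt)
    if ret == 0:
        bdi.dev = devt  # assumption: overwrites whatever was recorded before
    return ret


def model_unregister(sysfs, bdi):
    # Assumption: only the currently recorded device number is removed.
    if bdi.dev is not None:
        sysfs.remove(bdi.dev)
        bdi.dev = None


def simulate_shared_queue():
    sysfs = SysfsDir()
    shared_bdi = Bdi()  # one bdi, because both drives share one queue

    # First load: both floppies register through the same bdi.
    first_load = [model_register(sysfs, shared_bdi, d) for d in ("2:0", "2:1")]

    # Unload: one unregister call per drive, but only one entry is recorded.
    for _ in ("2:0", "2:1"):
        model_unregister(sysfs, shared_bdi)
    leftover = set(sysfs.names)

    # Second load: the first registration hits the stale entry.
    reload_result = model_register(sysfs, shared_bdi, "2:0")
    return first_load, leftover, reload_result


if __name__ == "__main__":
    first_load, leftover, reload_result = simulate_shared_queue()
    print("first load:", first_load)
    print("left behind after unregister:", sorted(leftover))
    print("re-register 2:0:", reload_result)
```

In this model the entry for 2:0 is left behind after both unregister
calls, so registering 2:0 again fails with -EEXIST, which matches the
name in the reported kobject_add_internal message. Whether the real
bdi code behaves this way is exactly what needs checking.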

Peter, could you please check whether something like this can happen?

Thanks,
Kay