Re: [PATCH v6] amba: Remove deferred device addition

From: Isaac Manjarres
Date: Thu Aug 11 2022 - 13:12:47 EST


On Wed, Aug 10, 2022 at 09:09:26PM -0700, Guenter Roeck wrote:
> On 8/10/22 19:28, Saravana Kannan wrote:
> > On Wed, Aug 10, 2022 at 7:03 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > >
> > > On 8/10/22 14:47, Isaac Manjarres wrote:
> > > > On Wed, Aug 10, 2022 at 05:58:58AM -0700, Guenter Roeck wrote:
> > > > > On 8/9/22 20:33, Saravana Kannan wrote:
> > > > > [ ... ]
> > > > >
> > > > > >
> > > > > > Can you give me more details on the qemu configuration so I could try
> > > > > > to reproduce it?
> > > > >
> > > > > qemu-system-arm -M vexpress-a9 -kernel arch/arm/boot/zImage -no-reboot \
> > > > > -initrd rootfs-armv5.cpio -m 128 \
> > > > > --append "rdinit=/sbin/init console=ttyAMA0,115200" \
> > > > > -dtb arch/arm/boot/dts/vexpress-v2p-ca9.dtb \
> > > > > -nographic -monitor null -serial stdio
> > > > >
> > > > > using multi_v7_defconfig will hang nicely with your patch applied,
> > > > > and boot as expected without. This was with qemu v7.0, but I am
> > > > > sure older qemu versions will show the same behavior. The initrd
> > > > > used should not matter, but you'll find it at
> > > > > https://github.com/groeck/linux-build-test/blob/master/rootfs/arm-v7/rootfs-armv5.cpio.gz
> > > > >
> > > > > Guenter
> > > > >
> > > > Hi Guenter,
> > > >
> > > > Thanks for the information; I was able to reproduce this on my end as
> > > > well. The following changes fixed the problem for me. Can you please try
> > > > them out?
> > > >
> > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > > index 70f79fc71539..b377f18d8acc 100644
> > > > --- a/drivers/base/dd.c
> > > > +++ b/drivers/base/dd.c
> > > > @@ -881,6 +881,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
> > > > dev_dbg(dev, "Device match requests probe deferral\n");
> > > > dev->can_match = true;
> > > > driver_deferred_probe_add(dev);
> > > > + /*
> > > > + * Device can't match with the bus right now, so don't attempt
> > > > + * to match or bind with other drivers on the bus.
> > > > + */
> > > > + return ret;
> > > > } else if (ret < 0) {
> > > > dev_dbg(dev, "Bus failed to match device: %d\n", ret);
> > > > return ret;
> > > > @@ -1120,6 +1125,11 @@ static int __driver_attach(struct device *dev, void *data)
> > > > dev_dbg(dev, "Device match requests probe deferral\n");
> > > > dev->can_match = true;
> > > > driver_deferred_probe_add(dev);
> > > > + /*
> > > > + * Driver could not match with device, but may match with
> > > > + * another device on the bus.
> > > > + */
> > > > + return 0;
> > > > } else if (ret < 0) {
> > > > dev_dbg(dev, "Bus failed to match device: %d\n", ret);
> > > > return ret;
> > > >
> >
> > Thanks Isaac for debugging and providing a fix! It's surprising that
> > no one noticed this lack of "return"s for so long.
> >
> > >
> > > Most of the tests pass with the above applied, but there is still one crash.
> >
> > Good to hear it's mostly good now.
> >
> > > 8<--- cut here ---^M
> > > Unhandled fault: page domain fault (0x81b) at 0x00000122^M
> > > [00000122] *pgd=00000000^M
> > > Internal error: : 81b [#1] ARM^M
> > > Modules linked in:^M
> > > CPU: 0 PID: 1 Comm: swapper Tainted: G N 5.19.0+ #1^M
> > > Hardware name: ARM-Versatile (Device Tree Support)^M
> > > PC is at do_alignment_ldrstr+0x7c/0x164^M
> > > LR is at ai_half+0x0/0x4^M
> > > pc : [<c001fa00>] lr : [<c0ca1278>] psr: 60000113^M
> > > sp : c8811d68 ip : 00000003 fp : 00000004^M
> > > r10: c05433e4 r9 : c0ca1278 r8 : 00000801^M
> > > r7 : 00000122 r6 : 00000000 r5 : e5823000 r4 : c8811df8^M
> > > r3 : 00000100 r2 : c8811df8 r1 : 00000000 r0 : 00000122^M
> > > Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none^M
> > > Control: 00093177 Table: 01698000 DAC: 00000051^M
> > > Register r0 information: non-paged memory^M
> > > Register r1 information: NULL pointer^M
> > > Register r2 information: 2-page vmalloc region starting at 0xc8810000 allocated at kernel_clone+0x70/0x644^M
> > > Register r3 information: non-paged memory^M
> > > Register r4 information: 2-page vmalloc region starting at 0xc8810000 allocated at kernel_clone+0x70/0x644^M
> > > Register r5 information: non-paged memory^M
> > > Register r6 information: NULL pointer^M
> > > Register r7 information: non-paged memory^M
> > > Register r8 information: non-paged memory^M
> > > Register r9 information: non-slab/vmalloc memory^M
> > > Register r10 information: non-slab/vmalloc memory^M
> > > Register r11 information: non-paged memory^M
> > > Register r12 information: non-paged memory^M
> > > Process swapper (pid: 1, stack limit = 0x(ptrval))^M
> > > Stack: (0xc8811d68 to 0xc8812000)^M
> > > 1d60: c8811df8 e5823000 00000000 c00201dc 00000000 00000000^M
> > > 1d80: c0bf2fd4 60000013 00000000 c006c440 00000001 00000000 e5823000 c053cdc0^M
> > > 1da0: 00000000 c0be786c c1496d40 00000801 c0bec3f8 c001ffdc c8811df8 00000122^M
> > > 1dc0: c1496d40 c0bc4858 00000000 c001d66c c12ae950 00000001 c0c5f3b0 c1496d40^M
> > > 1de0: c05433e4 20000013 ffffffff c8811e2c c1496d40 c00095c4 00000001 00000001^M
> > > 1e00: 00000122 00000100 c24d8cc0 c1552c00 00000fff c0c5f230 c1496d40 c24d6e58^M
> > > 1e20: c0bc4858 00000000 c1497330 c8811e48 00000001 c05433e8 20000013 ffffffff^M
> > > 1e40: 00000053 c05433cc ffffffed c1552c00 00000fff c0c5f230 c1496d40 c24d6e58^M
> > > 1e60: c0bc4858 c053a784 c1552c00 c0c5f294 c0c5f294 c053a808 00000000 c1552c00^M
> > > 1e80: c0c5f294 c05d9f94 00000000 c0c5f294 c05d9f74 c0c5f230 c1496d40 c05d7984^M
> > > 1ea0: 00000000 c1540eac c7cd5fb4 c0be786c c0c5f294 c24d6e00 00000000 c05d8b7c^M
> > > 1ec0: c0acccd4 c0c92600 c1496d40 c0c5f294 c0c92600 c1496d40 00000000 c1496d40^M
> > > 1ee0: c0ca1000 c05dab3c c0bb0690 c0c92600 c1496d40 c000a8b0 00000000 00000000^M
> > > 1f00: c14d7e4b c0b5f800 000000bf c0047c5c c0b5e9b4 00000000 c0c92600 c088e078^M
> > > 1f20: c1496d40 00000007 c0c92600 c14d7e00 c0bc4874 c0b5e9b4 c0ca1000 c0bc4858^M
> > > 1f40: 00000000 c0be786c c0bdda90 00000008 c14d7e00 c0bc4878 c0b5e9b4 c0b93230^M
> > > 1f60: 00000007 00000007 00000000 c0b92400 00000000 000000bf 00000000 00000000^M
> > > 1f80: c088e340 00000000 00000000 00000000 00000000 00000000 00000000 c088e350^M
> > > 1fa0: 00000000 c088e340 00000000 c00084f8 00000000 00000000 00000000 00000000^M
> > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000^M
> > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000^M
> > > do_alignment_ldrstr from do_alignment+0x200/0x984^M
> > > do_alignment from do_DataAbort+0x38/0xb8^M
> > > do_DataAbort from __dabt_svc+0x64/0xa0^M
> > > Exception stack(0xc8811df8 to 0xc8811e40)^M
> > > 1de0: 00000001 00000001^M
> > > 1e00: 00000122 00000100 c24d8cc0 c1552c00 00000fff c0c5f230 c1496d40 c24d6e58^M
> > > 1e20: c0bc4858 00000000 c1497330 c8811e48 00000001 c05433e8 20000013 ffffffff^M
> > > __dabt_svc from __clk_put+0x34/0x174^M
> > > __clk_put from amba_read_periphid+0xd8/0x120^M
> >
> > What the heck! How is clk_put() failing. I'll leave this to Isaac too.
> >
> > > amba_read_periphid from amba_match+0x3c/0x84^M
> > > amba_match from __driver_attach+0x20/0x114^M
> > > __driver_attach from bus_for_each_dev+0x74/0xc0^M
> > > bus_for_each_dev from bus_add_driver+0x154/0x1e8^M
> > > bus_add_driver from driver_register+0x74/0x10c^M
> > > driver_register from do_one_initcall+0x8c/0x2fc^M
> > > do_one_initcall from kernel_init_freeable+0x190/0x220^M
> > > kernel_init_freeable from kernel_init+0x10/0x108^M
> > > kernel_init from ret_from_fork+0x14/0x3c^M
> > > Exception stack(0xc8811fb0 to 0xc8811ff8)^M
> > > 1fa0: 00000000 00000000 00000000 00000000^M
> > > 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000^M
> > > 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000^M
> > > Code: e3a00002 e782310c e8bd8070 e792310c (e4c03001) ^M
> > > ---[ end trace 0000000000000000 ]---^M
> > > Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b^M
> > >
> > > This is with versatile_defconfig and versatileab. Let me know if you need details.
> >
> > Is the dts that's generated called versatilbeab.dts?
> >
>
> Yes. Here is the qemu command line:
>
> qemu-system-arm -M versatileab -kernel arch/arm/boot/zImage -no-reboot \
> -initrd rootfs-armv5.cpio -m 128 \
> --append "rdinit=/sbin/init console=ttyAMA0,115200 earlycon" \
> -dtb arch/arm/boot/dts/versatile-ab.dtb -nographic -monitor null \
> -serial stdio
>
> Guenter

Thanks for testing that out, and the additional information, Guenter.
I'll look into the crash in __clk_put() and get back to you.

--Isaac