Re: [PATCH v4 1/2] of: create of_root if no dtb provided

From: Rob Herring
Date: Wed Mar 27 2024 - 17:57:54 EST


On Wed, Mar 27, 2024 at 2:47 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On Wed, Mar 27, 2024 at 01:38:13PM -0500, Rob Herring wrote:
> > On Wed, Mar 27, 2024 at 9:40 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > >
> > > On 3/27/24 06:11, Rob Herring wrote:
> > > > On Wed, Mar 20, 2024 at 3:06 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > > >>
> > > >> On 3/20/24 12:14, Rob Herring wrote:
> > > >>> On Mon, Mar 18, 2024 at 4:31 PM Guenter Roeck <linux@xxxxxxxxxxxx>
> > > >>> wrote:
> > > >>>>
> > > >>>> On 3/18/24 12:26, Rob Herring wrote:
> > > >>>>> +Stephen
> > > >>>>>
> > > >>>>> On Mon, Mar 18, 2024 at 12:09 PM Guenter Roeck <linux@xxxxxxxxxxxx>
> > > >>>>> wrote:
> > > >>>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> On Fri, Mar 17, 2023 at 12:34:14AM -0500, Frank Rowand wrote:
> > > >>>>>>> When enabling CONFIG_OF on a platform where of_root is not
> > > >>>>>>> populated by firmware, we end up without a root node. In order to
> > > >>>>>>> apply overlays and create subnodes of the root node, we need one.
> > > >>>>>>> Create this root node by unflattening an empty builtin dtb.
> > > >>>>>>>
> > > >>>>>>> If firmware provides a flattened device tree (FDT) then the FDT is
> > > >>>>>>> unflattened via setup_arch(). Otherwise setup_of(), which is
> > > >>>>>>> called immediately after setup_arch(), will create the default root
> > > >>>>>>> node if it does not exist.
> > > >>>>>>>
> > > >>>>>>> Signed-off-by: Frank Rowand <frowand.list@xxxxxxxxx>
> > > >>>>>>
> > > >>>>>> This patch results in a crash on nios2.
> > > >>>>>
> > > >>>>> This patch was never applied. I assume you meant a later version of
> > > >>>>> it that did get applied.
> > > >>>>>
> > > >>>>>>
> > > >>>>>> Building nios2:10m50-ghrd:10m50_defconfig:10m50_devboard.dts ..
> > > >>>>>> running ...R failed (crashed)
> > > >>>>>
> > > >>>>> Booting with DT?
> > > >>>>>
> > > >>>>>> ------------ qemu log: earlycon: uart8250 at MMIO32 0x18001600
> > > >>>>>> (options '') printk: legacy bootconsole [uart8250] enabled Linux
> > > >>>>>> version 6.8.0-11409-gf6cef5f8c37f (groeck@desktop) (nios2-linux-gcc
> > > >>>>>> (GCC) 11.4.0, GNU ld (GNU Binutils) 2.40) #1 Sun Mar 17 23:38:59 PDT
> > > >>>>>> 2024 Kernel panic - not syncing: early_init_dt_alloc_memory_arch:
> > > >>>>>> Failed to allocate 72 bytes align=0x40 ---[ end Kernel panic - not
> > > >>>>>> syncing: early_init_dt_alloc_memory_arch: Failed to allocate 72
> > > >>>>>> bytes align=0x40 ]---
> > > >>>>>
> > > >>>>> nios2 looks utterly broken to me. This change should be a nop unless
> > > >>>>> initial_boot_params is NULL. It looks like it is possible for r6 (dtb
> > > >>>>> address) to be 0 depending on kconfig options, but that would have
> > > >>>>> skipped copying and unflattening which would then panic in
> > > >>>>> setup_cpuinfo(). If initial_boot_params is not NULL, then the same
> > > >>>>> early_init_dt_alloc_memory_arch() calls should fail when copying the
> > > >>>>> DT. So I don't see how nios2 booting with DT ever worked.
> > > >>>>>
> > > >>>>
> > > >>>> For nios2, in early_init_devtree():
> > > >>>>
> > > >>>> void __init early_init_devtree(void *params) { __be32 *dtb = (u32
> > > >>>> *)__dtb_start; ... if (be32_to_cpu((__be32) *dtb) == OF_DT_HEADER)
> > > >>>> params = (void *)__dtb_start;
> > > >>>>
> > > >>>> That worked fine until this patch. Starting with this patch,
> > > >>>> __dtb_start always points to a valid empty devicetree blob, which
> > > >>>> overrides the devicetree blob passed to early_init_devtree(). This
> > > >>>> causes the problem.
> > > >>>
> > > >>> With an external DTB, it doesn't boot with or without this patch. It
> > > >>> just dies in different spots. Before it just skipped any memory
> > > >>
> > > >> No, that is incorrect.
> > > >
> > > > Well, I can tell you it doesn't boot for me. So I must be doing something
> > > > different from your setup.
> > > >
> > >
> > > Maybe you have OF_UNITTEST enabled and it indeed results in the problem you
> > > mention below. I don't have it enabled because it produces various
> > > backtraces which would hide real problems.
> >
> > I thought of that, but I don't think I did. What I suspect is the external
> > dtb is at address 0.

Testing again, I built 10m50_defconfig without modification and ran
qemu (from debian testing):

qemu-system-nios2 -dtb
build-nios2/arch/nios2/boot/dts/10m50_devboard.dtb -kernel
build-nios2/vmlinux --nographic

I had to enable CONFIG_NIOS2_PASS_CMDLINE (which really means pass
cmdline, dtb, and initrd from bootloader) and it works and regresses
as you report.

> > > >> Up to this patch it booted just fine with an external dtb using the
> > > >> "-initrd" command line argument, and I explained to you above why this
> > > >> is the case.
> > > >
> > > > What does -initrd have to do with anything? Does that shift where the
> > > > external dtb is placed or something?
> > > >
> > >
> > > Nothing. I meant to say -dtb.
> > >
> > > > I think I see the issue. __dtb_start points to the start of *all*
> > > > built-in DTBs, not a specific one. In this case, arc, csky, loongarch,
> > > > mips, openrisc, riscv, sh, and xtensa may all be broken too (if one picks
> > > > the magic combination of booting modes and kconfig options). I
> > >
> > > No.
> > >
> > > - arc only picks the internal dtb if use_embedded_dtb is true. This flag is
> > > only set if there is no external dtb, or if the external dtb does not
> > > provide a valid machine description.
> >
> > Right, but when it does pick the internal dtb, it is expecting its dtb at
> > __dtb_start. What I'm saying is that's never been a good or safe assumption.
> > We just happened to add another case to trigger it. The only reliable way to
> > get a built-in DTB is if foo.dtb is built-in, then use __dtb_foo_begin to get
> > its address. That's what some MIPS platforms with multiple DTBs do.
> >
>
> I may be missing something, but it seems to me that most if not all
> platforms with support for configurable built-in dtbs use __dtb_start
> to get its address.

That was my concern as that only points to the 1st DTB and nothing
prevents there being more than one. But I think link order saves all
of them after all.

> > > - openrisc only picks the internal dtb if no external dtb is provided -
> > > riscv only picks the internal dtb if CONFIG_BUILTIN_DTB is enabled. - sh
> > > only used the internal dtb if CONFIG_USE_BUILTIN_DTB is enabled. - xtensa
> > > only picks the internal dtb if there is no external dtb.
> > >
> > > However, nios2 picks the internal dtb _even if_ an external dtb is provided
> > > if there is an internal dtb. In other words, it prefers the internal dtb
> > > over the external dtb. All other architectures prefer the external dtb over
> > > the internal dtb.
> >
> > Thanks for the analysis. I had started and abandoned common support (mostly
> > Kconfig options) for built-in dtbs years ago. I decided against it because it
> > is not something we want to encourage (as the boot dtb). In the meantime,
> > we've gained new architectures that have added it. Sigh. So now I'm
> > reconsidering something common (though not for v6.9).
> >
> > >
> > > > would expect all these cases have been broken forever if the DT unittest
> > > > is enabled as it too adds a built-in dtb. But I would also
> > >
> > > Even if that is correct for nios2, that hardly seems to be an argument to
> > > break nios2 boot with external dtb unconditionally.
> >
> > That wasn't an argument for breaking it. Using an external dtb should really
> > be the default and strongly preferred though.
> >
> > I'm still not sure how to fix this easily for 6.9. Something like what
> > microblaze does which puts the boot dtb under a consistent symbol name. Or
> > perhaps we could iterate thru the built-in dtbs and skip ones without
> > top-level compatible.
> >
>
> I did submit a patch to only override the external dtb if support for the
> internal dtb is enabled. I copied you on it, so you should have seen it.
>
> https://lore.kernel.org/linux-kernel/20240322065419.162416-1-linux@xxxxxxxxxxxx/

Now reviewed, thanks. Sadly, I rarely see anything outside the normal
workflow of what I can filter out. I'm copied pretty much 1:1 with
what is sent to the DT list which is a fire hose.

Rob