Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions

From: Atish Kumar Patra
Date: Wed Aug 09 2023 - 05:01:27 EST


On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@xxxxxxxxxx> wrote:
>
> On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote:
> > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > >
> > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add
> > > the "no-map" property to the reserved memory nodes for the regions it
> > > has protected using PMPs.
> > >
> > > Our existing fix sweeping hibernation under the carpet by marking it
> > > NONPORTABLE is insufficient as there are other ways to generate
> > > accesses to these reserved memory regions, as Petr discovered [1]
> > > while testing crash kernels & kdump.
> > >
> > > Intercede during the boot process when the afflicted versions of OpenSBI
> > > are present & set the "no-map" property in all "mmode_resv" nodes before
> > > the kernel does its reserved memory region initialisation.
> > >
> >
> > We have different mechanisms of DT being passed to the kernel.
> >
> > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g.
> > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it
> > passes to the next stage.
> > In this case, M-mode runtime firmware gets a chance to update the
> > no-map property in DT that the kernel can parse.
> >
> > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper).
> > Any DT patching done by the M-mode firmware is useless. If these DTBs
> > don't have the no-map
> > property, hibernation or EFI booting will have issues as well.
> >
>
> > We are trying to solve only one part of problem #1 in this patch.
>
> Correct.
>
> If someone's second stage is also providing an incorrect devicetree
> then, yeah, this approach would fall apart - but it's the firmware
> provided devicetree being incorrect that I am trying to account for
> here. If a person incorrectly constructed one, I am not really sure what
> we can do for them, they incorrect described their hardware /shrug
> My patch should of course help in some of the scenarios you mention above
> if the name of the reserved memory region from OpenSBI is propagated by
> the second-stage bootloader, but that is just an extension of case 1,
> not case 2.
>
> > I
> > don't think any other M-mode runtime firmware patches DT with no-map
> > property as well.
> > Please let me know if I am wrong about that. The problem is not
> > restricted to [v0.8 to v1.3) of OpenSBI.
>
> It comes down to Alex's question - do we want to fix this kind of
> firmware issue in the kernel? Ultimately this is a policy decision that
> "somebody" has to make. Maybe the list of firmwares that need this

IMO, we shouldn't as this is a slippery slope. Kernel can't fix every
firmware bug by having erratas.
I agree with your point below about firmware in shipping products. I
am not aware of any official products shipping anything other than
OpenSBI
either. However, I have seen users using other firmwares in their dev
environment.
IMHO, this approach sets a bad precedent for the future especially
when it only solves one part of the problem.
We shouldn't hide firmware bugs in the kernel when an upgraded
firmware is already available.

This bug is well documented in various threads and fixed in the latest
version of OpenSBI.
I am assuming other firmwares will follow it as well.

Anybody facing hibernation or efi related booting issues should just
upgrade to the latest version of firmware (e.g OpenSBI v1.3)
Latest version of Qemu will support(if not happened already) the
latest version of OpenSBI.

This issue will only manifest in kernels 6.4 or higher. Any user
facing these with the latest kernel can also upgrade the firmware.
Do you see any issue with that ?

> grows, but hopefully this bug will be fixed everywhere going forward,
> since, as you say, we now have a document.
> The reason I've only targeted OpenSBI here is that it's what is being
> shipped on people's SBCs etc by default & is also what is in the
> versions of QEMU that people seem to be using. Unless there's some
> example that I am missing, other firmwares are either not affected or
> not actually being shipped on to consumers?
>
> > The booting doc now says that no-map property must be set and any
> > usage of DTBs without that (via #1 or #2) will have unintended
> > consequences.
>
> Which is great. It unfortunately does not help one iota with stuff
> that's already in the wild.
>
> Thanks,
> Conor.
>
> > > Reported-by: Song Shuai <suagrfillet@xxxxxxxxx>
> > > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@xxxxxxxxxxxxxx/
> > > Reported-by: JeeHeng Sia <jeeheng.sia@xxxxxxxxxxxxxxxx>
> > > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8
> > > Reported-by: Petr Tesarik <petrtesarik@xxxxxxxxxxxxxxx>
> > > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@xxxxxxxxxxxxxxx/ [1]
> > > CC: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Conor Dooley <conor.dooley@xxxxxxxxxxxxx>
> > > ---
> > > arch/riscv/include/asm/sbi.h | 5 +++++
> > > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++-
> > > arch/riscv/mm/init.c | 3 +++
> > > 3 files changed, 49 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> > > index 5b4a1bf5f439..5360f3476278 100644
> > > --- a/arch/riscv/include/asm/sbi.h
> > > +++ b/arch/riscv/include/asm/sbi.h
> > > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type {
> > > #define SBI_ERR_ALREADY_STARTED -7
> > > #define SBI_ERR_ALREADY_STOPPED -8
> > >
> > > +/* SBI implementation IDs */
> > > +#define SBI_IMP_OPENSBI 1
> > > +
> > > extern unsigned long sbi_spec_version;
> > > struct sbiret {
> > > long error;
> > > @@ -259,6 +262,8 @@ struct sbiret {
> > > };
> > >
> > > void sbi_init(void);
> > > +void sbi_apply_reserved_mem_erratum(void *dtb_va);
> > > +
> > > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
> > > unsigned long arg1, unsigned long arg2,
> > > unsigned long arg3, unsigned long arg4,
> > > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> > > index c672c8ba9a2a..aeb27263fa53 100644
> > > --- a/arch/riscv/kernel/sbi.c
> > > +++ b/arch/riscv/kernel/sbi.c
> > > @@ -5,8 +5,10 @@
> > > * Copyright (c) 2020 Western Digital Corporation or its affiliates.
> > > */
> > >
> > > +#include <linux/acpi.h>
> > > #include <linux/bits.h>
> > > #include <linux/init.h>
> > > +#include <linux/libfdt.h>
> > > #include <linux/pm.h>
> > > #include <linux/reboot.h>
> > > #include <asm/sbi.h>
> > > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void)
> > > }
> > > EXPORT_SYMBOL_GPL(sbi_get_mimpid);
> > >
> > > +static long sbi_firmware_id;
> > > +static long sbi_firmware_version;
> > > +
> > > +/*
> > > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover
> > > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI,
> > > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation
> > > + * among other things.
> > > + */
> > > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa)
> > > +{
> > > + int child, reserved_mem;
> > > +
> > > + if (sbi_firmware_id != SBI_IMP_OPENSBI)
> > > + return;
> > > +
> > > + if (!acpi_disabled)
> > > + return;
> > > +
> > > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8)
> > > + return;
> > > +
> > > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory");
> > > + if (reserved_mem < 0)
> > > + return;
> > > +
> > > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) {
> > > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL);
> > > +
> > > + if (!strncmp(name, "mmode_resv", 10))
> > > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0);
> > > + }
> > > +};
> > > +
> > > void __init sbi_init(void)
> > > {
> > > int ret;
> > > @@ -596,8 +632,12 @@ void __init sbi_init(void)
> > > sbi_major_version(), sbi_minor_version());
> > >
> > > if (!sbi_spec_is_0_1()) {
> > > + sbi_firmware_id = sbi_get_firmware_id();
> > > + sbi_firmware_version = sbi_get_firmware_version();
> > > +
> > > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n",
> > > - sbi_get_firmware_id(), sbi_get_firmware_version());
> > > + sbi_firmware_id, sbi_firmware_version);
> > > +
> > > if (sbi_probe_extension(SBI_EXT_TIME)) {
> > > __sbi_set_timer = __sbi_set_timer_v02;
> > > pr_info("SBI TIME extension detected\n");
> > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> > > index 70fb31960b63..cb16bfdeacdb 100644
> > > --- a/arch/riscv/mm/init.c
> > > +++ b/arch/riscv/mm/init.c
> > > @@ -29,6 +29,7 @@
> > > #include <asm/tlbflush.h>
> > > #include <asm/sections.h>
> > > #include <asm/soc.h>
> > > +#include <asm/sbi.h>
> > > #include <asm/io.h>
> > > #include <asm/ptdump.h>
> > > #include <asm/numa.h>
> > > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void)
> > > * in the device tree, otherwise the allocation could end up in a
> > > * reserved region.
> > > */
> > > +
> > > + sbi_apply_reserved_mem_erratum(dtb_early_va);
> > > early_init_fdt_scan_reserved_mem();
> > >
> > > /*
> > > --
> > > 2.40.1
> > >