Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions

From: Conor Dooley
Date: Thu Aug 10 2023 - 05:08:39 EST


On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote:
> On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@xxxxxxxxxx> wrote:
> >
> > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote:
> > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add
> > > > the "no-map" property to the reserved memory nodes for the regions it
> > > > has protected using PMPs.
> > > >
> > > > Our existing fix sweeping hibernation under the carpet by marking it
> > > > NONPORTABLE is insufficient as there are other ways to generate
> > > > accesses to these reserved memory regions, as Petr discovered [1]
> > > > while testing crash kernels & kdump.
> > > >
> > > > Intercede during the boot process when the afflicted versions of OpenSBI
> > > > are present & set the "no-map" property in all "mmode_resv" nodes before
> > > > the kernel does its reserved memory region initialisation.
> > > >
> > >
> > > We have different mechanisms of DT being passed to the kernel.
> > >
> > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g.
> > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it
> > > passes to the next stage.
> > > In this case, M-mode runtime firmware gets a chance to update the
> > > no-map property in DT that the kernel can parse.
> > >
> > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper).
> > > Any DT patching done by the M-mode firmware is useless. If these DTBs
> > > don't have the no-map
> > > property, hibernation or EFI booting will have issues as well.
> > >
> >
> > > We are trying to solve only one part of problem #1 in this patch.
> >
> > Correct.
> >
> > If someone's second stage is also providing an incorrect devicetree
> > then, yeah, this approach would fall apart - but it's the firmware
> > provided devicetree being incorrect that I am trying to account for
> > here. If a person incorrectly constructed one, I am not really sure what
> > we can do for them, they incorrect described their hardware /shrug
> > My patch should of course help in some of the scenarios you mention above
> > if the name of the reserved memory region from OpenSBI is propagated by
> > the second-stage bootloader, but that is just an extension of case 1,
> > not case 2.
> >
> > > I
> > > don't think any other M-mode runtime firmware patches DT with no-map
> > > property as well.
> > > Please let me know if I am wrong about that. The problem is not
> > > restricted to [v0.8 to v1.3) of OpenSBI.
> >
> > It comes down to Alex's question - do we want to fix this kind of
> > firmware issue in the kernel? Ultimately this is a policy decision that
> > "somebody" has to make. Maybe the list of firmwares that need this
>
> IMO, we shouldn't as this is a slippery slope. Kernel can't fix every
> firmware bug by having erratas.
> I agree with your point below about firmware in shipping products. I
> am not aware of any official products shipping anything other than
> OpenSBI either.

> However, I have seen users using other firmwares in their dev
> environment.

If someone's already changed their boards firmware, I have less sympathy
for them, as they should be able to make further changes. Punters buying
SBCs to install Fedora or Debian w/o having to consider their firmware
are who I am more interested in helping.

> IMHO, this approach sets a bad precedent for the future especially
> when it only solves one part of the problem.

Yeah, I'm certainly wary of setting an unwise precedent here.
Inevitably we will need to have firmware-related errata and it'd be good
to have a policy for what is (or more importantly what isn't
acceptable). Certainly we have said that known-broken version of OpenSBI
that T-Head puts in their SDK is not supported by the mainline kernel.
On the latter part, I'm perfectly happy to expand the erratum to cover
all affected firmwares, but I wasn't even sure if my fix worked
properly, hence the request for testing from those who encountered the
problem.

> We shouldn't hide firmware bugs in the kernel when an upgraded
> firmware is already available.

Just to note, availability of an updated firmware upstream does not
necessarily mean that corresponding update is possible for affected
hardware.

> This bug is well documented in various threads and fixed in the latest
> version of OpenSBI.
> I am assuming other firmwares will follow it as well.
>
> Anybody facing hibernation or efi related booting issues should just
> upgrade to the latest version of firmware (e.g OpenSBI v1.3)
> Latest version of Qemu will support(if not happened already) the
> latest version of OpenSBI.
>
> This issue will only manifest in kernels 6.4 or higher. Any user
> facing these with the latest kernel can also upgrade the firmware.
> Do you see any issue with that ?

I don't think it is fair to compare the ease of upgrading the kernel
to that required to upgrade a boards firmware, with the latter being
far, far more inconvenient on pretty much all of the boards that I have.

I'm perfectly happy to drop this series though, if people generally are
of the opinion that this sort of firmware workaround is ill-advised.
We are unaffected by it, so I certainly have no pressure to have
something working here. It's my desire not to be user-hostile that
motivated this patch.

Attachment: signature.asc
Description: PGP signature