Re: [PATCH V1 1/3] Revert "RISC-V: mark hibernation as nonportable"

From: Palmer Dabbelt
Date: Mon Jun 26 2023 - 10:43:53 EST


On Mon, 26 Jun 2023 06:34:43 PDT (-0700), Conor Dooley wrote:
> On Sun, Jun 25, 2023 at 03:36:06PM -0700, Palmer Dabbelt wrote:
> > On Sun, 25 Jun 2023 15:15:14 PDT (-0700), Conor Dooley wrote:
> > > On Sun, Jun 25, 2023 at 11:09:21PM +0800, Song Shuai wrote:
> > > > On 2023/6/25 22:18, Conor Dooley wrote:
> > > > > On Sun, Jun 25, 2023 at 10:09:29PM +0800, Song Shuai wrote:
> > > > > > This reverts commit ed309ce522185583b163bd0c74f0d9f299fe1826.
> > > > > >
> > > > > > With the commit 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the
> > > > > > linear mapping") reverted, the MIN_MEMBLOCK_ADDR points to the kernel
> > > > > > load address which was placed at a PMD boundary.
> > > > >
> > > > > > And firmware always
> > > > > > correctly marks resident memory, or memory protected with PMP as
> > > > > > per the devicetree specification and/or the UEFI specification.
> > > > >
> > > > > But this is not true? The versions of OpenSBI that you mention in your
> > > > > cover letter do not do this.
> > > > > Please explain.
> > > >
> > > > At this time, OpenSBI [v0.8,v1.3) and edk2(RiscVVirt) indeed don't obey the
> > > > DT/UEFI spec. This statement is excerpted from "Reserved memory for resident
> > > > firmware" part from the upcoming riscv/boot.rst. It isn't accurate for now.
> > > > How about deleting this one?
> > >
> > > It is incorrect, so it will need to be removed, yes.
> > > Unfortunately writing a doc does not fix the existing implementations :(
> > >
> > > > Actually with 3335068f8721 reverted, the change of MIN_MEMBLOCK_ADDR can
> > > > avoid the mapping of firmware memory, I will make it clear in the next
> > > > version.
> > >
> > > To be honest, I'd like to see this revert as the final commit in a
> > > series that deals with the problem by actually reserving the regions,
> > > rather than a set of reverts that go back to how we were.
> > > I was hoping that someone who cares about hibernation support would be
> > > interested in working on that - *cough* starfive *cough*, although maybe
> > > they just fixed their OpenSBI and moved on.
> > > If there were no volunteers, my intention was to add a firmware erratum
> > > that would probe the SBI implementation & version IDs, and add a firmware
> > > erratum that'd parse the DT for the offending regions and reserve them.

> > Is there any actual use case for hibernation on these boards? Maybe it's
> > simpler to just add a "reserved regions actually work" sort of property and
> > then have new firmware set it -- that way we can avoid sorting through all
> > the old stuff nobody cares about and just get on with fixing the stuff
> > people use.
>
> What is "old stuff nobody cares about"? The first version of OpenSBI with
> the fix shipped only the other day, so effectively all current stuff has
> this problem. Certainly everything shipping from vendors at the moment
> has the problem, and probably whatever downstream, custom versions of
> OpenSBI also have it.

Ya, so "old stuff" is everything -- but that's all already broken, so nothing we can do about it. IIUC there's nothing shipping that functions correctly here, so it's just a matter of detecting everything before the bug.

> Also, the problem isn't just limited to hibernation apparently. I
> think it was mentioned in the cover letter that according to Rob,
> without being marked as no-map we could also see speculative access &
> potentially some of the memory debugging stuff walking these regions.

We've got a bunch of other problems around speculative accesses to these regions in M-mode, so we'll have to deal with it at some point anyway.

> I'm not sure how you'd intend communicating "reserved regions actually
> work", I figure you mean via DT?

Somewhere in DT. I hadn't thought about it a ton, just adding some property that says "this doesn't have the bug" was roughly the idea.

> I don't really see the benefit of adding a property for those who are
> behaving, if we can detect the versions of the one relevant SBI
> implementation that are broken at runtime. DT hat on, even less so.
> Perhaps I am missing your point, and there's another angle (like trying
> to per firmware code)?
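
A minimal sketch of the runtime check being discussed, assuming the only affected implementation is OpenSBI in the [v0.8, v1.3) range mentioned earlier in the thread. The helper name is hypothetical, not an existing kernel function. Its inputs are meant to be the values returned by the SBI base extension calls sbi_get_impl_id and sbi_get_impl_version. It also assumes OpenSBI reports implementation ID 1 and packs its version as (major << 16) | minor:

```c
#include <stdbool.h>
#include <stdint.h>

/* Implementation ID that OpenSBI reports via the SBI base extension. */
#define SBI_IMPL_ID_OPENSBI 1UL

/* OpenSBI version packing: major in the upper bits, minor in the low 16. */
#define OPENSBI_VERSION(major, minor) \
	(((unsigned long)(major) << 16) | (unsigned long)(minor))

/*
 * Hypothetical helper, for illustration only: returns true when the
 * firmware identifies itself as OpenSBI with a version in [v0.8, v1.3),
 * i.e. the range said in this thread not to mark its resident memory
 * as reserved/no-map.
 */
static inline bool sbi_firmware_may_lack_nomap(unsigned long impl_id,
					       unsigned long impl_version)
{
	if (impl_id != SBI_IMPL_ID_OPENSBI)
		return false;

	return impl_version >= OPENSBI_VERSION(0, 8) &&
	       impl_version < OPENSBI_VERSION(1, 3);
}
```

A fork that keeps reporting OpenSBI's implementation ID with an in-range version would be treated as affected even if it carries a fix, and a fork that changes the ID would be missed entirely.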

If it's easy to figure out which versions are broken that seems fine to me. My worry was just that it's hard to do (folks forking OpenSBI) and it might be easier to just