Re: [PATCH 0/2] PUD/PGDIR entries for linear mapping

From: Alex Ghiti
Date: Fri Jun 12 2020 - 09:17:48 EST


Le 6/12/20 Ã 8:59 AM, Alex Ghiti a ÃcritÂ:
Hi Atish,

Le 6/11/20 Ã 1:29 PM, Atish Patra a ÃcritÂ:
On Wed, Jun 10, 2020 at 11:51 PM Alex Ghiti<alex@xxxxxxxx>Â wrote:
Hi Atish,

Le 6/10/20 Ã 2:32 PM, Atish Patra a Ãcrit :
On Wed, Jun 3, 2020 at 8:36 AM Alexandre Ghiti<alex@xxxxxxxx>Â wrote:
This small patchset intends to use PUD/PGDIR entries for linear mapping
in order to better utilize TLB.

At the moment, only PMD entries can be used since on common platforms
(qemu/unleashed), the kernel is loaded at DRAM + 2MB which dealigns virtual
and physical addresses and then prevents the use of PUD/PGDIR entries.
So the kernel must be able to get those 2MB for PAGE_OFFSET to map the
beginning of the DRAM: this is achieved in patch 1.

I don't have in depth knowledge of how mm code works so this question
may be a completely
stupid one :). Just for my understanding,
As per my understanding, kernel will map those 2MB of memory but never use it.
How does the kernel ensure that it doesn't allocate any memory from those 2MB
memory if it is not marked as reserved?
Yes, a 1GB hugepage will cover those 2MB: I rely on the previous boot
stage to mark this region
as reserved if there is something there (like opensbi). Otherwise, the
kernel will indeed try to
allocate memory from there :)

In that case, this patch mandates that the firmware region has to be
mark "reserved"
the device tree so that the Linux kernel doesn't try to allocate
memory from there.
OpenSBI is already doing it from v0.7. Thus, any user using latest
OpenSBI can leverage
this patch for a better TLB utilization.


Note that *currently* OpenSBI v0.7 still adds the "no-map" property which prevents such optimization.

However, legacy previous boot stages(BBL) do not reserve this area via
DT which may
result in an unexpected crash. I am not sure how many developers still
use BBL though.

Few general suggestions to tackle this problem:
1. This mandatory requirement should be added to the booting document
so that any other
SBI implementation is also aware of it.
2. You may have to move the patch1 to a separate config so that any
users of legacy boot stages
can disable this feature.


IMHO, the region occupied by runtime services should be marked as reserved in the device-tree. So it seems redundant to add this as a requirement, I would rather consider its absence as a bug.

Even if I understand that this might break some system, I don't like the idea of a new config to support old "buggy" bootloaders: when will we be able to remove it ? We'll never know when people will stop using those bootloaders, so it will stay here forever...Where can I find the boot document you are talking about ? Can we simply state here that this kernel version will not be compatible with those bootloaders (we'll draw an exhaustive list here) ?


Ok, I have just found Documentation/riscv/boot-image-header.rst: could we imagine doing something like incrementing the version and use that as a hint in the kernel not to map the 2MB offset ? That's still legacy, but at least it does not require to recompile a kernel as the check would be done at runtime.



Alex


Alex


But furthermore, at the moment, the firmware (opensbi) explicitly asks the
kernel not to map the region it occupies, which is on those common
platforms at the very beginning of the DRAM and then it also dealigns
virtual and physical addresses. I proposed a patch here:

https://github.com/riscv/opensbi/pull/167

that removes this 'constraint' but *not* all the time as it offers some
kind of protection in case PMP is not available. So sometimes, we may
have a part of the memory below the kernel that is removed creating a
misalignment between virtual and physical addresses. So for performance
reasons, we must at least make sure that PMD entries can be used: that
is guaranteed by patch 1 too.

Finally the second patch simply improves best_map_size so that whenever
possible, PUD/PGDIR entries are used.

Below is the kernel page table without this patch on a 6G platform:

---[ Linear mapping ]---
0xffffc00000000000-0xffffc00176e00000 0x0000000080200000 5998M PMDÂÂÂÂ D A . . . W R V

And with this patchset + opensbi patch:

---[ Linear mapping ]---
0xffffc00000000000-0xffffc00140000000 0x0000000080000000ÂÂÂÂÂÂÂÂ 5G PUDÂÂÂÂ D A . . . W R V
0xffffc00140000000-0xffffc00177000000 0x00000001c0000000 880M PMDÂÂÂÂ D A . . . W R V

Alexandre Ghiti (2):
ÂÂÂ riscv: Get memory below load_pa while ensuring linear mapping is PMD
ÂÂÂÂÂ aligned
ÂÂÂ riscv: Use PUD/PGDIR entries for linear mapping when possible

ÂÂ arch/riscv/include/asm/page.h |Â 8 ++++
ÂÂ arch/riscv/mm/init.cÂÂÂÂÂÂÂÂÂ | 69 +++++++++++++++++++++++++++++------
ÂÂ 2 files changed, 65 insertions(+), 12 deletions(-)

--
2.20.1