[PATCH V1 0/3] Revert huge-paged linear mapping and its related fixups

From: Song Shuai
Date: Sun Jun 25 2023 - 10:10:54 EST


We have encountered these two issues about huge-paged linear mapping since v6.4-rc1:

1. Bug report: kernel paniced when system hibernates[1]

OpenSbi [v0.8,v1.3) set the PMP regions as !no-map, and the commit 3335068f8721
("riscv: Use PUD/P4D/PGD pages for the linear mapping") mapped them in linear mapping.
The hibernation process attempted to save/restore these mapped regions resulting in access fault.

This issue was temporarily fixed by commit ed309ce52218 ("RISC-V: mark hibernation as nonportable").
But as Alex's RFC and Rob's response stats in another thread [2] ,
"Hibernation is only one case. Speculative accesses could also occur."
So this fixing commit seems not the perfect answer to this issue.


2. Bug report: kernel paniced while booting (with UEFI )[3]

During the booting with UEFI, UEFI Memory Mapping overwrote the memblock.
The phys_ram_base was set as the end address of mmoderes0 (like 0x80040000 for 256 KiB mmoderes0@80000000),
which resulted the VA based on 2M-aligned PA was not 2M-aligned using va_pa_offset
(PAGE_OFFSET - phys_ram_base) to translate.

The best_map_size() from commit 3335068f8721 didn't check the virtual alignment
before choosing a map size. and then a "VA hole" was created where page faults always occurred.

This issue was fixed by commit 49a0a3731596 ("riscv: Check the virtual alignment before choosing a map size"),
But this fixing commit has a side-effect ("the possible third one" as Alex said in this thread).
There are numerous PTE allocations slowing down the boot time and consuming some system memory when UEFI booting
(Note that it's not involved when booting directly with OpenSbi, where phys_ram_base is the 2M-aligned base of DRAM).

In my test, compared with/out reverting both commit 49a0a3731596 and commit 3335068f8721,
I must wait ~20s for the linear mapping creation and mem_init_print_info() reported ~8M extra reserved memory.

To eliminate this side-effect, We should find a way to align VA and PA on a 2MB boundary.
The simplest way is reverting the commit 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping").



Using PUD/P4D/PGD pages for the linear mapping to improve the performance is marginal from a recent talk [4]
from Mike Rapoport. OpenSbi had marked all the PMP-protected regions as "no-map" [5] to practice this talk.

For all those reasons, let's revert these related commits:

- commit 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")
- commit 49a0a3731596 ("riscv: Check the virtual alignment before choosing a map size")
- commit ed309ce52218 ("RISC-V: mark hibernation as nonportable")

[1]: https://lore.kernel.org/linux-riscv/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@xxxxxxxxxxxxxx/
[2]: https://lore.kernel.org/linux-kernel/20230530080425.18612-1-alexghiti@xxxxxxxxxxxx/
[3]: https://lore.kernel.org/linux-riscv/tencent_7C3B580B47C1B17C16488EC1@xxxxxx/
[4]: https://lwn.net/Articles/931406/
[5]: https://github.com/riscv-software-src/opensbi/commit/8153b2622b08802cc542f30a1fcba407a5667ab9

Song Shuai (3):
Revert "RISC-V: mark hibernation as nonportable"
Revert "riscv: Check the virtual alignment before choosing a map size"
Revert "riscv: Use PUD/P4D/PGD pages for the linear mapping"

arch/riscv/Kconfig | 5 +---
arch/riscv/include/asm/page.h | 16 -------------
arch/riscv/mm/init.c | 43 +++++++----------------------------
arch/riscv/mm/physaddr.c | 16 -------------
drivers/of/fdt.c | 11 ++++-----
5 files changed, 14 insertions(+), 77 deletions(-)

--
2.20.1