[RFC PATCH] arm64: mm: Fix kernel page tables incorrectly deleted during memory removal

From: Wupeng Ma
Date: Mon Jul 17 2023 - 07:52:02 EST


From: Ma Wupeng <mawupeng1@xxxxxxxxxx>

During testing, we found that kernel page table entries may be
unexpectedly cleared when running with rodata=off. The root cause is that
the kernel mapping for hot-added memory can be created with PUD-sized
(1G) block mappings, while memory is offlined and removed at memory block
granularity (MIN_MEMORY_BLOCK_SIZE, 128M). For example, if 2G of memory is
hot-added and a single memory block is then offlined and removed, the
whole PUD entry covering that block is cleared. The call path is:

offline_and_remove_memory
  try_remove_memory
    arch_remove_memory
      __remove_pgd_mapping
        unmap_hotplug_range
          unmap_hotplug_p4d_range
            unmap_hotplug_pud_range
              if (pud_sect(pud))
                pud_clear(pudp);

PMD-level (2M) block mappings are not affected, because the memory block
size is a multiple of 2M.
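
The size mismatch can be checked with a little arithmetic. The userspace
sketch below is only an illustration, not kernel code: the helper names
covers_whole_blocks() and collateral_bytes() are hypothetical, the sizes
assume a 4K-page configuration, and the model simply assumes that every
block entry overlapping the removed range is cleared as a whole.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes, assuming 4K pages (not taken from kernel headers). */
#define SZ_2M   (UINT64_C(1) << 21)   /* PMD block mapping */
#define SZ_128M (UINT64_C(1) << 27)   /* memory block size from the text */
#define SZ_1G   (UINT64_C(1) << 30)   /* PUD block mapping */

/* Hypothetical helper: does [start, start + size) consist of whole blocks? */
int covers_whole_blocks(uint64_t start, uint64_t size, uint64_t block)
{
	assert(block && (block & (block - 1)) == 0); /* power of two only */
	return ((start | size) & (block - 1)) == 0;
}

/*
 * Hypothetical helper: bytes outside the removed range that lose their
 * mapping if every overlapping block entry is cleared as a whole.
 */
uint64_t collateral_bytes(uint64_t start, uint64_t size, uint64_t block)
{
	uint64_t lo, hi;

	assert(block && (block & (block - 1)) == 0); /* power of two only */
	lo = start & ~(block - 1);                       /* round down */
	hi = (start + size + block - 1) & ~(block - 1);  /* round up */
	return (hi - lo) - size;
}
```

With a made-up, 1G-aligned base address, removing one 128M block under a
1G mapping leaves 1G - 128M of still-present memory unmapped, while the
same removal under 2M mappings leaves nothing behind.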

Commit f0b13ee23241 ("arm64/sparsemem: reduce SECTION_SIZE_BITS") reduced
SECTION_SIZE_BITS on arm64, which makes it possible for the memory section
size to be smaller than the PUD size. Since only hot-added memory can be
removed on arm64, as implemented by commit bbd6ec605c0f ("arm64/mm: Enable
memory hot remove"), the problem can be fixed by not using PUD-sized block
mappings when memory is hot-added.
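
The effect of the new flag on the PUD block decision can be modelled in
userspace as below. This is a sketch under stated assumptions, not the
kernel implementation: may_use_pud_block() is a hypothetical name, BIT()
and the PUD constants are redefined locally for 4K pages, and
pud_sect_supported() is assumed to return true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)			(1U << (n))
#define NO_BLOCK_MAPPINGS	BIT(0)
#define NO_CONT_MAPPINGS	BIT(1)
#define NO_EXEC_MAPPINGS	BIT(2)
#define NO_PUD_MAPPINGS		BIT(3)	/* flag introduced by this patch */

/* Local stand-ins for the kernel constants, assuming 4K pages. */
#define PUD_SHIFT		30
#define PUD_SIZE		(UINT64_C(1) << PUD_SHIFT)
#define PUD_MASK		(~(PUD_SIZE - 1))

/* The new flag must not alias any of the existing ones. */
static_assert((NO_PUD_MAPPINGS & (NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS |
				  NO_EXEC_MAPPINGS)) == 0,
	      "NO_PUD_MAPPINGS overlaps an existing flag");

/*
 * Model of the condition in alloc_init_pud(): a PUD block mapping is
 * used only if the range is PUD-aligned and neither flag forbids it.
 */
bool may_use_pud_block(uint64_t addr, uint64_t next, uint64_t phys,
		       unsigned int flags)
{
	return ((addr | next | phys) & ~PUD_MASK) == 0 &&
	       (flags & (NO_BLOCK_MAPPINGS | NO_PUD_MAPPINGS)) == 0;
}
```

In this model, a PUD-aligned range passed with only NO_EXEC_MAPPINGS (the
flags arch_add_memory() used before the patch) gets a 1G block, while the
same range with NO_EXEC_MAPPINGS | NO_PUD_MAPPINGS does not. PMD block
mappings are not covered by the new flag.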

Fixes: f0b13ee23241 ("arm64/sparsemem: reduce SECTION_SIZE_BITS")
Signed-off-by: Ma Wupeng <mawupeng1@xxxxxxxxxx>
---
arch/arm64/mm/mmu.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 95d360805f8a..44c724ce4f70 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -44,6 +44,7 @@
#define NO_BLOCK_MAPPINGS BIT(0)
#define NO_CONT_MAPPINGS BIT(1)
#define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
+#define NO_PUD_MAPPINGS BIT(3)

int idmap_t0sz __ro_after_init;

@@ -344,7 +345,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
*/
if (pud_sect_supported() &&
((addr | next | phys) & ~PUD_MASK) == 0 &&
- (flags & NO_BLOCK_MAPPINGS) == 0) {
+ (flags & (NO_BLOCK_MAPPINGS | NO_PUD_MAPPINGS)) == 0) {
pud_set_huge(pudp, phys, prot);

/*
@@ -1305,7 +1306,7 @@ struct range arch_get_mappable_range(void)
int arch_add_memory(int nid, u64 start, u64 size,
struct mhp_params *params)
{
- int ret, flags = NO_EXEC_MAPPINGS;
+ int ret, flags = NO_EXEC_MAPPINGS | NO_PUD_MAPPINGS;

VM_BUG_ON(!mhp_range_allowed(start, size, true));

--
2.25.1