[PATCHv9 07/17] x86/mm: Return correct level from lookup_address() if pte is none

From: Kirill A. Shutemov
Date: Mon Mar 25 2024 - 10:29:44 EST


Currently, lookup_address() returns two things:
1. A "pte_t" (which might be a p[g4um]d_t)
2. The 'level' of the page tables where the "pte_t" was found
(returned via a pointer)

If no pte_t is found, 'level' is essentially garbage.

Always fill out the level. For NULL "pte_t"s, fill in the level where
the p*d_none() entry was found mirroring the "found" behavior.

Always filling out the level allows using lookup_address() to precisely
skip over holes when walking kernel page tables.

Add one more entry into enum pg_level to indicate the size of the VA
covered by one PGD entry in 5-level paging mode.

Update comments for lookup_address() and lookup_address_in_pgd() to
reflect changes in the interface.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
Reviewed-by: Baoquan He <bhe@xxxxxxxxxx>
Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxx>
---
arch/x86/include/asm/pgtable_types.h | 1 +
arch/x86/mm/pat/set_memory.c | 16 ++++++++--------
2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 0b748ee16b3d..3f648ffdfbe5 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -548,6 +548,7 @@ enum pg_level {
PG_LEVEL_2M,
PG_LEVEL_1G,
PG_LEVEL_512G,
+ PG_LEVEL_256T,
PG_LEVEL_NUM
};

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index e5b454036bf3..6c49f69c0368 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -657,7 +657,8 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star

/*
* Lookup the page table entry for a virtual address in a specific pgd.
- * Return a pointer to the entry and the level of the mapping.
+ * Return a pointer to the entry (or NULL if the entry does not exist) and
+ * the level of the entry.
*/
pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
unsigned int *level)
@@ -666,32 +667,32 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
pud_t *pud;
pmd_t *pmd;

- *level = PG_LEVEL_NONE;
+ *level = PG_LEVEL_256T;

if (pgd_none(*pgd))
return NULL;

+ *level = PG_LEVEL_512G;
p4d = p4d_offset(pgd, address);
if (p4d_none(*p4d))
return NULL;

- *level = PG_LEVEL_512G;
if (p4d_leaf(*p4d) || !p4d_present(*p4d))
return (pte_t *)p4d;

+ *level = PG_LEVEL_1G;
pud = pud_offset(p4d, address);
if (pud_none(*pud))
return NULL;

- *level = PG_LEVEL_1G;
if (pud_leaf(*pud) || !pud_present(*pud))
return (pte_t *)pud;

+ *level = PG_LEVEL_2M;
pmd = pmd_offset(pud, address);
if (pmd_none(*pmd))
return NULL;

- *level = PG_LEVEL_2M;
if (pmd_leaf(*pmd) || !pmd_present(*pmd))
return (pte_t *)pmd;

@@ -704,9 +705,8 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
* Lookup the page table entry for a virtual address. Return a pointer
* to the entry and the level of the mapping.
*
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
+ * Note: the function returns p4d, pud or pmd either when the entry is marked
+ * large or when the present bit is not set. Otherwise it returns NULL.
*/
pte_t *lookup_address(unsigned long address, unsigned int *level)
{
--
2.43.0