Re: [PATCH V7] mm/debug: Add tests validating architecture page table helpers

From: Anshuman Khandual
Date: Tue Oct 22 2019 - 07:18:45 EST



On 10/22/2019 12:41 PM, Christophe Leroy wrote:
>
>
> On 10/21/2019 02:42 AM, Anshuman Khandual wrote:
>> This adds tests which will validate architecture page table helpers and
>> other accessors in their compliance with expected generic MM semantics.
>> This will help various architectures in validating changes to existing
>> page table helpers or addition of new ones.
>>
>> This test covers basic page table entry transformations including but not
>> limited to old, young, dirty, clean, write, write protect etc at various
>> levels along with populating intermediate entries with next page table page
>> and validating them.
>>
>> Test page table pages are allocated from system memory with required size
>> and alignments. The mapped pfns at page table levels are derived from a
>> real pfn representing a valid kernel text symbol. This test gets called
>> right after page_alloc_init_late().
>>
>> This gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected along with
>> CONFIG_DEBUG_VM. Architectures willing to subscribe to this test also need to
>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and
>> arm64. Going forward, other architectures too can enable this after fixing
>> build or runtime problems (if any) with their page table helpers.
>>
>> Folks interested in making sure that a given platform's page table helpers
>> conform to expected generic MM semantics should enable the above config
>> which will just trigger this test during boot. Any non conformity here will
>> be reported as a warning which would need to be fixed. This test will help
>> catch any changes to the agreed upon semantics expected from generic MM and
>> enable platforms to accommodate it thereafter.
>>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
>> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>> Cc: Mark Brown <broonie@xxxxxxxxxx>
>> Cc: Steven Price <Steven.Price@xxxxxxx>
>> Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>> Cc: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> Cc: Sri Krishna chowdary <schowdary@xxxxxxxxxx>
>> Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
>> Cc: Russell King - ARM Linux <linux@xxxxxxxxxxxxxxx>
>> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
>> Cc: Paul Mackerras <paulus@xxxxxxxxx>
>> Cc: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
>> Cc: Heiko Carstens <heiko.carstens@xxxxxxxxxx>
>> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
>> Cc: Vineet Gupta <vgupta@xxxxxxxxxxxx>
>> Cc: James Hogan <jhogan@xxxxxxxxxx>
>> Cc: Paul Burton <paul.burton@xxxxxxxx>
>> Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
>> Cc: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx>
>> Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxx>
>> Cc: Christophe Leroy <christophe.leroy@xxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: linux-snps-arc@xxxxxxxxxxxxxxxxxxx
>> Cc: linux-mips@xxxxxxxxxxxxxxx
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Cc: linux-ia64@xxxxxxxxxxxxxxx
>> Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
>> Cc: linux-s390@xxxxxxxxxxxxxxx
>> Cc: linux-sh@xxxxxxxxxxxxxxx
>> Cc: sparclinux@xxxxxxxxxxxxxxx
>> Cc: x86@xxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>
>> Tested-by: Christophe Leroy <christophe.leroy@xxxxxx> #PPC32
>> Suggested-by: Catalin Marinas <catalin.marinas@xxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxx>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>> ---
>
> The cover letter has the exact same title as this patch. I think a cover letter is not necessary for a singleton series.

Right, but it only became a singleton series in this version :)

>
> The history (and any other information you don't want to include in the commit message) can be added here, below the '---'. That way it is in the mail but won't be included in the commit.
I was aware of that, but the change log here was big, hence I just chose to keep it
separately in a cover letter. As you said, the cover letter is probably not
required anymore. Will add the change log here in the patch next time around.

>
>>  .../debug/debug-vm-pgtable/arch-support.txt        |  34 ++
>>  arch/arm64/Kconfig                                 |   1 +
>>  arch/x86/Kconfig                                   |   1 +
>>  arch/x86/include/asm/pgtable_64.h                  |   6 +
>>  include/asm-generic/pgtable.h                      |   6 +
>>  init/main.c                                        |   1 +
>>  lib/Kconfig.debug                                  |  21 ++
>>  mm/Makefile                                        |   1 +
>>  mm/debug_vm_pgtable.c                              | 388 +++++++++++++++++++++
>>  9 files changed, 459 insertions(+)
>>  create mode 100644 Documentation/features/debug/debug-vm-pgtable/arch-support.txt
>>  create mode 100644 mm/debug_vm_pgtable.c
>>
>> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
>> new file mode 100644
>> index 0000000..d6b8185
>> --- /dev/null
>> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt
>> @@ -0,0 +1,34 @@
>> +#
>> +# Feature name:          debug-vm-pgtable
>> +#         Kconfig:       ARCH_HAS_DEBUG_VM_PGTABLE
>> +#         description:   arch supports pgtable tests for semantics compliance
>> +#
>> +    -----------------------
>> +    |         arch |status|
>> +    -----------------------
>> +    |       alpha: | TODO |
>> +    |         arc: | TODO |
>> +    |         arm: | TODO |
>> +    |       arm64: |  ok  |
>> +    |         c6x: | TODO |
>> +    |        csky: | TODO |
>> +    |       h8300: | TODO |
>> +    |     hexagon: | TODO |
>> +    |        ia64: | TODO |
>> +    |        m68k: | TODO |
>> +    |  microblaze: | TODO |
>> +    |        mips: | TODO |
>> +    |       nds32: | TODO |
>> +    |       nios2: | TODO |
>> +    |    openrisc: | TODO |
>> +    |      parisc: | TODO |
>> +    |     powerpc: | TODO |
>
> Say ok on ppc32

Will do.

>
>> +    |       riscv: | TODO |
>> +    |        s390: | TODO |
>> +    |          sh: | TODO |
>> +    |       sparc: | TODO |
>> +    |          um: | TODO |
>> +    |   unicore32: | TODO |
>> +    |         x86: |  ok  |
>> +    |      xtensa: | TODO |
>> +    -----------------------
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 1b6ea5a..ea62c87 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -11,6 +11,7 @@ config ARM64
>>  	select ACPI_PPTT if ACPI
>>  	select ARCH_CLOCKSOURCE_DATA
>>  	select ARCH_HAS_DEBUG_VIRTUAL
>> +	select ARCH_HAS_DEBUG_VM_PGTABLE
>>  	select ARCH_HAS_DEVMEM_IS_ALLOWED
>>  	select ARCH_HAS_DMA_COHERENT_TO_PFN
>>  	select ARCH_HAS_DMA_PREP_COHERENT
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index abe822d..13c9bd9 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -61,6 +61,7 @@ config X86
>>  	select ARCH_CLOCKSOURCE_INIT
>>  	select ARCH_HAS_ACPI_TABLE_UPGRADE	if ACPI
>>  	select ARCH_HAS_DEBUG_VIRTUAL
>> +	select ARCH_HAS_DEBUG_VM_PGTABLE
>>  	select ARCH_HAS_DEVMEM_IS_ALLOWED
>>  	select ARCH_HAS_ELF_RANDOMIZE
>>  	select ARCH_HAS_FAST_MULTIPLIER
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 3e56c9c2f16e..c50d7cfa566b 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -120,6 +120,7 @@ config PPC
>  	#
>  	select ARCH_32BIT_OFF_T if PPC32
>  	select ARCH_HAS_DEBUG_VIRTUAL
> +	select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32
>  	select ARCH_HAS_DEVMEM_IS_ALLOWED
>  	select ARCH_HAS_ELF_RANDOMIZE
>  	select ARCH_HAS_FORTIFY_SOURCE
>
>

Will add this.

>> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
>> index 0b6c4042..fb0e76d 100644
>> --- a/arch/x86/include/asm/pgtable_64.h
>> +++ b/arch/x86/include/asm/pgtable_64.h
>> @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { }
>>
>>  struct mm_struct;
>>
>> +#define mm_p4d_folded mm_p4d_folded
>> +static inline bool mm_p4d_folded(struct mm_struct *mm)
>> +{
>> +	return !pgtable_l5_enabled();
>> +}
>> +
>>  void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte);
>>  void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte);
>>
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index 9cdcbc7..9eb02e1 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -1168,6 +1168,12 @@ static inline bool arch_has_pfn_modify_check(void)
>>  # define PAGE_KERNEL_EXEC PAGE_KERNEL
>>  #endif
>>
>> +#ifdef CONFIG_DEBUG_VM_PGTABLE
>> +extern void debug_vm_pgtable(void);
>> +#else
>> +static inline void debug_vm_pgtable(void) { }
>> +#endif
>> +
>>  #endif /* !__ASSEMBLY__ */
>>
>>  #ifndef io_remap_pfn_range
>> diff --git a/init/main.c b/init/main.c
>> index 91f6ebb..af8379e 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -1185,6 +1185,7 @@ static noinline void __init kernel_init_freeable(void)
>>  	sched_init_smp();
>>
>>  	page_alloc_init_late();
>> +	debug_vm_pgtable();
>>  	/* Initialize page ext after all struct pages are initialized. */
>>  	page_ext_init();
>>
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 9c60d7d..cf48d95 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -690,6 +690,27 @@ config DEBUG_VM_PGFLAGS
>>  	  If unsure, say N.
>>
>> +config ARCH_HAS_DEBUG_VM_PGTABLE
>> +	bool
>> +	help
>> +	  An architecture should select this when it can successfully
>> +	  build and run DEBUG_VM_PGTABLE.
>> +
>> +config DEBUG_VM_PGTABLE
>> +	bool "Debug arch page table for semantics compliance"
>> +	depends on MMU
>> +	depends on DEBUG_VM
>> +	depends on ARCH_HAS_DEBUG_VM_PGTABLE
>> +	help
>> +	  This option provides a debug method which can be used to test
>> +	  architecture page table helper functions on various platforms in
>> +	  verifying if they comply with expected generic MM semantics. This
>> +	  will help architecture code in making sure that any changes or
>> +	  new additions of these helpers still conform to expected
>> +	  semantics of the generic MM.
>> +
>> +	  If unsure, say N.
>> +
>
> Would be nice to have that one also indented like the other DEBUG_VM_XXXXX (see below).
>
> Stack utilization instrumentation (DEBUG_STACK_USAGE) [N/y/?] n
> Debug VM (DEBUG_VM) [N/y/?] (NEW) y
>   Debug VMA caching (DEBUG_VM_VMACACHE) [N/y/?] (NEW)
>   Debug VM red-black trees (DEBUG_VM_RB) [N/y/?] (NEW)
>   Debug page-flags operations (DEBUG_VM_PGFLAGS) [N/y/?] (NEW)
> Debug arch page table for semantics compliance (DEBUG_VM_PGTABLE) [N/y/?] (NEW)
> Debug VM translations (DEBUG_VIRTUAL) [N/y/?] n
>
>
> For that, just move config ARCH_HAS_DEBUG_VM_PGTABLE somewhere else, maybe before DEBUG_VM or just after DEBUG_VM_PGTABLE

Initially I had ARCH_HAS_DEBUG_VM_PGTABLE after DEBUG_VM_PGTABLE but reversed
that because of its dependency. So I will probably move it before DEBUG_VM.

>
>
>>  config ARCH_HAS_DEBUG_VIRTUAL
>>  	bool
>>
>> diff --git a/mm/Makefile b/mm/Makefile
>> index d996846..2f085b9 100644
>> --- a/mm/Makefile
>> +++ b/mm/Makefile
>> @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
>>  obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
>>  obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
>>  obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o
>> +obj-$(CONFIG_DEBUG_VM_PGTABLE) += debug_vm_pgtable.o
>>  obj-$(CONFIG_PAGE_OWNER) += page_owner.o
>>  obj-$(CONFIG_CLEANCACHE) += cleancache.o
>>  obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> new file mode 100644
>> index 0000000..9472566
>> --- /dev/null
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -0,0 +1,388 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * This kernel test validates architecture page table helpers and
>> + * accessors and helps in verifying their continued compliance with
>> + * expected generic MM semantics.
>> + *
>> + * Copyright (C) 2019 ARM Ltd.
>> + *
>> + * Author: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>> + */
>> +#define pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__
>> +
>> +#include <linux/gfp.h>
>> +#include <linux/highmem.h>
>> +#include <linux/hugetlb.h>
>> +#include <linux/kernel.h>
>> +#include <linux/kconfig.h>
>> +#include <linux/mm.h>
>> +#include <linux/mman.h>
>> +#include <linux/mm_types.h>
>> +#include <linux/module.h>
>> +#include <linux/pfn_t.h>
>> +#include <linux/printk.h>
>> +#include <linux/random.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/swap.h>
>> +#include <linux/swapops.h>
>> +#include <linux/start_kernel.h>
>> +#include <linux/sched/mm.h>
>> +#include <asm/pgalloc.h>
>> +#include <asm/pgtable.h>
>> +
>> +/*
>> + * Basic operations
>> + *
>> + * mkold(entry)			= An old and not a young entry
>> + * mkyoung(entry)		= A young and not an old entry
>> + * mkdirty(entry)		= A dirty and not a clean entry
>> + * mkclean(entry)		= A clean and not a dirty entry
>> + * mkwrite(entry)		= A write and not a write protected entry
>> + * wrprotect(entry)		= A write protected and not a write entry
>> + * pxx_bad(entry)		= A mapped and non-table entry
>> + * pxx_same(entry1, entry2)	= Both entries hold the exact same value
>> + */
>> +#define VMFLAGS	(VM_READ|VM_WRITE|VM_EXEC)
>> +
>> +/*
>> + * On s390 platform, the lower 12 bits are used to identify given page table
>> + * entry type and for other arch specific requirements. But these bits might
>> + * affect the ability to clear entries with pxx_clear(). So while loading up
>> + * the entries skip all lower 12 bits in order to accommodate s390 platform.
>> + * It does not affect any other platform.
>> + */
>> +#define RANDOM_ORVALUE	(0xfffffffffffff000UL)
>> +#define RANDOM_NZVALUE	(0xff)
>> +
>> +static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>> +{
>> +	pte_t pte = pfn_pte(pfn, prot);
>> +
>> +	WARN_ON(!pte_same(pte, pte));
>> +	WARN_ON(!pte_young(pte_mkyoung(pte)));
>> +	WARN_ON(!pte_dirty(pte_mkdirty(pte)));
>> +	WARN_ON(!pte_write(pte_mkwrite(pte)));
>> +	WARN_ON(pte_young(pte_mkold(pte)));
>> +	WARN_ON(pte_dirty(pte_mkclean(pte)));
>> +	WARN_ON(pte_write(pte_wrprotect(pte)));
>> +}
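
For anyone who wants to play with these invariants outside the kernel, here is a
minimal user-space sketch of what pte_basic_tests() checks. It is not part of the
patch: the model_* names and the three-bit entry layout are hypothetical and do
not correspond to any real architecture's page table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software-only entry layout, for illustration. */
typedef struct { uint64_t val; } model_pte_t;

#define MODEL_YOUNG	(1ULL << 0)
#define MODEL_DIRTY	(1ULL << 1)
#define MODEL_WRITE	(1ULL << 2)

static model_pte_t model_set(model_pte_t p, uint64_t bit)   { p.val |= bit;  return p; }
static model_pte_t model_clear(model_pte_t p, uint64_t bit) { p.val &= ~bit; return p; }
static bool model_test(model_pte_t p, uint64_t bit)         { return (p.val & bit) != 0; }
static bool model_same(model_pte_t a, model_pte_t b)        { return a.val == b.val; }

/*
 * Mirrors the seven checks in pte_basic_tests(): the result of every
 * mkXXX() style transformation must be visible through its predicate.
 * Returns the number of violated invariants (0 means conforming).
 */
int model_pte_basic_checks(uint64_t raw)
{
	model_pte_t pte = { raw };
	int bad = 0;

	bad += !model_same(pte, pte);
	bad += !model_test(model_set(pte, MODEL_YOUNG), MODEL_YOUNG);	/* mkyoung */
	bad += !model_test(model_set(pte, MODEL_DIRTY), MODEL_DIRTY);	/* mkdirty */
	bad += !model_test(model_set(pte, MODEL_WRITE), MODEL_WRITE);	/* mkwrite */
	bad += model_test(model_clear(pte, MODEL_YOUNG), MODEL_YOUNG);	/* mkold */
	bad += model_test(model_clear(pte, MODEL_DIRTY), MODEL_DIRTY);	/* mkclean */
	bad += model_test(model_clear(pte, MODEL_WRITE), MODEL_WRITE);	/* wrprotect */
	return bad;
}
```

Calling model_pte_basic_checks() with any starting value should return 0 for this
model. A real architecture fails the kernel test when one of its helpers does not
round-trip through the matching predicate.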
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE
>> +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>> +{
>> +	pmd_t pmd = pfn_pmd(pfn, prot);
>> +
>> +	WARN_ON(!pmd_same(pmd, pmd));
>> +	WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
>> +	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
>> +	WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
>> +	WARN_ON(pmd_young(pmd_mkold(pmd)));
>> +	WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
>> +	WARN_ON(pmd_write(pmd_wrprotect(pmd)));
>> +	/*
>> +	 * A huge page does not point to next level page table
>> +	 * entry. Hence this must qualify as pmd_bad().
>> +	 */
>> +	WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
>> +}
>> +#else
>> +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
>> +#endif
>> +
>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
>> +{
>> +	pud_t pud = pfn_pud(pfn, prot);
>> +
>> +	WARN_ON(!pud_same(pud, pud));
>> +	WARN_ON(!pud_young(pud_mkyoung(pud)));
>> +	WARN_ON(!pud_write(pud_mkwrite(pud)));
>> +	WARN_ON(pud_write(pud_wrprotect(pud)));
>> +	WARN_ON(pud_young(pud_mkold(pud)));
>> +
>> +	if (mm_pmd_folded(mm) || __is_defined(ARCH_HAS_4LEVEL_HACK))
>> +		return;
>> +
>> +	/*
>> +	 * A huge page does not point to next level page table
>> +	 * entry. Hence this must qualify as pud_bad().
>> +	 */
>> +	WARN_ON(!pud_bad(pud_mkhuge(pud)));
>> +}
>> +#else
>> +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
>> +#endif
>> +
>> +static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot)
>> +{
>> +	p4d_t p4d;
>> +
>> +	memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
>> +	WARN_ON(!p4d_same(p4d, p4d));
>> +}
>> +
>> +static void __init pgd_basic_tests(unsigned long pfn, pgprot_t prot)
>> +{
>> +	pgd_t pgd;
>> +
>> +	memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
>> +	WARN_ON(!pgd_same(pgd, pgd));
>> +}
>> +
>> +#ifndef __ARCH_HAS_4LEVEL_HACK
>> +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp)
>> +{
>> +	pud_t pud = READ_ONCE(*pudp);
>> +
>> +	if (mm_pmd_folded(mm))
>> +		return;
>> +
>> +	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
>> +	WRITE_ONCE(*pudp, pud);
>> +	pud_clear(pudp);
>> +	pud = READ_ONCE(*pudp);
>> +	WARN_ON(!pud_none(pud));
>> +}
>> +
>> +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
>> +				      pmd_t *pmdp)
>> +{
>> +	pud_t pud;
>> +
>> +	if (mm_pmd_folded(mm))
>> +		return;
>> +	/*
>> +	 * This entry points to next level page table page.
>> +	 * Hence this must not qualify as pud_bad().
>> +	 */
>> +	pmd_clear(pmdp);
>> +	pud_clear(pudp);
>> +	pud_populate(mm, pudp, pmdp);
>> +	pud = READ_ONCE(*pudp);
>> +	WARN_ON(pud_bad(pud));
>> +}
>> +#else
>> +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) { }
>> +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
>> +				      pmd_t *pmdp)
>> +{
>> +}
>> +#endif
>> +
>> +#ifndef __ARCH_HAS_5LEVEL_HACK
>> +static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp)
>> +{
>> +	p4d_t p4d = READ_ONCE(*p4dp);
>> +
>> +	if (mm_pud_folded(mm))
>> +		return;
>> +
>> +	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
>> +	WRITE_ONCE(*p4dp, p4d);
>> +	p4d_clear(p4dp);
>> +	p4d = READ_ONCE(*p4dp);
>> +	WARN_ON(!p4d_none(p4d));
>> +}
>> +
>> +static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
>> +				      pud_t *pudp)
>> +{
>> +	p4d_t p4d;
>> +
>> +	if (mm_pud_folded(mm))
>> +		return;
>> +
>> +	/*
>> +	 * This entry points to next level page table page.
>> +	 * Hence this must not qualify as p4d_bad().
>> +	 */
>> +	pud_clear(pudp);
>> +	p4d_clear(p4dp);
>> +	p4d_populate(mm, p4dp, pudp);
>> +	p4d = READ_ONCE(*p4dp);
>> +	WARN_ON(p4d_bad(p4d));
>> +}
>> +
>> +static void __init pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
>> +{
>> +	pgd_t pgd = READ_ONCE(*pgdp);
>> +
>> +	if (mm_p4d_folded(mm))
>> +		return;
>> +
>> +	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
>> +	WRITE_ONCE(*pgdp, pgd);
>> +	pgd_clear(pgdp);
>> +	pgd = READ_ONCE(*pgdp);
>> +	WARN_ON(!pgd_none(pgd));
>> +}
>> +
>> +static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>> +				      p4d_t *p4dp)
>> +{
>> +	pgd_t pgd;
>> +
>> +	if (mm_p4d_folded(mm))
>> +		return;
>> +
>> +	/*
>> +	 * This entry points to next level page table page.
>> +	 * Hence this must not qualify as pgd_bad().
>> +	 */
>> +	p4d_clear(p4dp);
>> +	pgd_clear(pgdp);
>> +	pgd_populate(mm, pgdp, p4dp);
>> +	pgd = READ_ONCE(*pgdp);
>> +	WARN_ON(pgd_bad(pgd));
>> +}
>> +#else
>> +static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp) { }
>> +static void __init pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
>> +static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
>> +				      pud_t *pudp)
>> +{
>> +}
>> +static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>> +				      p4d_t *p4dp)
>> +{
>> +}
>> +#endif
>> +
>> +static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
>> +{
>> +	pte_t pte = READ_ONCE(*ptep);
>> +
>> +	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>> +	WRITE_ONCE(*ptep, pte);
>> +	pte_clear(mm, 0, ptep);
>> +	pte = READ_ONCE(*ptep);
>> +	WARN_ON(!pte_none(pte));
>> +}
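
The clear tests all follow the same pattern: load garbage into the entry, clear
it, and expect it to read back as none. A rough user-space sketch of that pattern
follows. It is not part of the patch; the model_* helpers are hypothetical, and
"none" is simply assumed to mean an all-zero entry, which need not hold on a real
architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit pattern as RANDOM_ORVALUE in the patch: everything but the low 12 bits. */
#define MODEL_RANDOM_ORVALUE	0xfffffffffffff000ULL
#define MODEL_LOW_BITS		0xfffULL

typedef struct { uint64_t val; } model_pte_t;

/* Hypothetical helpers: "none" is modelled as an all-zero entry. */
static void model_pte_clear(model_pte_t *ptep) { ptep->val = 0; }
static bool model_pte_none(model_pte_t pte)    { return pte.val == 0; }

/* Load garbage into the entry, clear it, report whether it reads as none. */
bool model_pte_clear_check(model_pte_t *ptep)
{
	ptep->val |= MODEL_RANDOM_ORVALUE;
	model_pte_clear(ptep);
	return model_pte_none(*ptep);
}

/* OR-ing in the garbage value must leave the lower 12 bits as they were. */
bool model_low_bits_untouched(uint64_t val)
{
	return ((val | MODEL_RANDOM_ORVALUE) & MODEL_LOW_BITS) ==
	       (val & MODEL_LOW_BITS);
}
```

The second helper only illustrates why the garbage value skips the lower 12 bits:
whatever an architecture such as s390 keeps there is left as it was before the
clear is attempted.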
>> +
>> +static void __init pmd_clear_tests(struct mm_struct *mm, pmd_t *pmdp)
>> +{
>> +	pmd_t pmd = READ_ONCE(*pmdp);
>> +
>> +	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
>> +	WRITE_ONCE(*pmdp, pmd);
>> +	pmd_clear(pmdp);
>> +	pmd = READ_ONCE(*pmdp);
>> +	WARN_ON(!pmd_none(pmd));
>> +}
>> +
>> +static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>> +				      pgtable_t pgtable)
>> +{
>> +	pmd_t pmd;
>> +
>> +	/*
>> +	 * This entry points to next level page table page.
>> +	 * Hence this must not qualify as pmd_bad().
>> +	 */
>> +	pmd_clear(pmdp);
>> +	pmd_populate(mm, pmdp, pgtable);
>> +	pmd = READ_ONCE(*pmdp);
>> +	WARN_ON(pmd_bad(pmd));
>> +}
>> +
>> +static unsigned long __init get_random_vaddr(void)
>> +{
>> +	unsigned long random_vaddr, random_pages, total_user_pages;
>> +
>> +	total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
>> +
>> +	random_pages = get_random_long() % total_user_pages;
>> +	random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
>> +
>> +	WARN_ON((random_vaddr > TASK_SIZE) ||
>> +		(random_vaddr < FIRST_USER_ADDRESS));
>> +	return random_vaddr;
>> +}
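
The address selection in get_random_vaddr() can be reproduced in user space as
below. This is only a sketch with assumed constants (4K pages, a first user
address of 0x1000 and a 47-bit TASK_SIZE); the real values are architecture
specific, and the model_* names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones are architecture specific. */
#define MODEL_PAGE_SIZE			4096ULL
#define MODEL_FIRST_USER_ADDRESS	0x1000ULL
#define MODEL_TASK_SIZE			(1ULL << 47)

/* 'rnd' stands in for get_random_long() so that results are reproducible. */
uint64_t model_random_vaddr(uint64_t rnd)
{
	uint64_t total_user_pages =
		(MODEL_TASK_SIZE - MODEL_FIRST_USER_ADDRESS) / MODEL_PAGE_SIZE;
	uint64_t random_pages = rnd % total_user_pages;

	return MODEL_FIRST_USER_ADDRESS + random_pages * MODEL_PAGE_SIZE;
}
```

Since the modulo result is always below total_user_pages, the returned address
stays inside the user range and keeps the alignment of the first user address, so
the WARN_ON() in the patch acts purely as a sanity check.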
>> +
>> +void __init debug_vm_pgtable(void)
>> +{
>> +	struct mm_struct *mm;
>> +	pgd_t *pgdp;
>> +	p4d_t *p4dp, *saved_p4dp;
>> +	pud_t *pudp, *saved_pudp;
>> +	pmd_t *pmdp, *saved_pmdp, pmd;
>> +	pte_t *ptep;
>> +	pgtable_t saved_ptep;
>> +	pgprot_t prot;
>> +	phys_addr_t paddr;
>> +	unsigned long vaddr, pte_aligned, pmd_aligned;
>> +	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
>
> I think an information message would be nice:
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 9472566b7e53..ed7cc3dfc968 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -313,6 +313,8 @@ void __init debug_vm_pgtable(void)
>  	unsigned long vaddr, pte_aligned, pmd_aligned;
>  	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
>
> +	pr_info("Validating architecture page table helpers\n");
> +
>  	prot = vm_get_page_prot(VMFLAGS);
>  	vaddr = get_random_vaddr();
>  	mm = mm_alloc();

Sure, will add. Thanks !
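
Note that with the pr_fmt() definition at the top of the file, the message would
carry both the test prefix and the function name. A small user-space sketch of
that expansion follows; it is not kernel code. model_pr_fmt(), model_pr_info()
and model_prefix_matches() are hypothetical stand-ins, and snprintf() replaces
the kernel log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the pr_fmt() definition in the patch. */
#define model_pr_fmt(fmt) "arch_pgtable_test: %s " fmt, __func__

/* Stand-in for pr_info(): formats into a buffer instead of the kernel log. */
#define model_pr_info(buf, size, msg) \
	snprintf((buf), (size), model_pr_fmt(msg))

/* Named after the patch's entry point so that __func__ expands the same way. */
static int debug_vm_pgtable(char *buf, size_t size)
{
	return model_pr_info(buf, size,
			     "Validating architecture page table helpers\n");
}

/* Checks that the formatted line carries the prefix and the function name. */
bool model_prefix_matches(void)
{
	char buf[128];

	debug_vm_pgtable(buf, sizeof(buf));
	return strcmp(buf, "arch_pgtable_test: debug_vm_pgtable "
			   "Validating architecture page table helpers\n") == 0;
}
```

So the boot log line should come out roughly as "arch_pgtable_test:
debug_vm_pgtable Validating architecture page table helpers".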

>
> Christophe
>
>> +
>> +	prot = vm_get_page_prot(VMFLAGS);
>> +	vaddr = get_random_vaddr();
>> +	mm = mm_alloc();
>> +	if (!mm) {
>> +		pr_err("mm_struct allocation failed\n");
>> +		return;
>> +	}
>> +
>> +	/*
>> +	 * PFN for mapping at PTE level is determined from a standard kernel
>> +	 * text symbol. But pfns for higher page table levels are derived by
>> +	 * masking lower bits of this real pfn. These derived pfns might not
>> +	 * exist on the platform but that does not really matter as pfn_pxx()
>> +	 * helpers will still create appropriate entries for the test. This
>> +	 * helps avoid large memory block allocations to be used for mapping
>> +	 * at higher page table levels.
>> +	 */
>> +	WARN_ON(!virt_addr_valid(&start_kernel));
>> +	paddr = __pa(&start_kernel);
>> +
>> +	pte_aligned = (paddr & PAGE_MASK) >> PAGE_SHIFT;
>> +	pmd_aligned = (paddr & PMD_MASK) >> PAGE_SHIFT;
>> +	pud_aligned = (paddr & PUD_MASK) >> PAGE_SHIFT;
>> +	p4d_aligned = (paddr & P4D_MASK) >> PAGE_SHIFT;
>> +	pgd_aligned = (paddr & PGDIR_MASK) >> PAGE_SHIFT;
>> +
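
To make the pfn derivation above concrete, here is a small user-space sketch of
the masking arithmetic. It is an illustration only: the shifts assume 4K pages
with 2M PMD and 1G PUD blocks, the sample address is made up, and
model_aligned_pfn() is a hypothetical helper rather than a kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4K pages, 2M PMD blocks, 1G PUD blocks. */
#define MODEL_PAGE_SHIFT	12
#define MODEL_PMD_SHIFT		21
#define MODEL_PUD_SHIFT		30

/*
 * Mask off everything below the given level's block size, then convert
 * the resulting physical address into a page frame number.
 */
uint64_t model_aligned_pfn(uint64_t paddr, unsigned int level_shift)
{
	uint64_t mask = ~((1ULL << level_shift) - 1);

	return (paddr & mask) >> MODEL_PAGE_SHIFT;
}
```

For a sample physical address of 0x40212345 this yields 0x40212 at PTE level,
0x40200 at PMD level and 0x40000 at PUD level, so each higher level simply
points at the start of the larger block that contains the real pfn.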
>> +	pgdp = pgd_offset(mm, vaddr);
>> +	p4dp = p4d_alloc(mm, pgdp, vaddr);
>> +	pudp = pud_alloc(mm, p4dp, vaddr);
>> +	pmdp = pmd_alloc(mm, pudp, vaddr);
>> +	ptep = pte_alloc_map(mm, pmdp, vaddr);
>> +
>> +	/*
>> +	 * Save all the page table page addresses as the page table
>> +	 * entries will be used for testing with random or garbage
>> +	 * values. These saved addresses will be used for freeing
>> +	 * page table pages.
>> +	 */
>> +	pmd = READ_ONCE(*pmdp);
>> +	saved_p4dp = p4d_offset(pgdp, 0UL);
>> +	saved_pudp = pud_offset(p4dp, 0UL);
>> +	saved_pmdp = pmd_offset(pudp, 0UL);
>> +	saved_ptep = pmd_pgtable(pmd);
>> +
>> +	pte_basic_tests(pte_aligned, prot);
>> +	pmd_basic_tests(pmd_aligned, prot);
>> +	pud_basic_tests(pud_aligned, prot);
>> +	p4d_basic_tests(p4d_aligned, prot);
>> +	pgd_basic_tests(pgd_aligned, prot);
>> +
>> +	pte_clear_tests(mm, ptep);
>> +	pmd_clear_tests(mm, pmdp);
>> +	pud_clear_tests(mm, pudp);
>> +	p4d_clear_tests(mm, p4dp);
>> +	pgd_clear_tests(mm, pgdp);
>> +
>> +	pte_unmap(ptep);
>> +
>> +	pmd_populate_tests(mm, pmdp, saved_ptep);
>> +	pud_populate_tests(mm, pudp, saved_pmdp);
>> +	p4d_populate_tests(mm, p4dp, saved_pudp);
>> +	pgd_populate_tests(mm, pgdp, saved_p4dp);
>> +
>> +	p4d_free(mm, saved_p4dp);
>> +	pud_free(mm, saved_pudp);
>> +	pmd_free(mm, saved_pmdp);
>> +	pte_free(mm, saved_ptep);
>> +
>> +	mm_dec_nr_puds(mm);
>> +	mm_dec_nr_pmds(mm);
>> +	mm_dec_nr_ptes(mm);
>> +	__mmdrop(mm);
>> +}
>>
>