Re: [RFC PATCH 4/4] x86/mm: write protect (most) page tables

From: Dave Hansen
Date: Mon Aug 23 2021 - 19:50:49 EST


On 8/23/21 6:25 AM, Mike Rapoport wrote:
> void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
> {
> + enable_pgtable_write(page_address(pte));
> pgtable_pte_page_dtor(pte);
> paravirt_release_pte(page_to_pfn(pte));
> paravirt_tlb_remove_table(tlb, pte);
> @@ -69,6 +73,7 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
> #ifdef CONFIG_X86_PAE
> tlb->need_flush_all = 1;
> #endif
> + enable_pgtable_write(pmd);
> pgtable_pmd_page_dtor(page);
> paravirt_tlb_remove_table(tlb, page);
> }

I would have expected this to leverage the pte_offset_map/unmap() code
to enable/disable write access. Granted, it would enable write access
even when only a read is needed, but that could be trivially fixed by
adding a variant like:

pte_offset_map_write()
pte_offset_unmap_write()

in addition to the existing (presumably read-only) versions:

pte_offset_map()
pte_offset_unmap()

Although those only work for the leaf levels, it seems a shame not to
use them.
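
To make the map/unmap idea concrete, here is a rough user-space analogy,
not kernel code. It assumes a "page table page" that is normally mapped
read-only and uses mprotect() as a stand-in for whatever
enable_pgtable_write() and its counterpart would do in the kernel. The
demo_* names, struct pt_page and the uint64_t pte_t typedef are all
hypothetical and exist only for this sketch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

typedef uint64_t pte_t;		/* stand-in for the kernel's pte_t */

struct pt_page {
	pte_t *ptes;		/* one page of entries, read-only by default */
	size_t nr;		/* number of entries in the page */
	size_t bytes;		/* size of the mapping in bytes */
	int writable;		/* tracks the current protection state */
};

/* Allocate one zero-filled page and leave it write protected. */
static int pt_page_init(struct pt_page *pt)
{
	long psz = sysconf(_SC_PAGESIZE);
	void *p;

	if (psz <= 0)
		return -1;
	p = mmap(NULL, (size_t)psz, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	pt->ptes = p;
	pt->bytes = (size_t)psz;
	pt->nr = pt->bytes / sizeof(pte_t);
	pt->writable = 0;
	return 0;
}

/* Read-only variant: never touches the protection. */
static const pte_t *demo_pte_offset_map(struct pt_page *pt, size_t idx)
{
	assert(idx < pt->nr);
	return &pt->ptes[idx];
}

/* Write variant: opens a write window before handing out the pointer. */
static pte_t *demo_pte_offset_map_write(struct pt_page *pt, size_t idx)
{
	assert(idx < pt->nr);
	if (mprotect(pt->ptes, pt->bytes, PROT_READ | PROT_WRITE))
		return NULL;
	pt->writable = 1;
	return &pt->ptes[idx];
}

/* Close the write window again. Returns 0 on success. */
static int demo_pte_offset_unmap_write(struct pt_page *pt)
{
	if (mprotect(pt->ptes, pt->bytes, PROT_READ))
		return -1;
	pt->writable = 0;
	return 0;
}
```

A caller that only inspects an entry would use demo_pte_offset_map()
and never open a write window. A caller that updates an entry would
bracket the store with demo_pte_offset_map_write() and
demo_pte_offset_unmap_write(), which keeps the writable period as short
as the existing map/unmap pairing.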

I'm also cringing a bit at hacking this into the page allocator. A
*lot* of what you're trying to do with getting large allocations out and
splitting them up is done very well today by the slab allocators. It
might take some rearrangement of 'struct page' metadata to be more slab
friendly, but it does seem like a close enough fit to warrant investigating.