Re: [PATCH v2 00/14] Transparent Contiguous PTEs for User Mappings

From: Barry Song
Date: Sun Nov 26 2023 - 22:18:41 EST


> Ryan Roberts (14):
> mm: Batch-copy PTE ranges during fork()
> arm64/mm: set_pte(): New layer to manage contig bit
> arm64/mm: set_ptes()/set_pte_at(): New layer to manage contig bit
> arm64/mm: pte_clear(): New layer to manage contig bit
> arm64/mm: ptep_get_and_clear(): New layer to manage contig bit
> arm64/mm: ptep_test_and_clear_young(): New layer to manage contig bit
> arm64/mm: ptep_clear_flush_young(): New layer to manage contig bit
> arm64/mm: ptep_set_wrprotect(): New layer to manage contig bit
> arm64/mm: ptep_set_access_flags(): New layer to manage contig bit
> arm64/mm: ptep_get(): New layer to manage contig bit
> arm64/mm: Split __flush_tlb_range() to elide trailing DSB
> arm64/mm: Wire up PTE_CONT for user mappings
> arm64/mm: Implement ptep_set_wrprotects() to optimize fork()
> arm64/mm: Add ptep_get_and_clear_full() to optimize process teardown

Hi Ryan,
Not quite sure if I missed something, are we splitting/unfolding CONTPTES
in the below cases

1. madvise(MADV_DONTNEED) on a part of basepages on a CONTPTE large folio

2. vma split in a large folio due to various reasons such as mprotect,
munmap, mlock etc.

3. try_to_unmap_one() to reclaim a folio, ptes are scanned one by one
rather than being as a whole.

In hardware, we need to make sure CONTPTE follow the rule - always 16
contiguous physical address with CONTPTE set. if one of them run away
from the 16 ptes group and PTEs become unconsistent, some terrible
errors/faults can happen in HW. for example

case0:
addr0 PTE - has no CONTPE
addr0+4kb PTE - has CONTPTE
....
addr0+60kb PTE - has CONTPTE

case 1:
addr0 PTE - has no CONTPE
addr0+4kb PTE - has CONTPTE
....
addr0+60kb PTE - has swap

Unconsistent 16 PTEs will lead to crash even in the firmware based on
our observation.

Thanks
Barry