Re: [RFC PATCH v2 3/9] mm, pgtable: Add ownership to PTE table

From: Nadav Amit
Date: Tue Sep 27 2022 - 13:30:50 EST


On Sep 27, 2022, at 9:29 AM, Chih-En Lin <shiyn.lin@xxxxxxxxx> wrote:

> Introduce the ownership to the PTE table. It uses the address of PMD
> index to track the ownership to identify which process can update
> their page table state from the COWed PTE table.
>
> Signed-off-by: Chih-En Lin <shiyn.lin@xxxxxxxxx>
> ---
> include/linux/mm.h | 1 +
> include/linux/mm_types.h | 5 ++++-
> include/linux/pgtable.h | 10 ++++++++++
> 3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 21f8b27bd9fd3..965523dcca3b8 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2289,6 +2289,7 @@ static inline bool pgtable_pte_page_ctor(struct page *page)
> return false;
> __SetPageTable(page);
> inc_lruvec_page_state(page, NR_PAGETABLE);
> + page->cow_pte_owner = NULL;
> return true;
> }
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index cf97f3884fda2..42798b59cec4e 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -152,7 +152,10 @@ struct page {
> struct list_head deferred_list;
> };
> struct { /* Page table pages */
> - unsigned long _pt_pad_1; /* compound_head */
> + union {
> + unsigned long _pt_pad_1; /* compound_head */
> + pmd_t *cow_pte_owner; /* cow pte: pmd */
> + };
> pgtable_t pmd_huge_pte; /* protected by page->ptl */
> unsigned long _pt_pad_2; /* mapping */
> union {
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index d03d01aefe989..9dca787a3f4dd 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -615,6 +615,16 @@ static inline int pte_unused(pte_t pte)
> }
> #endif
>
> +static inline void set_cow_pte_owner(pmd_t *pmd, pmd_t *owner)
> +{
> + smp_store_release(&pmd_page(*pmd)->cow_pte_owner, owner);
> +}
> +
> +static inline bool cow_pte_owner_is_same(pmd_t *pmd, pmd_t *owner)
> +{
> + return smp_load_acquire(&pmd_page(*pmd)->cow_pte_owner) == owner;
> +}

Hiding synchronization primitives in such manner, and especially without
proper comments, makes it hard to understand what the ordering is supposed
to achieve.