Re: [PATCH 05/12] powerpc: add pte_free_defer() for pgtables sharing page

From: Jason Gunthorpe
Date: Tue Jun 06 2023 - 14:24:25 EST


On Mon, Jun 05, 2023 at 08:40:01PM -0700, Hugh Dickins wrote:

> diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
> index 20652daa1d7e..e4f58c5fc2ac 100644
> --- a/arch/powerpc/mm/pgtable-frag.c
> +++ b/arch/powerpc/mm/pgtable-frag.c
> @@ -120,3 +120,54 @@ void pte_fragment_free(unsigned long *table, int kernel)
> __free_page(page);
> }
> }
> +
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#define PTE_FREE_DEFERRED 0x10000 /* beyond any PTE_FRAG_NR */
> +
> +static void pte_free_now(struct rcu_head *head)
> +{
> + struct page *page;
> + int refcount;
> +
> + page = container_of(head, struct page, rcu_head);
> + refcount = atomic_sub_return(PTE_FREE_DEFERRED - 1,
> + &page->pt_frag_refcount);
> + if (refcount < PTE_FREE_DEFERRED) {
> + pte_fragment_free((unsigned long *)page_address(page), 0);
> + return;
> + }

>From what I can tell power doesn't recycle the sub fragment into any
kind of free list. It just waits for the last fragment to be unused
and then frees the whole page.

So why not simply go into pte_fragment_free() and do the call_rcu directly:

BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0);
if (atomic_dec_and_test(&page->pt_frag_refcount)) {
if (!kernel)
pgtable_pte_page_dtor(page);
call_rcu(&page->rcu_head, free_page_rcu)

?

Jason