Re: [PATCH v3 07/13] s390: add pte_free_defer() for pgtables sharing page

From: Claudio Imbrenda
Date: Wed Jul 19 2023 - 10:27:15 EST


On Tue, 11 Jul 2023 21:38:35 -0700 (PDT)
Hugh Dickins <hughd@xxxxxxxxxx> wrote:

[...]

> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
> +{
> + struct page *page;
> +
> + page = virt_to_page(pgtable);
> + SetPageActive(page);
> + page_table_free(mm, (unsigned long *)pgtable);
> + /*
> + * page_table_free() does not do the pgste gmap_unlink() which
> + * page_table_free_rcu() does: warn us if pgste ever reaches here.
> + */
> + WARN_ON_ONCE(mm_alloc_pgste(mm));

it seems I have overlooked something when we previously discussed
this...

mm_alloc_pgste() is true for all processes that have PGSTEs, not only
for processes that can run guests.

There are two ways to enable PGSTEs: an ELF header bit, and a sysctl
knob.

The ELF bit is only used by qemu, it enables PGSTE allocation only for
that single process. This is a strong indication that the process wants
to run guests.

The sysctl knob enables PGSTE allocation for every process in the system
from that moment on. In that case, the WARN_ON_ONCE would be triggered
when not necessary.

There is however another way to check if a process is actually
__using__ the PGSTEs, a.k.a. if the process is actually capable of
running guests.

Confusingly, the name of that function is mm_has_pgste(). This confused
me as well, which is why I didn't notice it when we discussed this
previously :)


in short: can you please use mm_has_pgste() instead of mm_alloc_pgste()
in the WARN_ON_ONCE ?

> +}
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> +
> /*
> * Base infrastructure required to generate basic asces, region, segment,
> * and page tables that do not make use of enhanced features like EDAT1.