Re: [PATCH 0/2] don't use mapcount() to check large folio sharing

From: Ryan Roberts
Date: Wed Aug 02 2023 - 06:33:35 EST

On 28/07/2023 17:13, Yin Fengwei wrote:
> In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
> folio_mapcount() is used to check whether the folio is shared. But it's
> not correct as folio_mapcount() returns total mapcount of large folio.
> Use folio_estimated_sharers() here as the estimated number is enough.
> Yin Fengwei (2):
> madvise: don't use mapcount() against large folio for sharing check
> madvise: don't use mapcount() against large folio for sharing check
> mm/huge_memory.c | 2 +-
> mm/madvise.c | 6 +++---
> 2 files changed, 4 insertions(+), 4 deletions(-)
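
To illustrate the problem described above, here is a toy userspace model (not
kernel code). The toy_* names and the struct are hypothetical and exist only
for this sketch. It assumes one mapcount per subpage and ignores PMD (entire)
mappings. As I understand it, folio_estimated_sharers() looks only at the first
subpage, which is what the second helper models.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code: one mapcount slot per subpage. */
struct toy_folio {
	size_t nr_pages;
	const int *page_mapcount;	/* number of PTEs mapping each subpage */
};

/* Models a "total mapcount": the sum over every subpage of the folio. */
static int toy_total_mapcount(const struct toy_folio *f)
{
	int total = 0;

	assert(f->nr_pages > 0);
	for (size_t i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Models the estimate: only the first subpage's mapcount is consulted. */
static int toy_estimated_sharers(const struct toy_folio *f)
{
	assert(f->nr_pages > 0);
	return f->page_mapcount[0];
}
```

For a 4-page folio fully mapped by a single process, the total is 4 while the
estimate is 1, so a "mapcount != 1" test on the total would wrongly treat the
folio as shared.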

As a set of fixes, I agree this is definitely an improvement, so:

Reviewed-by: Ryan Roberts

But I have a couple of comments around further improvements:

Once we have the scheme that David is working on to provide precise exclusive
vs shared info, we will probably want to move to that. That scheme will need
access to the mm_struct of a process known to be mapping the folio, though. We
have that info at these call sites, but it's not passed to
folio_estimated_sharers(), so we can't just reimplement
folio_estimated_sharers() - we will need to rework these call sites again.

Given the aspiration for most of the memory to be large folios going forwards,
wouldn't it be better to avoid splitting the large folio where the large folio
is mapped entirely within the range of the madvise operation? Sorry if this has
already been discussed and decided against - I didn't follow the RFC too
closely. Or perhaps you plan to do this as a follow-up?
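
To make the suggestion concrete, a rough sketch of the containment check I have
in mind, written as a standalone userspace function. The name, the
TOY_PAGE_SIZE constant and the parameters are all hypothetical, and it assumes
the folio is mapped contiguously starting at folio_addr:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/*
 * Return true if a folio mapped contiguously at folio_addr lies entirely
 * inside the half-open madvise range [start, end).
 */
static bool toy_folio_within_range(unsigned long folio_addr, size_t nr_pages,
				   unsigned long start, unsigned long end)
{
	unsigned long folio_end = folio_addr + nr_pages * TOY_PAGE_SIZE;

	assert(nr_pages > 0 && start <= end);
	return folio_addr >= start && folio_end <= end;
}
```

When this returns true, the operation could in principle act on the whole folio
without splitting it; only a folio that straddles the range boundary would need
the split.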