[RFC PATCH v2 0/4] fix large folio for madvise_cold_or_pageout()

From: Yin Fengwei
Date: Fri Jul 21 2023 - 05:41:56 EST


The current madvise_cold_or_pageout_pte_range() has two problems when
dealing with large folios:
- It uses folio_mapcount() on a large folio, which prevents the large
  folio from being picked up.
- It always tries to split a large folio into normal 4K pages.
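
To illustrate the first problem, here is a minimal userspace sketch. It
is not kernel code: struct model_folio and the model_* helpers are
hypothetical names, and it assumes a PTE-mapped folio whose total
mapcount is the sum of its per-subpage mapcounts, while the estimate
that this series switches to samples only the first subpage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a folio: one mapcount per subpage. */
struct model_folio {
	size_t nr_pages;		/* number of 4K subpages */
	const int *page_mapcount;	/* per-subpage mapping counts */
};

/* Models a total mapcount: the sum over all subpages (assumption). */
static int model_total_mapcount(const struct model_folio *f)
{
	int total = 0;

	assert(f->nr_pages > 0);
	for (size_t i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Models an estimate that samples only the first subpage (assumption). */
static int model_estimated_sharers(const struct model_folio *f)
{
	assert(f->nr_pages > 0);
	return f->page_mapcount[0];
}

/* "Not shared" test based on the total mapcount. */
static bool model_old_check_accepts(const struct model_folio *f)
{
	return model_total_mapcount(f) == 1;
}

/* "Not shared" test based on the first-subpage estimate. */
static bool model_new_check_accepts(const struct model_folio *f)
{
	return model_estimated_sharers(f) == 1;
}
```

In this model, a 4-page folio mapped once by a single process has a
total mapcount of 4, so the total-based check rejects it even though it
is not shared, while the estimate-based check accepts it.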

Try to address these two problems by:
- Using folio_estimated_sharers() with large folios, on the assumption
  that an estimate of whether the large folio is shared is good
  enough here.

- Not splitting the large folio if it is entirely within the range.
  Leave it to page reclaim, as page reclaim can support swapping out a
  large folio as a whole in the future.

- Only splitting the large folio if it crosses the boundaries of the
  range. If folio splitting fails, just skip the folio, as madvise
  allows some pages in the range to be ignored.
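
The split policy described above can be sketched as a small decision
function. This is an illustration only, not the code in the patches:
enum model_action and model_large_folio_action() are hypothetical
names, and the in-range test and the split result are passed in as
plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for one large folio; not kernel identifiers. */
enum model_action {
	MODEL_LEAVE_WHOLE,	/* keep the folio intact, hand it to reclaim */
	MODEL_SPLIT,		/* folio crosses the range and was split */
	MODEL_SKIP,		/* split failed, ignore this folio */
};

static enum model_action model_large_folio_action(unsigned int nr_pages,
						   bool in_range,
						   bool split_ok)
{
	/* The policy only concerns large folios (more than one page). */
	assert(nr_pages > 1);

	/* Fully inside the madvise range: never split. */
	if (in_range)
		return MODEL_LEAVE_WHOLE;

	/* Crosses a boundary: split, or skip the folio if that fails. */
	return split_ok ? MODEL_SPLIT : MODEL_SKIP;
}
```

Note that the split result is only consulted when the folio crosses the
range, which mirrors the intent of avoiding the split attempt entirely
for folios inside the range.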

Patch1 uses folio_estimated_sharers() to replace folio_mapcount().
Patch2 uses the API pmdp_clear_flush_young_notify() to clear the A bit
of the page table. It also notifies the mm subscribers about the A bit
clearing.
Patch3 introduces a helper function to check whether the folio crosses
the range boundary.
Patch4 avoids splitting the large folio if the folio is in the range.
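
As a rough sketch of the kind of boundary check Patch3 is about, the
following userspace function tests whether a folio's address span lies
entirely inside [start, end). It is an assumption-laden model, not the
helper added to mm/internal.h: model_folio_in_range() and
MODEL_PAGE_SIZE are hypothetical, the folio is described only by its
start address and page count, and address overflow is not handled.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed 4K base page size */

/*
 * Return true if the folio spanning
 * [folio_start, folio_start + nr_pages * MODEL_PAGE_SIZE)
 * lies entirely inside [start, end).
 */
static bool model_folio_in_range(unsigned long folio_start,
				 unsigned long nr_pages,
				 unsigned long start,
				 unsigned long end)
{
	unsigned long folio_end;

	assert(nr_pages > 0 && start <= end);

	folio_end = folio_start + nr_pages * MODEL_PAGE_SIZE;
	return folio_start >= start && folio_end <= end;
}
```

A 16-page folio starting at 0x10000 covers [0x10000, 0x20000), so it is
in range for start = 0x10000, end = 0x20000 but crosses the boundary if
either limit is moved inward.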

Changes from V1:
- Split patch1 out per Yu's suggestion
- Split patch2 out per Yu's suggestion
- Handle the cold case correctly (the cold operation was broken in the
  V1 patch)
- Rebase the patchset onto the latest mm-unstable

Testing done:
- mm selftests show no new regressions.

V1's link:
https://lore.kernel.org/linux-mm/20230713150558.200545-1-fengwei.yin@xxxxxxxxx/

Yin Fengwei (4):
  madvise: not use mapcount() against large folio for sharing check
  madvise: Use notify-able API to clear and flush page table entries
  mm: add functions folio_in_range() and folio_within_vma()
  madvise: avoid trying to split large folio always in cold_pageout

 mm/internal.h |  42 ++++++++++++++++
 mm/madvise.c  | 136 +++++++++++++++++++++++++++++++-------------------
 2 files changed, 127 insertions(+), 51 deletions(-)

--
2.39.2