[PATCH V2] mm: madvise: fix uneven accounting of psi

From: Charan Teja Kalla
Date: Tue Jun 27 2023 - 06:33:52 EST


A folio is marked as Workingset when:
1) shrink_active_list() moves the folio from the active to the inactive
list.
2) A workingset transition happens during refault of the folio.

Once Workingset is set on a folio, PSI for memory can be accounted
a) when that folio is being reclaimed and b) on refault of that folio.

This accounting of PSI for memory is not consistent in the cases where
clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
folio:
a) A folio starts on the inactive list and is moved to the active list
as a result of accesses. Workingset is absent on the folio, thus
madvise(MADV_PAGEOUT) doesn't account such folios for PSI.

b) When the same folio transitions from inactive->active and then back
to inactive through shrink_active_list(), Workingset is set on the
folio, thus madvise(MADV_PAGEOUT) accounts such folios for PSI.

c) When the same folio is placed on the active list directly as a result
of folio refault and it was a workingset folio prior to eviction,
Workingset is set on the folio, thus both MADV_PAGEOUT and the reclaim
of a folio on which MADV_COLD was performed account for PSI.

d) madvise(MADV_COLD) moves the folio from the active list to the
inactive list. Such folios may not have Workingset set, thus reclaim of
such a folio doesn't account for PSI.

As said above, MADV_PAGEOUT on a folio accounts for memory PSI in b) and
c) but not in a). Reclaim of a folio on which MADV_COLD was performed
accounts for memory PSI in c) but not in d), which is inconsistent
behaviour. Make this PSI accounting always consistent by turning a folio
into a workingset one whenever it is leaving the active list. Also,
accounting PSI on a folio whenever it leaves the active list as part of
the MADV_COLD/PAGEOUT operation helps users determine whether they are
operating on the proper folios[1].

[1] https://lore.kernel.org/all/20230605180013.GD221380@xxxxxxxxxxx/

Suggested-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Reported-by: Sai Manobhiram Manapragada <quic_smanapra@xxxxxxxxxxx>
Reported-by: Pavan Kondeti <quic_pkondeti@xxxxxxxxxxx>
Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>
---
V2: Made changes as per the comments from Johannes/Suren.

V1: https://lore.kernel.org/all/1685531374-6091-1-git-send-email-quic_charante@xxxxxxxxxxx/
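
For illustration only, the four cases a) to d) from the changelog can be
modeled with a small userspace sketch. This is not kernel code: struct
toy_folio and all toy_* helpers are hypothetical names invented here, and
the model assumes, as the changelog states, that reclaim accounts memory
PSI only when Workingset is set. The 'patched' argument selects whether
the madvise path sets Workingset, mirroring the folio_set_workingset()
call added by the diff below.

```c
#include <stdbool.h>

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	bool active;		/* folio sits on the active LRU list */
	bool workingset;	/* models the Workingset folio flag */
};

/* Models shrink_active_list(): deactivation marks the folio Workingset. */
static void toy_shrink_active(struct toy_folio *f)
{
	f->active = false;
	f->workingset = true;
}

/*
 * Models the madvise(MADV_COLD/MADV_PAGEOUT) path. With 'patched' set,
 * Workingset is set unconditionally, as in this version of the patch.
 */
static void toy_madvise_deactivate(struct toy_folio *f, bool patched)
{
	if (patched)
		f->workingset = true;
	f->active = false;
}

/* Assumption from the changelog: reclaim accounts PSI iff Workingset. */
static bool toy_reclaim_accounts_psi(const struct toy_folio *f)
{
	return f->workingset;
}
```

In this model, a folio from case a) or d) (active, Workingset clear) is
accounted on reclaim only when 'patched' is true, while folios from cases
b) and c) are accounted either way, which is the inconsistency the patch
removes.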

mm/madvise.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/madvise.c b/mm/madvise.c
index d9e7b42..76fb31f 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -413,6 +413,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,

folio_clear_referenced(folio);
folio_test_clear_young(folio);
+ folio_set_workingset(folio);
if (pageout) {
if (folio_isolate_lru(folio)) {
if (folio_test_unevictable(folio))
@@ -512,6 +513,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
*/
folio_clear_referenced(folio);
folio_test_clear_young(folio);
+ folio_set_workingset(folio);
if (pageout) {
if (folio_isolate_lru(folio)) {
if (folio_test_unevictable(folio))
--
2.7.4