Re: [PATCH V2] mm: madvise: fix uneven accounting of psi

From: Charan Teja Kalla
Date: Wed Jun 28 2023 - 06:49:29 EST


Hi Pavan,

On 6/27/2023 7:26 PM, Pavan Kondeti wrote:
>> A folio turns into a Workingset during:
>> 1) shrink_active_list() placing the folio from active to inactive list.
>> 2) When a workingset transition is happening during the folio refault.
>>
>> And when Workingset is set on a folio, PSI for memory can be accounted
>> during a) That folio is being reclaimed and b) Refault of that folio.
>>
> Please help me understand why PSI for memory (I understood it as the
> time spent in psi_memstall_enter() to psi_memstall_leave()) would be
> accounted in (a) i.e during reclaim. I understand that when a working
>
> The (b) part is very clear.
>
I meant to say, for usual reclaim, PSI is accounted on a folio for both
reclaim and as well during the refault operation when Workingset is set
on a folio i.e., both a) and b) cases above.

>> This accounting of PSI for memory is not consistent in the cases where
>> clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
>> folio:

Seems I need to be explicit here. How about the below?

This accounting of PSI for memory is not consistent for reclaim +
refault operation between usual reclaim and madvise(COLD/PAGEOUT) which
deactivate or proactively reclaim a folio:

lmk for any better rephrasing?
>> a) A folio started at inactive and moved to active as part of accesses.
>> Workingset is absent on the folio thus madvise(MADV_PAGEOUT) don't
>> account such folios for PSI.
>>
>> b) When the same folio transition from inactive->active and then to
>> inactive through shrink_active_list(). Workingset is set on the folio
>> thus madvise(MADV_PAGEOUT) account such folios for PSI.
>>
>> c) When the same folio is part of active list directly as a result of
>> folio refault and this was a workingset folio prior to eviction.
>> Workingset is set on the folio thus both the operations of MADV_PAGEOUT
>> and reclaim of the MADV_COLD operated folio account for PSI.
>>
>> d) madvise(MADV_COLD) transfers the folio from active list to inactive
>> list. Such folios may not have the Workingset thus reclaim operation
>> on such folio doesn't account for PSI.
> This is not limited to madvise(PAGEOUT) right, anywhere an active page
> is reclaimed we have the same problem. For ex: damon_pa_pageout() and
> __alloc_contig_migrate_range()->reclaim_clean_pages_from_list().
>> If that is the case, can we set mark a folio as a workingset when it is
> activated? That way, we don't have make madvise() as a special case?
I think marking the folio as a workingset when it sits on the active is
not a correct thing. For the same example you mentioned, a simple CMA
allocation will be dropping the clean pages instead of migration. PSI
accounting on refault of those pages don't reveal anything to the user.

Where as in the madvise() cases, this PSI tells the user about the type
of pages that he is working on.[1]

BTW, damon_pa_pageout() seems a valid case above. let me fix it in the
next patch.

[1]https://lore.kernel.org/all/20230605180013.GD221380@xxxxxxxxxxx/