Re: [PATCH v2 4/9] mm/swap: always account swapped in page into current memcg

From: Huang, Ying
Date: Mon Jan 08 2024 - 02:46:34 EST


Kairui Song <ryncsn@xxxxxxxxx> writes:

> Huang, Ying <ying.huang@xxxxxxxxx> 于2024年1月5日周五 15:16写道:
>>
>> Kairui Song <ryncsn@xxxxxxxxx> writes:
>>
>> > From: Kairui Song <kasong@xxxxxxxxxxx>
>> >
>> > Currently, mem_cgroup_swapin_charge_folio is always called with
>> > mm argument as NULL, except in swapin_direct.
>> >
>> > swapin_direct is used when swapin should skip readahead and
>> > swapcache (SWP_SYNCHRONOUS_IO). Other caller paths of
>> > mem_cgroup_swapin_charge_folio are for swapin that should
>> > not skip readahead and cache.
>> >
>> > This could cause swapin charging to behave differently depending
>> > on swap device. This currently didn't happen because the only call
>> > path of swapin_direct is the direct anon page fault path, where mm
>> > equals to current->mm, but will no longer be true if swapin_direct
>> > is shared and have other callers (eg, swapoff).
>> >
>> > So make swapin_direct also passes NULL for mm, no feature change.
>> >
>> > Signed-off-by: Kairui Song <kasong@xxxxxxxxxxx>
>> > ---
>> > mm/swap_state.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/mm/swap_state.c b/mm/swap_state.c
>> > index 6130de8d5226..d39c5369da21 100644
>> > --- a/mm/swap_state.c
>> > +++ b/mm/swap_state.c
>> > @@ -881,7 +881,7 @@ struct folio *swapin_direct(swp_entry_t entry, gfp_t gfp_mask,
>> > folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
>> > vma, vmf->address, false);
>> > if (folio) {
>> > - if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
>> > + if (mem_cgroup_swapin_charge_folio(folio, NULL,
>> > GFP_KERNEL, entry)) {
>> > folio_put(folio);
>> > return NULL;
>>
>> I think that why not provide "mm" when it's available? For
>> swapin_direct() called by do_swap_page(), mm can be provided. While,
>> for swapin_direct() called by shmem swapin, mm will be NULL. We can
>> even provide "mm" for __read_swap_cache_async() for VMA based swapin and
>> for the fault address for cluster swapin.
>
> Hi, Ying.
>
> Thanks for the comment.
>
> Without modifying too much code, providing mm here will change swapin
> charge behaviour on swapoff, we discussed it previously:
> https://lkml.org/lkml/2023/11/19/320

It's better to use "lore" for kernel email link, for example,

https://lore.kernel.org/lkml/20231119194740.94101-24-ryncsn@xxxxxxxxx/

This is more readable than the above link.

> If we want to avoid the behavior change, we have to extend all direct
> and indirect callers of mem_cgroup_swapin_charge_folio to accept a
> standalone mm argument (including swapin_direct, swap_vma_readahead,
> swap_cluster_readahead, __read_swap_cache_async, swapin_entry_mpol,
> read_swap_cache_async, and some other path may need more audit), and
> sort things our case by case. I'm not sure if that's a good idea...
>
> Simply dropping it here seems the easiest way to avoid such change.

OK. Didn't realize that they are related directly. It appears that we
always pass NULL as mm to mem_cgroup_swapin_charge_folio(). If so,
please just remove the parameter.

--
Best Regards,
Huang, Ying