Re: [PATCH v2] mm: remove redundant lru_add_drain() prior to unmapping pages

From: Jianfeng Wang
Date: Fri Dec 15 2023 - 03:48:48 EST


On 12/14/23 3:00 PM, Matthew Wilcox wrote:
> On Thu, Dec 14, 2023 at 02:27:17PM -0800, Jianfeng Wang wrote:
>> When unmapping VMA pages, pages will be gathered in batch and released by
>> tlb_finish_mmu() if CONFIG_MMU_GATHER_NO_GATHER is not set. The function
>> tlb_finish_mmu() is responsible for calling free_pages_and_swap_cache(),
>> which calls lru_add_drain() to drain cached pages in folio_batch before
>> releasing gathered pages. Thus, it is redundant to call lru_add_drain()
>> before gathering pages, if CONFIG_MMU_GATHER_NO_GATHER is not set.
>>
>> Remove lru_add_drain() prior to gathering and unmapping pages in
>> exit_mmap() and unmap_region() if CONFIG_MMU_GATHER_NO_GATHER is not set.
>>
>> Note that the page unmapping process in oom_killer (e.g., in
>> __oom_reap_task_mm()) also uses tlb_finish_mmu() and does not have
>> redundant lru_add_drain(). So, this commit makes the code more consistent.
>
> Shouldn't we put this in __tlb_gather_mmu() which already has the
> CONFIG_MMU_GATHER_NO_GATHER ifdefs? That would presumably help with, e.g.,
> zap_page_range_single() too.
>

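If I understand the suggestion correctly, it would look roughly like the
following in mm/mmu_gather.c (only a sketch; the exact placement inside
__tlb_gather_mmu() is just for illustration):

static void __tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
			     bool fullmm)
{
#ifdef CONFIG_MMU_GATHER_NO_GATHER
	/*
	 * No page gathering: tlb_finish_mmu() will not go through
	 * free_pages_and_swap_cache(), so drain the per-CPU folio
	 * batches here, for every caller, instead of open-coding it
	 * at each call site.
	 */
	lru_add_drain();
#endif
	/* ... existing mmu_gather initialisation ... */
}
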
After looking at the different users of tlb_gather_mmu(), however, I am not
convinced that moving lru_add_drain() into __tlb_gather_mmu() is the right
thing to do. There are two kinds of users: those that unmap and release
pages (e.g., the two call sites in mm/mmap.c), and those that only update
page table entries and flush the TLB without releasing any pages (e.g., the
mprotect_fixup() path). For the latter, there is no reason to call
lru_add_drain() either before or inside tlb_gather_mmu().
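
To make the second kind of user concrete: the mprotect path (heavily
abbreviated from mm/mprotect.c) uses the mmu_gather only to batch TLB
flushes and never queues pages for freeing:

	tlb_gather_mmu(&tlb, current->mm);
	/*
	 * ... walk the affected VMAs and call mprotect_fixup() on each;
	 * page table entries are rewritten, but no pages are freed ...
	 */
	tlb_finish_mmu(&tlb);

With the drain moved into __tlb_gather_mmu(), this kind of caller would
start draining on CONFIG_MMU_GATHER_NO_GATHER configurations without
getting anything in return.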

Of course, we could update tlb_gather_mmu()'s API to take this into
account: for example, a tlb_gather_mmu_for_release() for the first kind of
user, while tlb_gather_mmu() keeps covering the latter. I'd like to hear
your opinion on this.
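
For concreteness, a rough sketch of what I have in mind; note that
tlb_gather_mmu_for_release() is only a suggested name, not an existing
interface:

static inline void tlb_gather_mmu_for_release(struct mmu_gather *tlb,
					      struct mm_struct *mm)
{
#ifdef CONFIG_MMU_GATHER_NO_GATHER
	/*
	 * Without gathering, the release path will not drain the per-CPU
	 * folio batches, so do it up front.
	 */
	lru_add_drain();
#endif
	tlb_gather_mmu(tlb, mm);
}

Callers that gather pages for freeing (unmap_region(),
zap_page_range_single(), ...) would switch to the new helper, exit_mmap()
would need a matching _fullmm variant, and callers like the mprotect path
would keep using plain tlb_gather_mmu() and never drain.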
Thanks!

>> Signed-off-by: Jianfeng Wang <jianfeng.w.wang@xxxxxxxxxx>
>> ---
>> mm/mmap.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 1971bfffcc03..da0308eef435 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2330,7 +2330,10 @@ static void unmap_region(struct mm_struct *mm, struct ma_state *mas,
>> struct mmu_gather tlb;
>> unsigned long mt_start = mas->index;
>>
>> + /* Defer lru_add_drain() to tlb_finish_mmu() for the ifndef case. */
>> +#ifdef CONFIG_MMU_GATHER_NO_GATHER
>> lru_add_drain();
>> +#endif
>> tlb_gather_mmu(&tlb, mm);
>> update_hiwater_rss(mm);
>> unmap_vmas(&tlb, mas, vma, start, end, tree_end, mm_wr_locked);
>> @@ -3300,7 +3303,10 @@ void exit_mmap(struct mm_struct *mm)
>> return;
>> }
>>
>> + /* Defer lru_add_drain() to tlb_finish_mmu() for the ifndef case. */
>> +#ifdef CONFIG_MMU_GATHER_NO_GATHER
>> lru_add_drain();
>> +#endif
>> flush_cache_mm(mm);
>> tlb_gather_mmu_fullmm(&tlb, mm);
>> /* update_hiwater_rss(mm) here? but nobody should be looking */
>> --
>> 2.42.1
>>
>>