Re: [PATCH 2/2] fork: group allocation of per-cpu counters for mm struct

From: Matthew Wilcox
Date: Mon Aug 21 2023 - 17:20:40 EST


On Mon, Aug 21, 2023 at 10:28:29PM +0200, Mateusz Guzik wrote:
> Even with the patch these allocations remain a significant problem,
> but the primary bottleneck shifts to:
>
> __pv_queued_spin_lock_slowpath+1
> _raw_spin_lock_irqsave+57
> folio_lruvec_lock_irqsave+91
> release_pages+590
> tlb_batch_pages_flush+61
> tlb_finish_mmu+101
> exit_mmap+327
> __mmput+61
> begin_new_exec+1245
> load_elf_binary+712
> bprm_execve+644
> do_execveat_common.isra.0+429
> __x64_sys_execve+50
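
For anyone following along: what that trace shows is every exiting task
draining its TLB page batch through folio_lruvec_lock_irqsave(), i.e.
one irq-disabled spinlock shared by all those CPUs. Here's a toy
userspace model of that shape (the names exit_path/lruvec and the batch
sizes are invented for illustration; this is not the kernel code):

/*
 * Toy model of the pattern in the trace above: many CPUs freeing
 * their page batches through a single spinlock.
 */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS	8
#define BATCH		512	/* pages per batch drain */
#define BATCHES		1000

static struct {
	pthread_spinlock_t lock;	/* stands in for lruvec->lru_lock */
	long nr_on_lru;
} lruvec;

/* One exiting task draining its page batches, as in exit_mmap(). */
static void *exit_path(void *arg)
{
	(void)arg;
	for (int b = 0; b < BATCHES; b++) {
		/* All threads serialize here, hence the
		 * __pv_queued_spin_lock_slowpath hit in the profile. */
		pthread_spin_lock(&lruvec.lock);
		for (int i = 0; i < BATCH; i++)
			lruvec.nr_on_lru--;	/* per-folio LRU unlink */
		pthread_spin_unlock(&lruvec.lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t t[NTHREADS];

	pthread_spin_init(&lruvec.lock, PTHREAD_PROCESS_PRIVATE);
	lruvec.nr_on_lru = (long)NTHREADS * BATCHES * BATCH;
	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&t[i], NULL, exit_path, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(t[i], NULL);
	printf("pages left on LRU: %ld\n", lruvec.nr_on_lru);	/* 0 */
	return 0;
}

Run enough threads through that and the lock slowpath dominates,
exactly as in the profile above.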

Looking at this more closely, I don't think the patches I sent are going
to help much. I'd say the primary problem you have is that you're trying
to free _a lot_ of pages at once on all CPUs. Since it's the exit_mmap()
path, these are going to be the anonymous pages allocated to this task
(not the file pages it has mmapped). The large anonymous folios work may
help you out here by decreasing the number of folios we have to manage,
and thus how long the LRU lock has to be held. It's not
an immediate solution, but I think it'll do the job once it lands.
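
To put a rough number on it (the 1 GiB figure below is made up for the
example): the LRU unlink is per-folio, so freeing the same anonymous
RSS as PMD-sized folios cuts the work done under the lruvec lock by a
factor of ~512:

/* Illustrative arithmetic only; folio sizes are the usual x86-64 ones. */
#include <stdio.h>

int main(void)
{
	const unsigned long rss = 1UL << 30;		/* 1 GiB anon RSS */
	const unsigned long order0 = 4096;		/* 4 KiB folio */
	const unsigned long pmd_folio = 2UL << 20;	/* 2 MiB folio */

	printf("order-0 unlinks under lru_lock: %lu\n",
	       rss / order0);				/* 262144 */
	printf("PMD-size unlinks under lru_lock: %lu\n",
	       rss / pmd_folio);			/* 512 */
	return 0;
}

Fewer, bigger folios translate directly into shorter lock hold times.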