Re: [PATCH 2/2] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE

From: Yang Shi
Date: Wed Jan 31 2024 - 13:47:08 EST


On Tue, Jan 30, 2024 at 11:53 PM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>
> * Yang Shi:
>
> > From: Yang Shi <yang@xxxxxxxxxxxxxxxxxxxxxx>
> >
> > The commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> > boundaries") incurred a regression for the stress-ng pthread benchmark [1].
> > This is because THPs are now much more likely to be allocated for pthread
> > stack areas than before. A pthread stack is allocated by mmap without the
> > VM_GROWSDOWN or VM_GROWSUP flag, so the kernel can't tell whether it is a
> > stack area or not.
> >
> > The MAP_STACK flag is used to mark the stack area, but it is a no-op on
> > Linux. Map MAP_STACK to VM_NOHUGEPAGE to prevent THP from being allocated
> > for such stack areas.
>
> Doesn't this introduce a regression in the other direction, where
> workloads expect to use a hugepage TLB entry for the stack?

Maybe; it is theoretically possible. But AFAICT, real-life workloads
usually perform worse when THP is used for the stack.
Willy has an example:

https://lore.kernel.org/linux-mm/ZYPDwCcAjX+r+g6s@xxxxxxxxxxxxxxxxxxxx/#t

And avoiding THP on the stack is not new: it is already done for
VM_GROWSDOWN | VM_GROWSUP areas. This patch just extends that to MAP_STACK.
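
For illustration only, here is a rough userspace sketch of the kind of
mapping in question: a thread stack allocated by mmap with MAP_STACK, then
explicitly opted out of THP with madvise(MADV_NOHUGEPAGE), which is roughly
the effect the patch would give such mappings automatically. The helper
names alloc_stack_nohuge() and free_stack() are hypothetical and are not
part of the patch or of any library.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the patch): allocate a thread stack the
 * way a threading library might, and opt it out of THP from userspace.
 */
static void *alloc_stack_nohuge(size_t size)
{
	/* No MAP_GROWSDOWN here, so the kernel does not see a "stack" VMA. */
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/*
	 * Ask the kernel not to back this range with huge pages. The call
	 * may fail on kernels built without THP support; that failure is
	 * ignored because there is nothing to disable in that case.
	 */
	(void)madvise(p, size, MADV_NOHUGEPAGE);
	return p;
}

/* Release a stack obtained from alloc_stack_nohuge(). */
static void free_stack(void *p, size_t size)
{
	if (p)
		munmap(p, size);
}
```

A caller would pass the returned region to something like
pthread_attr_setstack() and release it with free_stack() once the thread
has exited.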

>
> It seems an odd approach to fixing the stress-ng regression. Isn't it
> very much coding to the benchmark?
>
> Thanks,
> Florian
>