Re: [PATCH v2 4/5] mm: FLEXIBLE_THP for improved performance

From: Ryan Roberts
Date: Fri Jul 07 2023 - 05:52:56 EST


On 07/07/2023 09:01, Huang, Ying wrote:
> Ryan Roberts <ryan.roberts@xxxxxxx> writes:
>
>> Introduce FLEXIBLE_THP feature, which allows anonymous memory to be
>> allocated in large folios of a specified order. All pages of the large
>> folio are pte-mapped during the same page fault, significantly reducing
>> the number of page faults. The number of per-page operations (e.g. ref
>> counting, rmap management, lru list management) is also significantly
>> reduced since those ops now become per-folio.
>
> I like the idea of sharing as much code as possible between large
> (anonymous) folios and THP. Finally, THP becomes just a special kind of
> large folio.
>
> Although we can use smaller page order for FLEXIBLE_THP, it's hard to
> avoid internal fragmentation completely. So, I think that finally we
> will need to provide a mechanism for the users to opt out, e.g.,
> something like "always madvise never" via
> /sys/kernel/mm/transparent_hugepage/enabled. I'm not sure whether it's
> a good idea to reuse the existing interface of THP.

I wouldn't want to tie this to the existing interface, simply because that
implies that we would want to follow the "always" and "madvise" advice too. That
means that on a thp=madvise system (which is certainly the case for Android and
other client systems) we would have to disable large anon folios for VMAs that
haven't explicitly opted in. That breaks the intention that this should be an
invisible performance boost. I think it's important to set the policy for use of
THP separately to use of large anon folios.

I could be persuaded on the merits of a new runtime enable/disable interface if
there is consensus.

>
> Best Regards,
> Huang, Ying