Re: [RESEND PATCH v7 00/10] Small-sized THP for anonymous memory

From: Alistair Popple
Date: Mon Nov 27 2023 - 03:23:23 EST



David Hildenbrand <david@xxxxxxxxxx> writes:

> On 24.11.23 16:53, Matthew Wilcox wrote:
>> On Fri, Nov 24, 2023 at 04:25:38PM +0100, David Hildenbrand wrote:
>>> On 24.11.23 16:13, Matthew Wilcox wrote:
>>>> On Fri, Nov 24, 2023 at 09:56:37AM +0000, Ryan Roberts wrote:
>>>>> On 23/11/2023 15:59, Matthew Wilcox wrote:
>>>>>> On Wed, Nov 22, 2023 at 04:29:40PM +0000, Ryan Roberts wrote:
>>>>>>> This is v7 of a series to implement small-sized THP for anonymous memory
>>>>>>> (previously called "large anonymous folios"). The objective of this is to
>>>>>>
>>>>>> I'm still against small-sized THP. We've now got people asking whether
>>>>>> the THP counters should be updated when dealing with large folios that
>>>>>> are smaller than PMD sized. It's sowing confusion, and we should go
>>>>>> back to large anon folios as a name.
>>>>>
>>>>> I suspect I'm labouring the point here, but I'd like to drill into exactly what
>>>>> you are objecting to. Is it:
>>>>>
>>>>> A) Using the name "small-sized THP" (which is currently only used in the commit
>>>>> logs and a couple of times in the documentation).
>>>>
>>>> Yes, this is what I'm objecting to.
>>>
>>> I'll just repeat that "large anon folio" is misleading, because
>>> * we already have "large anon folios" in hugetlb
>> We do? Where?
>
> MAP_PRIVATE of hugetlb. hugepage_add_anon_rmap() instantiates them.
>
> Hugetlb is likely one of the oldest users of compound pages aka large folios.

I don't like "large anon folios" because it seems to confuse colleagues
when explaining that large anon folios are actually smaller than the
existing Hugetlb/THP size. I suspect this is because they already assume
large folios are used for THP. I guess this wouldn't be an issue if
everyone assumed THP was implemented with huge folios, but that doesn't
seem to be the case for me at least. Likely because the default THP size
is often 2MB, which is hardly huge.

>>
>>> * we already have PMD-sized "large anon folios" in THP
>> Right, those are already accounted as THP, and that's what users
>> expect.
>> If we're allocating 1024 x 64kB chunks of memory, the user won't be able
>> to distinguish that from 32 x 2MB chunks of memory, and yet the
>> performance profile for some applications will be very different.
>
> Very right, and because there will be a difference between 1024 x
> 64kB, 2048 x 32 kB and so forth, we need new memory stats either way.
>
> Ryan had some ideas on that, but currently, that's considered future
> work, just like it likely is for the pagecache as well, and needs much
> more thought.
>
> Initially, the admin will have to enable all that for anon either
> way. It all boils down to one memory statistic for anon memory
> (AnonHugePages) that's messed-up already.
>
>>
>>> But in the end, I don't care what we call this in a commit message.
>>>
>>> Just sticking to what we have right now makes most sense to me.
>>>
>>> I know, as the creator of the term "folio" you have to object :P Sorry ;)
>> I don't care if it's called something to do with folios or not. I
>
> Good!
>
>> am objecting to the use of the term "small THP" on the grounds of
>> confusion and linguistic nonsense.
>
> Maybe that's the reason why FreeBSD calls them "medium-sized
> superpages", because "Medium-sized" seems to be more appropriate to
> express something "in between".

Transparent Medium Pages?

> So far I thought the reason was because they focused on 64k only.
>
> Never trust a German guy on naming suggestions. John has so far been
> my naming expert, so I'm hoping he can help.

Likewise :-)

> "Sub-pmd-sized THP" is just a mouthful. But then again, this would
> just be a temporary name, and in the future THP will just naturally
> come in multiple sizes (and others here seem to agree on that).
>
>
> But just to repeat: I don't think there is a need to come up with new
> terminology and that there will be mass-confusion. So far I've not
> heard a compelling argument besides "one memory counter could confuse
> an admin that explicitly enables that new behavior.".
>
> Side note: I'm happy that we've reached a stage where we're
> nitpicking on names :)