Re: [PATCH v3] mm: add thp_utilization metrics to /proc/thp_utilization

From: Alex Zhu (Kernel)
Date: Wed Aug 10 2022 - 17:39:37 EST

> On Aug 10, 2022, at 10:54 AM, Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>
> On Wed, Aug 10, 2022 at 11:15 AM Alex Zhu (Kernel) <alexlzhu@xxxxxx> wrote:
>>
>>
>>> On Aug 10, 2022, at 10:07 AM, Yang Shi <shy828301@xxxxxxxxx> wrote:
>>>
>>> On Tue, Aug 9, 2022 at 4:36 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Aug 9, 2022 at 11:16 AM Alex Zhu (Kernel) <alexlzhu@xxxxxx> wrote:
>>>>>
>>>>>
>>>>>> OK, it is hard to tell what it looks like now. But the THPs on the
>>>>>> deferred split list may be on the "low utilization split" list too?
>>>>>> IIUC the major difference is to replace zero-filled subpage to special
>>>>>> zero page, so you implemented another THP split function to handle it?
>>>>>>
>>>>>> Anyway the code should answer the most questions.
>>>>>
>>>>> They can indeed end up on both lists. This did have to be handled when
>>>>> implementing the shrinker.
>>>>>
>>>>> We free the zero filled subpages, while modifying the existing split_huge_page()
>>>>> function. Will follow up that change in another patch.
>>>>
>>>> FYI. This series does it:
>>>>
>>>> https://lore.kernel.org/r/20210731063938.1391602-1-yuzhao@xxxxxxxxxx/
>>>>
>>>> And this one:
>>>>
>>>> https://lore.kernel.org/r/1635422215-99394-1-git-send-email-ningzhang@xxxxxxxxxxxxxxxxx/
>>>
>>> Thanks, Yu. I totally forgot about these series. It is time to refresh
>>> my memory.
>>
>> I looked through these patches yesterday. There are indeed parts that are very similar, but the approach
>> taken seems overly complicated compared to what I have written. What’s the status of work on this since last year?
>
> Overly complicated... which patches and how?
>
> At a minimum, you'd need 1 & 3 from the first series and this patch:
>
> https://lore.kernel.org/r/20220608141432.23258-1-linmiaohe@xxxxxxxxxx/

The previous patches implement freeing of THPs as part of memcg and reclaim. Zero tail pages are disposed of via
the lruvec as part of reclaim.

Our approach is a THP utilization worker thread that scans through physical memory and adds underutilized THPs to a shrinker, which calls split_huge_page(). We free zero tail pages within split_huge_page(). Reclaim triggers the shrinker.
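
To make the flow above concrete, here is a rough user-space model of the scan and shrink steps. It is only a sketch and not the patch code: struct thp_model, struct split_list, scan_thps(), shrink_thps() and the utilization threshold are all hypothetical names chosen for illustration, and the "split" is simulated by counting the zero subpages that would be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBPAGES_PER_THP 512	/* assumed: 2 MiB THP, 4 KiB base pages */

/* Hypothetical stand-in for a THP; the real code works on struct page. */
struct thp_model {
	size_t utilized;	/* subpages holding non-zero data */
	bool queued;		/* already on the split list? */
	bool split;		/* already split? */
	struct thp_model *next;
};

/* Hypothetical list of THPs waiting to be split by the shrinker. */
struct split_list {
	struct thp_model *head;
	size_t count;
};

/* Scan step: queue every THP whose utilization is below the threshold. */
static void scan_thps(struct thp_model *thps, size_t n, size_t threshold,
		      struct split_list *list)
{
	for (size_t i = 0; i < n; i++) {
		struct thp_model *t = &thps[i];

		if (t->split || t->queued || t->utilized >= threshold)
			continue;
		t->queued = true;
		t->next = list->head;
		list->head = t;
		list->count++;
	}
}

/*
 * Shrink step: "split" up to nr_to_scan queued THPs and return the number
 * of zero subpages that the split would free.
 */
static size_t shrink_thps(struct split_list *list, size_t nr_to_scan)
{
	size_t freed = 0;

	while (list->head && nr_to_scan > 0) {
		struct thp_model *t = list->head;

		nr_to_scan--;
		list->head = t->next;
		list->count--;
		t->next = NULL;
		t->queued = false;
		t->split = true;
		freed += SUBPAGES_PER_THP - t->utilized;
	}
	return freed;
}
```

In this model the scan is idempotent (a THP is queued at most once), and the zero subpages are accounted as freed at split time, which corresponds to freeing within split_huge_page() rather than deferring to the lruvec.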

There is some overlap between the implementations, in particular creating a linked list in the third tail page and the methods to check for zero pages.
(I believe the previous patches have a cleaner method for identifying zero pages.) However, having looked through the code, I do believe our approach is simpler.
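
For reference, the zero-page check being discussed amounts to something like the following user-space sketch. It is not taken from either series: subpage_is_zero() and thp_utilized_subpages() are hypothetical helpers, and the 4 KiB / 512-subpage sizes are assumptions. In the kernel the per-page check would more likely use memchr_inv() on a mapped page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SUBPAGE_SIZE		4096UL	/* assumed base page size */
#define SUBPAGES_PER_THP	512UL	/* assumed: 2 MiB THP / 4 KiB pages */

/* Hypothetical helper: true if every byte of one subpage is zero. */
static bool subpage_is_zero(const unsigned char *subpage)
{
	/* First byte is zero and each byte equals its successor => all zero. */
	return subpage[0] == 0 &&
	       memcmp(subpage, subpage + 1, SUBPAGE_SIZE - 1) == 0;
}

/* Hypothetical helper: count subpages that hold at least one non-zero byte. */
static size_t thp_utilized_subpages(const unsigned char *thp)
{
	size_t used = 0;

	for (size_t i = 0; i < SUBPAGES_PER_THP; i++)
		if (!subpage_is_zero(thp + i * SUBPAGE_SIZE))
			used++;
	return used;
}
```

A THP whose count is well below SUBPAGES_PER_THP is the kind of underutilized page that would be a candidate for splitting.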

We chose to free within split_huge_page(), but it’s worth discussing whether to free zero pages immediately or to add them to the lruvec to be freed later.

I believe the split_huge_page() changes could be valuable as a patch by themselves, though. Will send that out shortly.