Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory

From: Ryan Roberts
Date: Mon Jul 10 2023 - 06:10:54 EST


On 07/07/2023 14:24, David Hildenbrand wrote:
> On 07.07.23 15:12, Matthew Wilcox wrote:
>> On Fri, Jul 07, 2023 at 01:40:53PM +0200, David Hildenbrand wrote:
>>> On 06.07.23 10:02, Ryan Roberts wrote:
>>> But can you comment on the page migration part (IOW did you try it already)?
>>>
>>> For example, memory hotunplug, CMA, MCE handling, compaction all rely on
>>> page migration of something that was allocated using GFP_MOVABLE to actually
>>> work.
>>>
>>> Compaction seems to skip any higher-order folios, but the question is if the
>>> underlying migration itself works.
>>>
>>> If it already works: great! If not, this really has to be tackled early,
>>> because otherwise we'll be breaking the GFP_MOVABLE semantics.
>>
>> I have looked at this a bit.  _Migration_ should be fine.  _Compaction_
>> is not.
>
> Thanks! Very nice if at least ordinary migration works.

That's good to hear - I hadn't personally investigated.

>
>>
>> If you look at a function like folio_migrate_mapping(), it all seems
>> appropriately folio-ised.  There might be something in there that is
>> slightly wrong, but that would just be a bug to fix, not a huge
>> architectural problem.
>>
>> The problem comes in the callers of migrate_pages().  They pass a
>> new_folio_t callback.  alloc_migration_target() is the usual one passed
>> and as far as I can tell is fine.  I've seen no problems reported with it.
>>
>> compaction_alloc() is a disaster, and I don't know how to fix it.
>> The compaction code has its own allocator which is populated with order-0
>> folios.  How it populates that freelist is awful ... see split_map_pages()

I think this compaction issue also affects large folios in the page cache? So
really it is a pre-existing bug in the code base that needs to be fixed
independently of large anon folios? Should I assume you are tackling this, Matthew?

>
> Yeah, all that code was written under the assumption that we're moving order-0
> pages (which is what anon+pagecache pages used to be).
>
> From what I recall, we're allocating order-0 pages from the high memory
> addresses, so we can migrate from low memory addresses, effectively freeing up
> low memory addresses and filling high memory addresses.
>
> Adjusting that will be ... interesting. Instead of allocating order-0 pages from
> high addresses, we might want to allocate "as large as possible" ("grab what we
> can") from high addresses and then have our own kind of buddy for allocating
> from that pool a compaction destination page, depending on our source page. Nasty.
>
> What should always work is the split->migrate. But that's definitely not what we
> want in many cases.
>
>>
>>> Is swapping working as expected? zswap?
>>
>> Suboptimally.  Swap will split folios in order to swap them.  Somebody
>> needs to fix that, but it should work.
>
> Good!
>
> It would be great to have some kind of a feature matrix that tells us what works
> perfectly, sub-optimally, barely, not at all (and what has not been tested).
> Maybe (likely!) we'll also find things that are sub-optimal for ordinary THP
> (like swapping; I'm not even sure about that one).

I'm building a list of known issues, but so far it has been based on code I've
found during review and things raised by people in these threads. Are there test
suites that explicitly test these features? If so I'll happily run them against
large anon folios, but at the moment I'm ignorant, I'm afraid. I have been trying
to get mm selftests up and running, but I currently have a bunch of failures on
arm64, even without any of my patches - something I'm working through.

>
> I suspect that KSM should work mostly fine with flexible-thp. When
> deduplicating, we'll simply split the compound page and proceed as expected. But
> might be worth testing as well.
>