Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory

From: David Hildenbrand
Date: Fri Jul 07 2023 - 09:25:28 EST

In reply to: Matthew Wilcox: "Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory"

On 07.07.23 15:12, Matthew Wilcox wrote:

On Fri, Jul 07, 2023 at 01:40:53PM +0200, David Hildenbrand wrote:

On 06.07.23 10:02, Ryan Roberts wrote:
But can you comment on the page migration part (IOW did you try it already)?

For example, memory hotunplug, CMA, MCE handling, compaction all rely on
page migration of something that was allocated using GFP_MOVABLE to actually
work.

Compaction seems to skip any higher-order folios, but the question is if the
underlying migration itself works.

If it already works: great! If not, this really has to be tackled early,
because otherwise we'll be breaking the GFP_MOVABLE semantics.

I have looked at this a bit. _Migration_ should be fine. _Compaction_
is not.

Thanks! Very nice if at least ordinary migration works.

If you look at a function like folio_migrate_mapping(), it all seems
appropriately folio-ised. There might be something in there that is
slightly wrong, but that would just be a bug to fix, not a huge
architectural problem.

The problem comes in the callers of migrate_pages(). They pass a
new_folio_t callback. alloc_migration_target() is the usual one passed
and as far as I can tell is fine. I've seen no problems reported with it.

compaction_alloc() is a disaster, and I don't know how to fix it.
The compaction code has its own allocator which is populated with order-0
folios. How it populates that freelist is awful ... see split_map_pages()

Yeah, all that code was written under the assumption that we're moving order-0 pages (which is what the anon+pagecache pages were).

From what I recall, we're allocating order-0 pages from the high memory addresses, so we can migrate from low memory addresses, effectively freeing up low memory addresses and filling high memory addresses.
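
To make the two-scanner idea concrete, here is a minimal userspace sketch. It is a toy model, not the kernel's compaction code: each array slot stands for one order-0 page frame, and the name toy_compact() is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model only: used[i] says whether page frame i holds a movable page.
 * A "migrate" scanner walks up from the lowest address looking for used
 * frames; a "free" scanner walks down from the highest address looking for
 * free frames. Each used low frame is moved into a free high frame until
 * the scanners meet. Returns the number of pages moved.
 */
size_t toy_compact(bool *used, size_t npages)
{
    size_t lo = 0;       /* migrate scanner position */
    size_t hi = npages;  /* free scanner position (one past the candidate) */
    size_t moved = 0;

    assert(used != NULL || npages == 0);

    for (;;) {
        /* Advance the migrate scanner to the next in-use frame. */
        while (lo < hi && !used[lo])
            lo++;
        /* Pull the free scanner down to the next free frame. */
        while (hi > lo && used[hi - 1])
            hi--;
        /* Scanners met: nothing left to move. */
        if (lo >= hi)
            break;

        /* "Migrate" the page from the low frame to the high frame. */
        used[hi - 1] = true;
        used[lo] = false;
        moved++;
        lo++;
        hi--;
    }
    return moved;
}
```

Calling toy_compact() on a fragmented array leaves all in-use frames packed at the top and one contiguous free range at the bottom, which is the effect described above. Every move here is one order-0 frame, which is exactly the assumption that large folios break.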

Adjusting that will be ... interesting. Instead of allocating order-0 pages from high addresses, we might want to allocate "as large as possible" ("grab what we can") from high addresses and then have our own kind of buddy for allocating from that pool a compaction destination page, depending on our source page. Nasty.
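
A rough userspace sketch of what such a private "buddy" pool could look like, purely to illustrate the idea. Everything here (struct toy_pool, toy_pool_add(), toy_pool_alloc(), the TOY_* limits) is hypothetical and not existing kernel code; it only splits larger chunks and never merges them back.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ORDER  4   /* hypothetical cap: orders 0..4 */
#define TOY_MAX_CHUNKS 64  /* hypothetical per-order capacity */

/* Per-order stacks of free chunks, identified by their first frame number. */
struct toy_pool {
    long pfn[TOY_MAX_ORDER + 1][TOY_MAX_CHUNKS];
    size_t count[TOY_MAX_ORDER + 1];
};

/* "Grab what we can": record a free, naturally aligned chunk in the pool. */
void toy_pool_add(struct toy_pool *pool, long pfn, unsigned int order)
{
    assert(order <= TOY_MAX_ORDER);
    assert(pool->count[order] < TOY_MAX_CHUNKS);
    assert((pfn & ((1L << order) - 1)) == 0);
    pool->pfn[order][pool->count[order]++] = pfn;
}

/*
 * Hand out a destination matching the order of the source folio.
 * If no chunk of that order is free, split the smallest larger chunk and
 * put the unused upper halves back. Returns -1 if nothing fits.
 */
long toy_pool_alloc(struct toy_pool *pool, unsigned int order)
{
    unsigned int cur;
    long pfn;

    if (order > TOY_MAX_ORDER)
        return -1;

    /* Find the smallest order >= the request that has a free chunk. */
    for (cur = order; cur <= TOY_MAX_ORDER; cur++)
        if (pool->count[cur] > 0)
            break;
    if (cur > TOY_MAX_ORDER)
        return -1;

    pfn = pool->pfn[cur][--pool->count[cur]];

    /* Split down to the requested order, keeping the lower half each time. */
    while (cur > order) {
        cur--;
        toy_pool_add(pool, pfn + (1L << cur), cur);
    }
    return pfn;
}
```

With a single order-4 chunk at frame 16 in the pool, a request for order 2 returns frame 16 and leaves an order-3 chunk (frame 24) and an order-2 chunk (frame 20) behind for later requests. A real version would also have to deal with merging, locking, and isolating the pages in the first place, none of which is modelled here.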

What should always work is the split->migrate. But that's definitely not what we want in many cases.
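
As a toy illustration of that fallback and its cost (again a userspace sketch with made-up names, not kernel code): try a destination of the same order first, and only if that fails split and move the pieces as order-0 pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator hook: true if a destination of 'order' is available. */
typedef bool (*toy_alloc_fn)(unsigned int order);

/* Toy allocator that can only ever hand out order-0 destinations. */
bool toy_alloc_order0_only(unsigned int order)
{
    return order == 0;
}

/* Toy allocator that can satisfy any order. */
bool toy_alloc_any(unsigned int order)
{
    (void)order;
    return true;
}

/*
 * Returns how many destination pieces the source ended up in:
 * 1 if it was moved as a whole, 1 << order if it had to be split first,
 * 0 if even the order-0 fallback failed.
 */
unsigned int toy_migrate(unsigned int order, toy_alloc_fn alloc)
{
    unsigned int i, pieces;

    assert(alloc != NULL);
    assert(order < 16);

    /* Preferred path: destination of the same order, folio stays large. */
    if (alloc(order))
        return 1;
    if (order == 0)
        return 0;

    /* Fallback: split, then migrate every order-0 piece on its own. */
    pieces = 1u << order;
    for (i = 0; i < pieces; i++)
        if (!alloc(0))
            return 0;
    return pieces;
}
```

With an order-0-only allocator, an order-4 source ends up as 16 separate small pages: the data still moves, but the large folio is gone.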

Is swapping working as expected? zswap?

Suboptimally. Swap will split folios in order to swap them. Somebody
needs to fix that, but it should work.

Good!

It would be great to have some kind of a feature matrix that tells us what works perfectly, sub-optimally, barely, not at all (and what has not been tested). Maybe (likely!) we'll also find things that are sub-optimal for ordinary THP (like swapping, though I'm not even sure about that).

I suspect that KSM should work mostly fine with flexible-thp. When deduplicating, we'll simply split the compound page and proceed as expected. But might be worth testing as well.

--
Cheers,

David / dhildenb
