Re: [RFC PATCH 00/26] mm: reliable huge page allocator

From: Matthew Wilcox
Date: Fri Apr 21 2023 - 13:15:12 EST


On Fri, Apr 21, 2023 at 05:11:56PM +0100, Mel Gorman wrote:
> It was considered once upon a time and comes up every so often as variants
> of a "sticky" pageblock bit that prevents mixing. The risk was
> ending up in a context where memory within a suitable pageblock cannot
> be freed and all of the available MOVABLE pageblocks have at least one
> pinned page that cannot migrate from the allocating context. It can also
> potentially hit a case where the majority of memory is UNMOVABLE pageblocks,
> each of which has a single pagetable page that cannot be freed without an
> OOM kill. Variants of issues like this would manifest as an OOM kill with
> plenty of memory free bug or excessive CPU usage on reclaim or compaction.
>
> It doesn't kill the idea of the series at all but it puts a lot of emphasis
> on splitting the series into low-risk and high-risk parts. Maybe to the
> extent where the absolute protection against mixing can be broken in OOM
> situations, or via the kernel command line or a sysctl.

Has a variant been previously considered where MOVABLE allocations are
allowed to come from UNMOVABLE blocks? After all, MOVABLE allocations
are generally, well, movable. So an UNMOVABLE allocation could try to
migrate pages from a MIXED pageblock in order to turn the MIXED pageblock
back into an UNMOVABLE pageblock.
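
To make the idea concrete, here is a toy model of it, not kernel code. The
Pageblock class, its slot encoding and evacuate_movable() are all
hypothetical names invented for this illustration; real migration goes
through the page migration and compaction paths and can fail for reasons
this sketch ignores (pinned pages, for one).

```python
from dataclasses import dataclass

# Hypothetical toy model: a pageblock is a list of page slots, each either
# None (free), "U" (unmovable allocation) or "M" (movable allocation).
UNMOVABLE, MOVABLE = "U", "M"


@dataclass
class Pageblock:
    slots: list

    def kind(self):
        """Classify the block by what is currently allocated in it."""
        used = {s for s in self.slots if s is not None}
        if used == {UNMOVABLE, MOVABLE}:
            return "MIXED"
        if used == {MOVABLE}:
            return "MOVABLE"
        if used == {UNMOVABLE}:
            return "UNMOVABLE"
        return "FREE"


def evacuate_movable(mixed, target):
    """Migrate movable pages out of `mixed` into free slots of `target`.

    Returns True if `mixed` holds no movable pages afterwards, i.e. it
    could be treated as a plain UNMOVABLE block again.
    """
    for i, slot in enumerate(mixed.slots):
        if slot != MOVABLE:
            continue
        try:
            j = target.slots.index(None)
        except ValueError:
            break  # target is full; remaining movable pages stay put
        target.slots[j] = MOVABLE
        mixed.slots[i] = None
    return MOVABLE not in mixed.slots


mixed = Pageblock(["U", "M", None, "M"])
target = Pageblock(["M", None, None, None])
print(mixed.kind())                      # MIXED
print(evacuate_movable(mixed, target))   # True
print(mixed.kind())                      # UNMOVABLE
```

The point of the sketch is only that the MIXED state is reversible as
long as the movable pages really can be migrated somewhere else.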

This might work better in practice because GFP_NOFS allocations tend
to also be MOVABLE, so allowing them to take up some of the UNMOVABLE
space temporarily feels like a get-out-of-OOM card.

(I've resisted talking about plans to make page table pages movable
because I don't think that's your point; that's just an example of a
currently-unmovable allocation, right?)

I mention this in part because on my laptop, ZONE_DMA is almost unused:

Node 0, zone    DMA     0     0     0     0     0     0     0     0     1     2     2
Node 0, zone  DMA32  1685  1345  1152   554   424   212   104    40     2     0     0
Node 0, zone Normal  6959  3530  1893  1862   629   483   107    10     0     0     0

That's 2 order-10 (=8MB), 2 order-9 (=4MB) and 1 order-8 (=1MB) for a
total of 13MB of memory. That's insignificant to a 16GB laptop, but on
smaller machines, it might be worth allowing MOVABLE allocations to come
from ZONE_DMA on the grounds that they can be easily freed if anybody
ever allocated from ZONE_DMA.
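
For anyone who wants to check the arithmetic on their own machine, here is
a minimal sketch that totals the free memory in one buddyinfo line. It
assumes 4KB base pages (my assumption; adjust PAGE_SIZE_KB otherwise), and
the helper names parse_buddyinfo_line() and free_kb() are made up for this
example.

```python
PAGE_SIZE_KB = 4  # assumption: 4KB base pages, as on x86-64


def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (zone name, per-order counts)."""
    fields = line.split()
    # "Node 0, zone DMA 0 0 ..." -> fields[3] is the zone name and
    # everything after it is the free block count for order 0, 1, 2, ...
    zone = fields[3]
    counts = [int(x) for x in fields[4:]]
    return zone, counts


def free_kb(counts):
    """Total free memory in KB: each order-N block is 2**N base pages."""
    return sum(n * (PAGE_SIZE_KB << order) for order, n in enumerate(counts))


zone, counts = parse_buddyinfo_line("Node 0, zone DMA 0 0 0 0 0 0 0 0 1 2 2")
print(zone, free_kb(counts) // 1024, "MB")  # DMA 13 MB
```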