Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

From: David Hildenbrand
Date: Fri Aug 25 2023 - 08:59:52 EST


On 25.08.23 14:49, Matthew Wilcox wrote:
> On Fri, Aug 25, 2023 at 09:59:23AM +0200, David Hildenbrand wrote:
>> Especially, we do have bigger ->huge_fault changes coming up:
>>
>> https://lkml.kernel.org/r/20230818202335.2739663-1-willy@xxxxxxxxxxxxx
>>
>> If the driver is not in the tree, people don't care.
>>
>> You really should try upstreaming that driver.
>>
>>
>> So this patch here adds complexity (which I don't like) in order to keep an
>> OOT driver working -- possibly for a short time. I'm tempted to say "please
>> fix your driver to not use huge faults in that scenario, it is no longer
>> supported".
>>
>> But I'm just about to vanish for 1.5 weeks into vacation :)
>>
>> @Willy, what are your thoughts?
>
> Fundamentally there was a bad assumption with the original patch --
> it assumed that the only reason to support ->huge_fault was for DAX,
> and that's not true. It's just that the only drivers in-tree which
> support ->huge_fault do so in order to support DAX.

Okay, so we are willing to continue supporting that, and it's nothing
we want to stop OOT drivers from doing.

Fine with me; we should probably reflect that in the patch description.

>
> Keeping a driver out of tree is always a risky and costly proposition.
> It will continue to be broken by core kernel changes, particularly
> if/when it does unusual things.
>

Yes.

> I think the complexity is entirely on us. I think there's a simpler way
> to handle the problem, but I'd start by turning all of this "admin and
> app get to control when THP are used" nonsense into no-ops.

Well, simpler, yes, but also more controversial :)

--
Cheers,

David / dhildenb