Re: [EXTERNAL] Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

From: David Hildenbrand
Date: Fri Aug 25 2023 - 04:01:14 EST


On 24.08.23 17:39, Saurabh Singh Sengar wrote:
On Thu, Aug 24, 2023 at 7:05 AM David Hildenbrand <david@xxxxxxxxxx>
wrote:

On 24.08.23 15:59, Zach O'Keefe wrote:
On Thu, Aug 24, 2023 at 12:39 AM David Hildenbrand
<david@xxxxxxxxxx> wrote:

On 22.08.23 01:48, Zach O'Keefe wrote:
The 6.0 commits:

commit 9fec51689ff6 ("mm: thp: kill
transparent_hugepage_active()") commit 7da4e2cb8b1f ("mm: thp:
kill __transhuge_page_enabled()")

merged "can we have THPs in this VMA?" logic that was previously
done separately by fault-path, khugepaged, and smaps "THPeligible"
checks.

During the process, the semantics of the fault path check changed
in two
ways:

1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps
path).
2) We no longer checked if non-anonymous memory had a vm_ops-
huge_fault
handler that could satisfy the fault. Previously, this check had been
done in create_huge_pud() and create_huge_pmd() routines, but
after
the changes, we never reach those routines.

During the review of the above commits, it was determined that
in-tree users weren't affected by the change; most notably, since
the only relevant user (in terms of THP) of VM_MIXEDMAP or
->huge_fault is DAX, which is explicitly approved early in
approval logic. However, there is at least one occurrence where
an out-of-tree driver that used VM_HUGEPAGE|VM_MIXEDMAP with a
vm_ops->huge_fault handler, was broken.

... so all we did is break an arbitrary out-of-tree driver? Sorry
to say, but why should we care?

Is there any in-tree code affected and needs a "Fixes:" ?

The in-tree code was taken care of during the rework .. but I didn't
know about the possibility of a driver hooking in here.

And that's the problem of the driver, no? It's not the job of the
kernel developers to be aware of each out-of-tree driver to not
accidentally break something in there.


I don't know what the normal policy / stance here is, but I figured
the change was simple enough that it was worth helping out.

If you decide to be out-of-tree, then you have be prepared to only
support tested kernels and fix your driver when something changes
upstream -- like upstream developers would do for you when it would be
in-tree.

So why can't the out-of-tree driver be fixed, similarly to how we
would have fixed it if it would be in-tree?

I don't know much about driver development, but perhaps they are / need to
use a pristine upstream kernel, with their driver as a loadable kernel module.
Saurabh can comment on this, I don't know.


Hi!

You are correct Zach. The "out-of-tree" driver had been seamlessly operational
on version 5.19, leveraging the kernel's capability to handle huge faults for
non-anonymous vma. However, the transition to kernel version 6.1 inadvertently
led to the removal of this feature. It's important to note that this removal wasn't
intentional, and I am optimistic about the potential restoration of this feature.

We are currently creating 6.5, and are being told that a patch in 6.0 (released almost one year ago!) broke an out-of-tree driver.

Being that back-level, you cannot possibly expect that the upstream community can seriously care about not breaking your OOT driver in each release.

Especially, we do have bigger ->huge_fault changes coming up:

https://lkml.kernel.org/r/20230818202335.2739663-1-willy@xxxxxxxxxxxxx

If the driver is not in the tree, people don't care.

You really should try upstreaming that driver.


So this patch here adds complexity (which I don't like) in order to keep an OOT driver working -- possibly for a short time. I'm tempted to say "please fix your driver to not use huge faults in that scenario, it is no longer supported".

But I'm just about to vanish for 1.5 week into vacation :)

@Willy, what are your thoughts?

In any case, I think we should drop the "Fixes" tag. This does not fix any kernel BUG -- it restores compatibility with an OOT driver -- and already confused people building distributions and asking about this fix ;)



Hello David,

Given the context, I am currently exploring potential ways to address the issue
with the "out-of-tree" driver. I have recognized a challenge posed by the kernel's
memory management framework, which now imposes restrictions on huge faults
for non-anonymous vma.

You should try upstreaming your driver possibly without huge fault support, and then separately try re-adding huge fault support and see if kernel people want to support that or not.

And if your driver *really* requires huge faults, then supporting that would be part of your series when upstreaming that driver.

--
Cheers,

David / dhildenb