Re: [PATCH v3] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"

From: David Hildenbrand
Date: Thu Aug 24 2023 - 10:07:10 EST


On 24.08.23 15:59, Zach O'Keefe wrote:
On Thu, Aug 24, 2023 at 12:39 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 22.08.23 01:48, Zach O'Keefe wrote:
The 6.0 commits:

commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
commit 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")

merged "can we have THPs in this VMA?" logic that was previously done
separately by fault-path, khugepaged, and smaps "THPeligible" checks.

During the process, the semantics of the fault path check changed in two
ways:

1) A VM_NO_KHUGEPAGED check was introduced (also added to smaps path).
2) We no longer checked if non-anonymous memory had a vm_ops->huge_fault
handler that could satisfy the fault. Previously, this check had been
done in create_huge_pud() and create_huge_pmd() routines, but after
the changes, we never reach those routines.

During the review of the above commits, it was determined that in-tree
users weren't affected by the change; most notably, since the only relevant
user (in terms of THP) of VM_MIXEDMAP or ->huge_fault is DAX, which is
explicitly approved early in approval logic. However, there is at least
one occurrence where an out-of-tree driver that used
VM_HUGEPAGE|VM_MIXEDMAP with a vm_ops->huge_fault handler, was broken.

... so all we did is break an arbitrary out-of-tree driver? Sorry to
say, but why should we care?

Is there any in-tree code affected and needs a "Fixes:" ?

The in-tree code was taken care of during the rework .. but I didn't
know about the possibility of a driver hooking in here.

And that's the problem of the driver, no? It's not the job of the kernel developers to be aware of each out-of-tree driver to not accidentally break something in there.


I don't know what the normal policy / stance here is, but I figured
the change was simple enough that it was worth helping out.

If you decide to be out-of-tree, then you have be prepared to only support tested kernels and fix your driver when something changes upstream -- like upstream developers would do for you when it would be in-tree.

So why can't the out-of-tree driver be fixed, similarly to how we would have fixed it if it would be in-tree?

--
Cheers,

David / dhildenb