Re: [PATCH v3 2/2] mm: disable CONFIG_PER_VMA_LOCK until its fixed

From: David Hildenbrand
Date: Wed Jul 05 2023 - 16:38:23 EST


On 05.07.23 22:25, Peter Xu wrote:
On Wed, Jul 05, 2023 at 10:22:27AM -0700, Suren Baghdasaryan wrote:
On Wed, Jul 5, 2023 at 10:16 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 05.07.23 19:12, Suren Baghdasaryan wrote:
A memory corruption was reported in [1] with bisection pointing to the
patch [2] enabling per-VMA locks for x86.
Disable per-VMA locks config to prevent this issue while the problem is
being investigated. This is expected to be a temporary measure.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@xxxxxxxxxx

Reported-by: Jiri Slaby <jirislaby@xxxxxxxxxx>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@xxxxxxxxxx/
Reported-by: Jacob Young <jacobly.alt@xxxxxxxxx>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
---
mm/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 09130434e30d..0abc6c71dd89 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
 	def_bool n

 config PER_VMA_LOCK
-	def_bool y
+	bool "Enable per-vma locking during page fault handling."
 	depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+	depends on BROKEN
 	help
 	  Allow per-vma locking during page fault handling.

Do we have any testing results (that don't reveal other issues :) ) for
patch #1? Not sure if we really want to mark it broken if patch #1 fixes
the issue.

I tested the fix using the only reproducer provided in the reports
plus kernel compilation and my fork stress test. All looked good and
stable but I don't know if other reports had the same issue or
something different.

The commit log seems slightly confusing. It mostly says the bug is still
not solved, but I assume patch 1 is the current "fix"; it's just not clear
whether there are any other potential issues?

According to the stable tree rules:

- It must fix a problem that causes a build error (but not for things
marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
security issue, or some "oh, that's not good" issue. In short, something
critical.

I think it means vma lock will never be fixed in 6.4, and it can't be (because
after this patch it'll be BROKEN, this patch is Cc'ed to stable, and we
can't fix BROKEN things in stable trees).

I see no problem with that at all; I just want to make sure this is what you wanted.

There will still be an attempt at a final fix, am I right? IIRC, allowing page
faults during fork() is one of the major goals of vma lock.

At least not that I am aware of (and people who care about that should really work on scalable fork() alternatives, like that io_uring fork() thingy).

My understanding is that CONFIG_PER_VMA_LOCK wants to speed up concurrent page faults *after* fork() [or rather, after new process creation], IOW, when we have a lot of mmap() activity going on while some threads of the new process are already active and don't actually touch what's getting newly mmap()ed.
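
To make that access pattern concrete, here is a rough user-space sketch. It is not taken from the patch or the bug reports, and every name in it (touch_pages, churn_mappings, run) is hypothetical. One thread keeps creating and dropping anonymous mappings while the other threads only take first-touch page faults in mappings that already exist. It is written in Python for brevity, so the interpreter lock means the threads mostly interleave rather than run in parallel; treat it as an illustration of the workload shape, not a reproducer or a benchmark.

```python
import mmap
import threading

PAGE = mmap.PAGESIZE


def touch_pages(region):
    """Write one byte per page of an existing mapping.

    The first write to each page is what triggers a page fault.
    Returns the number of pages touched.
    """
    touched = 0
    for off in range(0, len(region), PAGE):
        region[off] = 1
        touched += 1
    return touched


def churn_mappings(count, pages):
    """Create and drop anonymous mappings (the "mmap() activity").

    These mappings are never touched by the faulting threads.
    Returns the number of mappings created.
    """
    created = 0
    for _ in range(count):
        m = mmap.mmap(-1, pages * PAGE)  # -1: anonymous memory, no file
        m[0] = 1
        m.close()
        created += 1
    return created


def run(n_faulters=2, region_pages=64, map_count=50):
    """Run the faulting threads and the mapping thread side by side."""
    # Mappings that exist before the threads start; only these get faulted in.
    regions = [mmap.mmap(-1, region_pages * PAGE) for _ in range(n_faulters)]
    results = {}

    def faulter(i):
        results[("touched", i)] = touch_pages(regions[i])

    def mapper():
        results["mapped"] = churn_mappings(map_count, 4)

    threads = [threading.Thread(target=faulter, args=(i,))
               for i in range(n_faulters)]
    threads.append(threading.Thread(target=mapper))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for r in regions:
        r.close()
    return results


if __name__ == "__main__":
    print(run())
```

In this pattern the faulting threads never touch the regions being mapped and unmapped, which is the case where per-VMA locking is meant to keep page faults from waiting on the mmap() activity.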

--
Cheers,

David / dhildenb