Re: [PATCH v3 2/2] mm: disable CONFIG_PER_VMA_LOCK until its fixed

From: Suren Baghdasaryan
Date: Wed Jul 05 2023 - 17:09:49 EST


On Wed, Jul 5, 2023 at 1:37 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 05.07.23 22:25, Peter Xu wrote:
> > On Wed, Jul 05, 2023 at 10:22:27AM -0700, Suren Baghdasaryan wrote:
> >> On Wed, Jul 5, 2023 at 10:16 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>>
> >>> On 05.07.23 19:12, Suren Baghdasaryan wrote:
> >>>> A memory corruption was reported in [1] with bisection pointing to the
> >>>> patch [2] enabling per-VMA locks for x86.
> >>>> Disable per-VMA locks config to prevent this issue while the problem is
> >>>> being investigated. This is expected to be a temporary measure.
> >>>>
> >>>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
> >>>> [2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@xxxxxxxxxx
> >>>>
> >>>> Reported-by: Jiri Slaby <jirislaby@xxxxxxxxxx>
> >>>> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@xxxxxxxxxx/
> >>>> Reported-by: Jacob Young <jacobly.alt@xxxxxxxxx>
> >>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
> >>>> Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
> >>>> Cc: stable@xxxxxxxxxxxxxxx
> >>>> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> >>>> ---
> >>>> mm/Kconfig | 3 ++-
> >>>> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/Kconfig b/mm/Kconfig
> >>>> index 09130434e30d..0abc6c71dd89 100644
> >>>> --- a/mm/Kconfig
> >>>> +++ b/mm/Kconfig
> >>>> @@ -1224,8 +1224,9 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
> >>>>  	def_bool n
> >>>>
> >>>>  config PER_VMA_LOCK
> >>>> -	def_bool y
> >>>> +	bool "Enable per-vma locking during page fault handling."
> >>>>  	depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
> >>>> +	depends on BROKEN
> >>>>  	help
> >>>>  	  Allow per-vma locking during page fault handling.
> >>>>
> >>> Do we have any testing results (that don't reveal other issues :) ) for
> >>> patch #1? Not sure if we really want to mark it broken if patch #1 fixes
> >>> the issue.
> >>
> >> I tested the fix using the only reproducer provided in the reports
> >> plus kernel compilation and my fork stress test. All looked good and
> >> stable but I don't know if other reports had the same issue or
> >> something different.
> >
> > > The commit log seems slightly confusing. It mostly says the bug is still
> > > not solved, but I assume patch 1 is the current "fix"; it's just not clear
> > > whether there are any other potential issues?
> >
> > According to the stable tree rules:
> >
> > - It must fix a problem that causes a build error (but not for things
> > marked CONFIG_BROKEN), an oops, a hang, data corruption, a real
> > security issue, or some "oh, that's not good" issue. In short, something
> > critical.
> >
> > > I think it means vma lock will never be fixed in 6.4, and it can't be (because
> > > after this patch it'll be BROKEN, and this patch Cc's stable, and we
> > > can't fix BROKEN things in stable).
> >
> > > I see no problem with that; I just want to make sure this is what you wanted.
> >
> > > There will still be an attempt at a final fix, am I right? IIRC, allowing page
> > > faults during fork() is one of the major goals of vma lock.
>
> At least not that I am aware of (and people who care about that should
> really work on scalable fork() alternatives, like that io_uring fork()
> thingy).
>
> My understanding is that CONFIG_PER_VMA_LOCK wants to speed up
> concurrent page faults *after* fork() [or rather, after new process
> creation], IOW, when we have a lot of mmap() activity going on while
> some threads of the new process are already active and don't actually
> touch what's getting newly mmaped.

Getting as much concurrency as we can is the goal. If we can allow
some page faults during fork, I would take that too. But for now let's
deploy the simplest and safest approach.
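
To make the trade-off concrete, here is a rough userspace toy model (not
kernel code) of the "try the per-VMA lock first, fall back to mmap_lock"
idea from the commit named in the Fixes: tag. Every toy_* name below is
hypothetical and made up for illustration; the real kernel structures and
locking primitives are different and considerably more involved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy reader/writer lock: -1 = write-locked, 0 = free, N > 0 = N readers. */
struct toy_rwlock {
	atomic_int state;
};

static bool toy_read_trylock(struct toy_rwlock *l)
{
	int s = atomic_load(&l->state);

	/* Retry while no writer holds the lock; a failed CAS reloads s. */
	while (s >= 0) {
		if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
			return true;
	}
	return false;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	int prev = atomic_fetch_sub(&l->state, 1);

	assert(prev > 0);	/* must have been read-locked */
	(void)prev;
}

static bool toy_write_trylock(struct toy_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	atomic_store(&l->state, 0);
}

/* Hypothetical stand-ins for an address space and one mapping in it. */
struct toy_mm {
	struct toy_rwlock mmap_lock;	/* process-wide lock */
};

struct toy_vma {
	struct toy_rwlock lock;		/* per-mapping lock */
	struct toy_mm *mm;
	atomic_int faults_handled;
};

enum toy_fault_path { TOY_FAULT_VMA_LOCK, TOY_FAULT_MMAP_LOCK };

/* Handle one simulated fault and report which lock protected it. */
static enum toy_fault_path toy_handle_fault(struct toy_vma *vma)
{
	/* Fast path: never wait on the per-VMA lock, only try it. */
	if (toy_read_trylock(&vma->lock)) {
		atomic_fetch_add(&vma->faults_handled, 1);
		toy_read_unlock(&vma->lock);
		return TOY_FAULT_VMA_LOCK;
	}

	/* Slow path: a writer owns this VMA, so wait for the mmap_lock. */
	while (!toy_read_trylock(&vma->mm->mmap_lock))
		;	/* busy-wait; acceptable only in a toy model */
	atomic_fetch_add(&vma->faults_handled, 1);
	toy_read_unlock(&vma->mm->mmap_lock);
	return TOY_FAULT_MMAP_LOCK;
}
```

In this model, any writer that holds a VMA's lock (think of an mmap()
modifying it, or a fork() that write-locks every VMA as the simplest and
safest approach) pushes faults on that VMA onto the slow path, while faults
on untouched VMAs keep using the fast path.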

>
> --
> Cheers,
>
> David / dhildenb
>