Re: [PATCH 1/1] mm: lock VMAs skipped by a failed queue_pages_range()

From: Michal Hocko
Date: Tue Sep 19 2023 - 04:54:20 EST


On Mon 18-09-23 14:16:08, Suren Baghdasaryan wrote:
> When queue_pages_range() encounters an unmovable page, it terminates
> its page walk. This walk, among other things, locks the VMAs in the range.
> This termination might result in some VMAs being left unlocked after
> queue_pages_range() completes. Since do_mbind() continues to operate on
> these VMAs despite the failure from queue_pages_range(), it will encounter
> an unlocked VMA.
> This mbind() behavior has been modified several times before and might
> need some changes to either finish the page walk even in the presence
> of unmovable pages or to error out immediately after the failure to
> queue_pages_range(). However that requires more discussions, so to
> fix the immediate issue, explicitly lock the VMAs in the range if
> queue_pages_range() failed. The added condition does not save much
> but is added for documentation purposes to understand when this extra
> locking is needed.

The semantics of the walk in this case are really clear as mud. I was
trying to reconstruct the whole picture and it really hurts... Then I
found http://lkml.kernel.org/r/CAHbLzkrmTaqBRmHVdE2kyW57Uoghqd_E+jAXC9cB5ofkhL-uvw@xxxxxxxxxxxxxx
and that helped a lot. Let's keep it as a reference, at least in this
email thread, for the future.

> Fixes: 49b0638502da ("mm: enable page walking API to lock vmas during the walk")
> Reported-by: syzbot+b591856e0f0139f83023@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://lore.kernel.org/all/000000000000f392a60604a65085@xxxxxxxxxx/
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>

I cannot say I like the patch (it looks like a potential double locking
unless you realize this lock is special) but considering this might be just
temporary I do not mind.
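
For anyone reading along later, here is a toy userspace model of why the
extra vma_start_write() is not a real double lock. It is only a sketch:
every toy_* name is made up for illustration, the structures carry just the
fields needed here, and the walk's return value is simplified to "nonzero
on failure". The one kernel behavior it assumes is that vma_start_write()
returns early when the VMA's vm_lock_seq already matches the mm's
mm_lock_seq, which is how I read the current implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for mm_struct and vm_area_struct (hypothetical, minimal). */
struct toy_mm {
	int mm_lock_seq;         /* sequence of the current mmap write-lock cycle */
	bool mmap_write_locked;  /* models mmap_lock being held for write */
};

struct toy_vma {
	struct toy_mm *mm;
	int vm_lock_seq;         /* equals mm->mm_lock_seq while write-locked */
	bool has_unmovable_page; /* makes the toy walk fail at this VMA */
	int lock_acquisitions;   /* counts lock calls that were not no-ops */
};

bool toy_vma_is_write_locked(const struct toy_vma *vma)
{
	return vma->vm_lock_seq == vma->mm->mm_lock_seq;
}

/* Modelled on vma_start_write(): a no-op if already locked in this cycle. */
void toy_vma_start_write(struct toy_vma *vma)
{
	assert(vma->mm->mmap_write_locked);
	if (toy_vma_is_write_locked(vma))
		return;
	vma->vm_lock_seq = vma->mm->mm_lock_seq;
	vma->lock_acquisitions++;
}

/* Toy walk: lock each VMA, stop at the first one with an unmovable page. */
int toy_queue_pages_range(struct toy_vma *vmas, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		toy_vma_start_write(&vmas[i]);
		if (vmas[i].has_unmovable_page)
			return -1; /* simplified: any nonzero value means failure */
	}
	return 0;
}

/*
 * Toy version of the loop touched by the patch: relock only if the walk
 * failed. Returns how many VMAs were still unlocked when operated on.
 */
size_t toy_apply_policy(struct toy_vma *vmas, size_t n, int ret)
{
	size_t unlocked_seen = 0;

	for (size_t i = 0; i < n; i++) {
		if (ret)
			toy_vma_start_write(&vmas[i]);
		if (!toy_vma_is_write_locked(&vmas[i]))
			unlocked_seen++;
	}
	return unlocked_seen;
}
```

With three VMAs where the second one fails the walk, the third is left
unlocked by toy_queue_pages_range(), the loop in toy_apply_policy() locks
it, and the first two see the repeated call as a no-op, so each VMA ends up
locked exactly once.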

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!

> ---
> mm/mempolicy.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 42b5567e3773..cbc584e9b6ca 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1342,6 +1342,9 @@ static long do_mbind(unsigned long start, unsigned long len,
>  	vma_iter_init(&vmi, mm, start);
>  	prev = vma_prev(&vmi);
>  	for_each_vma_range(vmi, vma, end) {
> +		/* If queue_pages_range failed then not all VMAs might be locked */
> +		if (ret)
> +			vma_start_write(vma);
>  		err = mbind_range(&vmi, vma, &prev, start, end, new);
>  		if (err)
>  			break;
> --
> 2.42.0.459.ge4e396fd5e-goog

--
Michal Hocko
SUSE Labs