Re: [PATCH v3 15/15] mm/mmap: Change vma iteration order in do_vmi_align_munmap()

From: Jann Horn
Date: Mon Aug 14 2023 - 11:45:06 EST


@akpm

On Mon, Jul 24, 2023 at 8:31 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:
> Since prev will be set later in the function, it is better to reverse
> the splitting direction of the start VMA (modify the new_below argument
> to __split_vma).

It might be a good idea to reorder "mm: always lock new vma before
inserting into vma tree" before this patch.

If you apply this patch without "mm: always lock new vma before
inserting into vma tree", I think move_vma(), when called with a start
address in the middle of a VMA, will behave like this:

 - vma_start_write() [lock the VMA to be moved]
 - move_page_tables() [moves page table entries]
 - do_vmi_munmap()
   - do_vmi_align_munmap()
     - __split_vma()
       - creates a new VMA **covering the moved range** that is **not locked**
       - stores the new VMA in the VMA tree **without locking it** [1]
     - new VMA is locked and removed again [2]
[...]

So after the page tables in the region have already been moved, I
believe there will be a brief window (between [1] and [2]) where page
faults in the region can happen again, which could probably cause new
page tables and PTEs to be created in the region again in that window.
(This can't happen in Linus' current tree because the new VMA created
by __split_vma() only covers the range that is not being moved.)
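
To make the window concrete, here is a toy userspace model of the
reasoning above. It is not kernel code: struct toy_vma, toy_split(),
toy_fault_possible() and toy_window_open() are hypothetical names, the
addresses are made up, and "a fault can proceed iff an unlocked VMA in
the tree covers the address" is a simplifying assumption about the
per-VMA lock fault path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA: the range [start, end) plus two flags. */
struct toy_vma {
	unsigned long start, end;
	bool write_locked;	/* models vma_start_write() having been called */
	bool in_tree;		/* models the VMA being visible in the VMA tree */
};

/* Assumption: a fault proceeds only via an unlocked, in-tree VMA covering addr. */
static bool toy_fault_possible(const struct toy_vma *vmas, size_t n,
			       unsigned long addr)
{
	for (size_t i = 0; i < n; i++) {
		const struct toy_vma *v = &vmas[i];

		if (v->in_tree && !v->write_locked &&
		    addr >= v->start && addr < v->end)
			return true;
	}
	return false;
}

/*
 * Split vmas[0] at addr and store the new VMA in vmas[1].
 * new_covers_upper picks which half the *new* VMA gets;
 * lock_new models locking the new VMA before it is inserted.
 */
static void toy_split(struct toy_vma *vmas, unsigned long addr,
		      bool new_covers_upper, bool lock_new)
{
	struct toy_vma *old = &vmas[0], *fresh = &vmas[1];

	assert(addr > old->start && addr < old->end);
	*fresh = *old;
	if (new_covers_upper) {
		fresh->start = addr;
		old->end = addr;
	} else {
		fresh->end = addr;
		old->start = addr;
	}
	fresh->write_locked = lock_new;
	fresh->in_tree = true;
}

/*
 * State right after step [1]: the original VMA is already write-locked,
 * and the range being moved is the upper part [split, end).
 * Returns true if a fault in the moved range could proceed at that point.
 */
static bool toy_window_open(bool new_covers_moved, bool lock_new)
{
	struct toy_vma vmas[2] = { { 0x1000, 0x5000, true, true } };
	const unsigned long split = 0x3000;

	toy_split(vmas, split, new_covers_moved, lock_new);
	return toy_fault_possible(vmas, 2, split);
}
```

In this model the window only opens for toy_window_open(true, false),
i.e. when the new VMA covers the moved range and is inserted unlocked.
Either locking the new VMA first or having it cover only the range that
is not being moved closes the window.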

Though I guess that's not going to lead to anything bad, since
do_vmi_munmap() cleans up PTEs and page tables in the region anyway?
So maybe it's not that important.