Re: [PATCH v4 4/7] mm: replace vma->vm_flags direct modifications with modifier calls

From: Hyeonggon Yoo
Date: Wed Feb 01 2023 - 07:23:52 EST


On Tue, Jan 31, 2023 at 10:54:22AM -0800, Suren Baghdasaryan wrote:
> On Tue, Jan 31, 2023 at 12:32 AM Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> wrote:
> >
> > On Thu, Jan 26, 2023 at 11:37:49AM -0800, Suren Baghdasaryan wrote:
> > > Replace direct modifications to vma->vm_flags with calls to modifier
> > > functions to be able to track flag changes and to keep vma locking
> > > correctness.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> > > Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> > > Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> > > Acked-by: Mike Rapoport (IBM) <rppt@xxxxxxxxxx>
> > > Acked-by: Sebastian Reichel <sebastian.reichel@xxxxxxxxxxxxx>
> > > ---
> > > arch/arm/kernel/process.c | 2 +-
> > > 120 files changed, 188 insertions(+), 199 deletions(-)
> > >
> >
> > Hello Suren,
>
> Hi Hyeonggon,
>
> >
> > [...]
> >
> > Whoa, it's so long.
> > Mostly looks fine but two things I'm not sure about:
> >
> > > diff --git a/drivers/misc/open-dice.c b/drivers/misc/open-dice.c
> > > index 9dda47b3fd70..7be4e6c9f120 100644
> > > --- a/drivers/misc/open-dice.c
> > > +++ b/drivers/misc/open-dice.c
> > > @@ -95,12 +95,12 @@ static int open_dice_mmap(struct file *filp, struct vm_area_struct *vma)
> > > if (vma->vm_flags & VM_WRITE)
> > > return -EPERM;
> > > /* Ensure userspace cannot acquire VM_WRITE later. */
> > > - vma->vm_flags &= ~VM_MAYWRITE;
> > > + vm_flags_clear(vma, VM_MAYSHARE);
> > > }
> >
> > I think it should be:
> > s/VM_MAYSHARE/VM_MAYWRITE/
>
> Good eye! Yes, this is definitely a bug. Will post a next version with this fix.
>
> >
> > > diff --git a/mm/mlock.c b/mm/mlock.c
> > > index 5c4fff93cd6b..ed49459e343e 100644
> > > --- a/mm/mlock.c
> > > +++ b/mm/mlock.c
> > > @@ -380,7 +380,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
> > > */
> > > if (newflags & VM_LOCKED)
> > > newflags |= VM_IO;
> > > - WRITE_ONCE(vma->vm_flags, newflags);
> > > + vm_flags_reset(vma, newflags);
> > >
> > > lru_add_drain();
> > > walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL);
> > > @@ -388,7 +388,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
> > >
> > > if (newflags & VM_IO) {
> > > newflags &= ~VM_IO;
> > > - WRITE_ONCE(vma->vm_flags, newflags);
> > > + vm_flags_reset(vma, newflags);
> > > }
> > > }
> >
> > wondering the if the comment above is still true?
> >
> > /*
> > * There is a slight chance that concurrent page migration,
> > * or page reclaim finding a page of this now-VM_LOCKED vma,
> > * will call mlock_vma_folio() and raise page's mlock_count:
> > * double counting, leaving the page unevictable indefinitely.
> > * Communicate this danger to mlock_vma_folio() with VM_IO,
> > * which is a VM_SPECIAL flag not allowed on VM_LOCKED vmas.
> > * mmap_lock is held in write mode here, so this weird
> > * combination should not be visible to other mmap_lock users;
> > * but WRITE_ONCE so rmap walkers must see VM_IO if VM_LOCKED.
> > */
> >
> > does ACCESS_PRIVATE() still guarentee that compiler cannot mysteriously
> > optimize writes like WRITE_ONCE()?
>
> I don't see ACCESS_PRIVATE() providing the same guarantees as
> WRITE_ONCE(), therefore I think this also needs to be changed. I'll
> need to introduce something like vm_flags_reset_once() and use it
> here. vm_flags_reset_once() would do WRITE_ONCE() and otherwise would
> be identical to vm_flags_reset().
>
> I'll post a new version with the fixes later today.
>
> Thanks for the review!
> Suren.

Thanks for quick reply!

Andrew's fix and the new patch looks good to me.
with these two things addressed:

Reviewed-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>


Regards,
Hyeonggon