Re: inconsistence in mprotect_fixup mlock_fixup madvise_update_vma

From: Liam R. Howlett
Date: Tue Jun 13 2023 - 21:19:01 EST


* Jeff Xu <jeffxu@xxxxxxxxxxxx> [230613 17:29]:
> Hello Peter,
>
> Thanks for responding.
>
> On Tue, Jun 13, 2023 at 1:16 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > Hi, Jeff,
> >
> > On Tue, Jun 13, 2023 at 08:26:26AM -0700, Jeff Xu wrote:
> > > + more ppl to the list.
> > >
> > > On Mon, Jun 12, 2023 at 6:04 PM Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > There seems to be inconsistency in different VMA fixup
> > > > implementations, for example:
> > > > mlock_fixup will skip VMA that is hugettlb, etc, but those checks do
> > > > not exist in mprotect_fixup and madvise_update_vma. Wouldn't this be a
> > > > problem? the merge/split skipped by mlock_fixup, might get acted on in
> > > > the madvice/mprotect case.
> > > >
> > > > mlock_fixup currently check for
> > > > if (newflags == oldflags ||

If newflags == oldflags, we don't need to do anything here; the VMA is
already in the desired mlock state. mprotect does this, madvise does this..
probably.. it's ugly.

> > > > (oldflags & VM_SPECIAL) ||

It's special, so merging will always fail. I don't know about splitting,
but I guess we don't want to alter the mlock state on special mappings.

> > > > is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
> > > > vma_is_dax(vma) || vma_is_secretmem(vma))
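
To make the whole condition easier to read in one piece, here is a rough
userspace model of it. This is only a sketch: the struct, the boolean
fields and the FILT_* bit values are made up for illustration and are not
the kernel's definitions. The booleans stand in for is_vm_hugetlb_page(),
the get_gate_vma() comparison, vma_is_dax() and vma_is_secretmem().

```c
#include <stdbool.h>

/* Hypothetical bits for this model only; not the kernel's values. */
#define FILT_LOCKED  0x1u
#define FILT_SPECIAL 0x2u	/* stands in for the VM_SPECIAL mask */

/* Hypothetical stand-in for the parts of a VMA the check looks at. */
struct filt_vma {
	unsigned int flags;
	bool is_hugetlb;
	bool is_gate;
	bool is_dax;
	bool is_secretmem;
};

/*
 * Mirrors the shape of the quoted condition: returns true when the
 * mlock state of this VMA should be left alone.
 */
static bool filt_skip_mlock(const struct filt_vma *vma, unsigned int newflags)
{
	return newflags == vma->flags ||	/* nothing to change */
	       (vma->flags & FILT_SPECIAL) ||	/* special mapping */
	       vma->is_hugetlb || vma->is_gate ||
	       vma->is_dax || vma->is_secretmem;
}
```

With this model, a plain VMA going from unlocked to FILT_LOCKED is not
skipped, while the same request on a VMA with any of the filter
conditions set is.
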
> >
> > The special handling you mentioned in mlock_fixup mostly makes sense to me.
> >
> > E.g., I think we can just ignore mlock a hugetlb page if it won't be
> > swapped anyway.
> >
> > Do you encounter any issue with above?
> >
> > > > Should there be a common function to handle VMA merge/split ?
> >
> > IMHO vma_merge() and split_vma() are the "common functions". Copy Lorenzo
> > as I think he has plan to look into the interface to make it even easier to
> > use.
> >
> The mprotect_fixup doesn't have the same check as mlock_fixup. When
> userspace calls mlock(), two VMAs might not merge or split because of
> vma_is_secretmem check, However, when user space calls mprotect() with
> the same address range, it will merge/split. If mlock() is doing the
> right thing to merge/split the VMAs, then mprotect() is not ?

It looks like secretmem is mlock'ed to begin with so they don't want it
to be touched. So, I think they will be treated differently and I think
it is correct.

Although, it would have been nice to have the comment above the function
kept up to date on why certain VMAs are filtered out.
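
As a sketch of the difference being discussed, the toy model below has an
mlock-style path that filters secretmem and an mprotect-style path that
does not. Everything in it (struct path_vma, the PATH_* bits, both
functions) is hypothetical and only models what this thread describes; it
is not the kernel code.

```c
#include <stdbool.h>

/* Hypothetical bits for this model only; not the kernel's values. */
#define PATH_LOCKED 0x1u
#define PATH_WRITE  0x2u

struct path_vma {
	unsigned int flags;
	bool is_secretmem;
};

/* mlock-style path: a filtered VMA keeps its flags. Returns true if changed. */
static bool path_mlock(struct path_vma *vma, unsigned int newflags)
{
	if (newflags == vma->flags || vma->is_secretmem)
		return false;	/* flags are left untouched */
	vma->flags = newflags;
	return true;
}

/* mprotect-style path: no secretmem filter in this model. */
static bool path_mprotect(struct path_vma *vma, unsigned int newflags)
{
	if (newflags == vma->flags)
		return false;
	vma->flags = newflags;
	return true;
}
```

In this model a secretmem VMA that starts out with PATH_LOCKED set stays
locked when the mlock-style path is asked to clear it, but a protection
change through the mprotect-style path still goes through.
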

>
> Also skipping merge of VMA might be OK, but skipping split doesn't,
> wouldn't that cause inconsistent between vma->vm_flags and what is
> provisioned in the page ?

I don't quite follow what you mean. It seems like mlock_fixup() is
skipped when we don't want the flag to be altered on a particular VMA.
Where do they get out of sync?