Re: [RFC][PATCH 3/6] mm: VMA sequence count

From: Kirill A. Shutemov
Date: Thu Oct 23 2014 - 11:06:13 EST


On Thu, Oct 23, 2014 at 04:22:24PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 23, 2014 at 03:36:16PM +0300, Kirill A. Shutemov wrote:
> > On Wed, Oct 22, 2014 at 03:44:16PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 22, 2014 at 02:15:54PM +0200, Peter Zijlstra wrote:
> > > > On Wed, Oct 22, 2014 at 02:53:04PM +0300, Kirill A. Shutemov wrote:
> > > > > Em, no. In this case change_protection() will not touch the pte, since
> > > > > it's pte_none() and the pte_same() check will pass just fine.
> > > >
> > > > Oh, that's what you meant. Yes that's a problem, yes vm_page_prot
> > > > needs wrapping too.
> > >
> > > Maybe also vm_policy, is there anything else that can change while a vma
> > > lives?
> >
> > - vm_flags, obviously;
>
> Do those ever change?

The flags which can change (probably incomplete):

- prot-related: VM_READ, VM_WRITE, VM_EXEC -- mprotect();
- VM_LOCKED -- mlock();
- VM_SEQ_READ, VM_RAND_READ, VM_DONTCOPY, VM_DONTDUMP, VM_HUGEPAGE,
  VM_NOHUGEPAGE, VM_MERGEABLE -- madvise();
- VM_SOFTDIRTY -- through procfs.
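
For reference, a minimal userspace sketch of how the first three groups get
changed on a VMA that is already live. The helper name
change_flags_on_live_vma() is made up for illustration; only the system calls
are real. An mlock() failure is merely reported, since it depends on
RLIMIT_MEMLOCK.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, then change its VMA flags
 * while the mapping stays in place. Returns 0 if mmap(), mprotect() and
 * madvise() all succeeded, -1 otherwise.
 */
int change_flags_on_live_vma(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	void *p;
	int ret = -1;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* mprotect(): drops VM_WRITE from the existing VMA. */
	if (mprotect(p, len, PROT_READ))
		goto out;

	/* mlock()/munlock(): set and clear VM_LOCKED. Failure is tolerated. */
	if (mlock(p, len))
		perror("mlock");
	else
		munlock(p, len);

	/* madvise(): MADV_SEQUENTIAL sets VM_SEQ_READ. */
	if (madvise(p, len, MADV_SEQUENTIAL))
		goto out;

	ret = 0;
out:
	munmap(p, len);
	return ret;
}
```

Any of these calls can race with a fault on the same range when issued from
another thread.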

> The only thing that jumps out is the VM_LOCKED thing and that should not
> really matter one way or the other, but sure can do.

I would not be so sure about VM_LOCKED. Consider a munlock() vs. write
fault race:

static int do_wp_page(struct fault_env *fe)
	__releases(ptl)
{
	...
err:
	if (old_page) {
		/*
		 * Don't let another task, with possibly unlocked vma,
		 * keep the mlocked page.
		 */
		if ((ret & VM_FAULT_WRITE) && (fe->vma->vm_flags & VM_LOCKED)) {
			lock_page(old_page);	/* LRU manipulation */
			munlock_vma_page(old_page);
			unlock_page(old_page);
		}
		page_cache_release(old_page);
	}
	return ret;
	...
}

If I understand correctly, the page can leak out still mlocked.

Some other flags can be problematic too.
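
To make the idea concrete, here is a minimal userspace model of a flags read
wrapped in a sequence count, written with C11 atomics. It is not code from the
patch: struct vma_model, the vma_*() helpers and MODEL_VM_LOCKED are all
invented for illustration, and the sketch assumes writers are already
serialized by some outer lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_VM_LOCKED 0x2000UL	/* illustrative flag bit */

/* Hypothetical stand-in for the fields of interest in a VMA. */
struct vma_model {
	atomic_uint seq;	/* odd while a writer is mid-update */
	atomic_ulong flags;
};

void vma_model_init(struct vma_model *v, unsigned long flags)
{
	atomic_init(&v->seq, 0);
	atomic_init(&v->flags, flags);
}

/* Reader: wait for an even count and remember it. */
unsigned vma_read_begin(struct vma_model *v)
{
	unsigned s;

	while ((s = atomic_load_explicit(&v->seq, memory_order_acquire)) & 1)
		;	/* writer in progress */
	return s;
}

unsigned long vma_read_flags(struct vma_model *v)
{
	return atomic_load_explicit(&v->flags, memory_order_relaxed);
}

/* Reader: true if the count moved, i.e. the values read may be stale. */
bool vma_read_retry(struct vma_model *v, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&v->seq, memory_order_relaxed) != start;
}

/* Writer (assumed serialized): bump to odd, update, bump to even. */
void vma_write_flags(struct vma_model *v, unsigned long set,
		     unsigned long clear)
{
	unsigned s = atomic_load_explicit(&v->seq, memory_order_relaxed);
	unsigned long f = atomic_load_explicit(&v->flags, memory_order_relaxed);

	atomic_store_explicit(&v->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&v->flags, (f | set) & ~clear,
			      memory_order_relaxed);
	atomic_store_explicit(&v->seq, s + 2, memory_order_release);
}

/* Reader loop: retry until a consistent snapshot of the flags is seen. */
unsigned long vma_flags_snapshot(struct vma_model *v)
{
	unsigned s;
	unsigned long f;

	do {
		s = vma_read_begin(v);
		f = vma_read_flags(v);
	} while (vma_read_retry(v, s));
	return f;
}
```

In this model, a fault path that sampled the flags before a munlock()-style
update gets true from vma_read_retry() and can back off instead of acting on
the stale VM_LOCKED value.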

> In any case, yes I'll go include them.

I hope it will not hurt single-threaded workloads even more. :-/

--
Kirill A. Shutemov