Re: [PATCH v4] mm: per-thread vma caching

From: Oleg Nesterov
Date: Sun Mar 09 2014 - 08:58:49 EST


On 03/08, Linus Torvalds wrote:
>
> On Sat, Mar 8, 2014 at 11:44 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > Sure. But another thread or CLONE_VM task can do vmacache_invalidate(),
> > hit vmacache_seqnum == 0 and call vmacache_flush_all() to solve the
> > problem with potential overflow.
>
> How?
>
> Any invalidation is supposed to hold the mm semaphore for writing.

Yes,

> And
> we should have it for reading.

No, dup_task_struct() is obviously lockless. And the new child is not yet
visible to for_each_process_thread().

clone(CLONE_VM) can create a thread with the corrupted vmacache.




OK. Suppose we have a task T1 which has a valid vmacache,
T1->vmacache_seqnum == T1->mm->vmacache_seqnum == 0. Suppose it sleeps a lot.

Suppose that its subthread T2 does a lot of munmap's, finally mm->vmacache_seqnum
wraps around to zero again and T2 calls vmacache_flush_all().

T1 wakes up and does clone(CLONE_VM). The new thread T3 gets the copy
of T1's ->vmacache_seqnum and ->vmacache[].

T2 continues, vmacache_flush_all() finds T1 and does vmacache_flush(T1).

But the new thread T3 is not on the list yet, vmacache_flush_all() can't
find it.

So T3 will run with vmacache_valid() == true (till the next invalidate(mm)
of course) but its ->vmacache[] points to nowhere.
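
The interleaving above can be replayed in a small userspace model. This is
only a sketch, not the kernel code: struct mm, struct task, the threads[]
array and simulate_clone_race() are hypothetical stand-ins, vmacache_valid()
is reduced to the bare seqnum comparison, and the struct assignment plays the
role of the lockless copy in dup_task_struct(). The copy is placed before the
whole flush rather than in the middle of it, which gives the same result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VMACACHE_SIZE 4

struct vma { int id; };			/* stand-in for vm_area_struct */
struct mm { uint32_t vmacache_seqnum; };

struct task {				/* hypothetical, not task_struct */
	struct mm *mm;
	uint32_t vmacache_seqnum;
	struct vma *vmacache[VMACACHE_SIZE];
};

static void vmacache_flush(struct task *t)
{
	for (size_t i = 0; i < VMACACHE_SIZE; i++)
		t->vmacache[i] = NULL;
}

/* Flushes only the threads that are already on the (toy) thread list. */
static void vmacache_flush_all(struct task **threads, size_t n,
			       const struct mm *mm)
{
	for (size_t i = 0; i < n; i++)
		if (threads[i]->mm == mm)
			vmacache_flush(threads[i]);
}

/* Simplified: the cache is trusted whenever the sequence numbers match. */
static bool vmacache_valid(const struct task *t)
{
	return t->vmacache_seqnum == t->mm->vmacache_seqnum;
}

/* Returns true if T3 ends up with a "valid" cache holding a stale vma. */
bool simulate_clone_race(void)
{
	struct mm mm = { .vmacache_seqnum = 0 };
	struct vma old_vma = { .id = 1 };
	struct task t1 = { .mm = &mm, .vmacache_seqnum = 0 };
	struct task t2 = { .mm = &mm, .vmacache_seqnum = 0 };
	struct task t3;
	struct task *threads[3];
	size_t n = 0;

	threads[n++] = &t1;
	threads[n++] = &t2;
	t1.vmacache[0] = &old_vma;	/* T1 caches a vma, then sleeps */

	/* T2 unmaps until the counter overflows back to zero. */
	mm.vmacache_seqnum = UINT32_MAX;
	mm.vmacache_seqnum++;		/* now 0, old_vma is long gone */

	t3 = t1;			/* T1 clones: lockless copy */

	vmacache_flush_all(threads, n, &mm);	/* T2 cannot see T3 yet */
	assert(t1.vmacache[0] == NULL);		/* T1 itself was flushed */

	threads[n++] = &t3;		/* only now does T3 become visible */

	return vmacache_valid(&t3) && t3.vmacache[0] == &old_vma;
}
```

In this model simulate_clone_race() returns true: T1 gets flushed, but T3
keeps the stale pointer and its seqnum still matches the mm.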

Oleg.
