Re: [PATCH 02/11] mm,migration: Do not try to migrate unmapped anonymous pages

From: Minchan Kim
Date: Mon Mar 15 2010 - 03:11:38 EST


On Mon, Mar 15, 2010 at 3:44 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@xxxxxxxxxxxxxx>
>> Thanks for the detailed explanation, Kame.
>> But I still don't fully understand it, sorry.
>>
>> Mel said he hit "use-after-free errors in anon_vma",
>> so he added this check in unmap_and_move:
>>
>> if (PageAnon(page)) {
>>         ....
>>         if (!page_mapcount(page))
>>                 goto uncharge;
>>         rcu_read_lock();
>>
>> My concern is: what protects against a racy mapcount of the page?
>> For example,
>>
>> CPU A                          CPU B
>> unmap_and_move
>> page_mapcount check pass       zap_pte_range
>> <-- some stall -->             pte_lock
>> <-- some stall -->             page_remove_rmap (mapcount is zero!)
>> <-- some stall -->             pte_unlock
>> <-- some stall -->             anon_vma_unlink
>> <-- some stall -->             anon_vma free !!!!
>> rcu_read_lock
>> anon_vma has gone!!
>>
>> I think the above scenario causes the "use-after-free" error again.
>> What prevents it?
>>
> I think this patch is not complete.
> I guess this hunk in [1/11] is the trigger for the race.
> ==
> +
> +        /* Drop an anon_vma reference if we took one */
> +        if (anon_vma && atomic_dec_and_lock(&anon_vma->migrate_refcount, &anon_vma->lock)) {
> +                int empty = list_empty(&anon_vma->head);
> +                spin_unlock(&anon_vma->lock);
> +                if (empty)
> +                        anon_vma_free(anon_vma);
> +        }
> ==
> If my understanding of the above is correct, this "modifies" an already
> freed anon_vma. Then use-after-free happens. (In the old implementation
> there was no refcnt, so there were no use-after-free operations.)
>

I agree.
Let's wait for Mel's response.
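
To make the interleaving I described above concrete, here is a minimal
userspace sketch that replays it step by step in a single thread. It is
not kernel code: sim_page, sim_anon_vma and replay_race() are
hypothetical names, and the "free" is modelled by a flag so the sketch
itself never touches freed memory.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for struct anon_vma and struct page. */
struct sim_anon_vma {
        bool freed;             /* set instead of really freeing */
};

struct sim_page {
        int mapcount;
        struct sim_anon_vma *anon_vma;
};

/* CPU A, step 1: the check added to unmap_and_move(). */
static bool cpu_a_mapcount_check(const struct sim_page *page)
{
        return page->mapcount != 0;
}

/* CPU B: zap_pte_range -> page_remove_rmap -> anon_vma_unlink -> free. */
static void cpu_b_zap_and_free(struct sim_page *page)
{
        page->mapcount--;                       /* page_remove_rmap */
        if (page->mapcount == 0)
                page->anon_vma->freed = true;   /* last mapping gone */
}

/*
 * Replays the interleaving for a page that starts with the given
 * mapcount. Returns true if CPU A would go on to use an anon_vma
 * that CPU B has already freed.
 */
bool replay_race(int initial_mapcount)
{
        struct sim_anon_vma av = { .freed = false };
        struct sim_page page = { .mapcount = initial_mapcount, .anon_vma = &av };

        if (!cpu_a_mapcount_check(&page))
                return false;           /* A bails out (goto uncharge) */

        cpu_b_zap_and_free(&page);      /* B runs during A's stall */

        return page.anon_vma->freed;    /* what A finds at rcu_read_lock */
}
```

With an initial mapcount of 1 the check passes and A still ends up with
a freed anon_vma, so the check alone narrows the window but does not
close it.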

>
> So, what I can think of now is that a patch like the following is necessary.
>
> ==
> static inline struct anon_vma *anon_vma_alloc(void)
> {
>         struct anon_vma *anon_vma;
>
>         anon_vma = kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
>         if (anon_vma)
>                 atomic_set(&anon_vma->refcnt, 1);
>         return anon_vma;
> }
>
> void anon_vma_free(struct anon_vma *anon_vma)
> {
>         /*
>          * This is called when anon_vma is..
>          * - anon_vma->vma_list becomes empty.
>          * - a refcnt incremented for migration, ksm etc.. is dropped.
>          * - allocated but unused.
>          */
>         if (atomic_dec_and_test(&anon_vma->refcnt))
>                 kmem_cache_free(anon_vma_cachep, anon_vma);
> }
> ==
> Then everything becomes simple.
> Overhead is a concern, but list_empty() helps us a lot.

When they made things complicated without atomic ops,
there was a good reason for it, I think. :)

My opinion depends on you and the server guys (Hugh, Rik, Andrea Arcangeli and so on).
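
For discussion, the refcount-from-allocation idea can be sketched in
userspace roughly like this. It is only an illustration under stated
assumptions: rc_anon_vma and its helpers are hypothetical names, C11
atomics stand in for the kernel's atomic_t, and malloc/free stand in for
kmem_cache_alloc/kmem_cache_free. It does not cover how a reference is
taken safely in the first place, which is the hard part in the kernel.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct anon_vma. */
struct rc_anon_vma {
        atomic_int refcnt;
};

/* The object starts with one reference, owned by the allocator's caller. */
struct rc_anon_vma *rc_anon_vma_alloc(void)
{
        struct rc_anon_vma *av = malloc(sizeof(*av));

        if (av)
                atomic_init(&av->refcnt, 1);
        return av;
}

/* Take an extra reference, e.g. for the duration of a migration. */
void rc_anon_vma_get(struct rc_anon_vma *av)
{
        atomic_fetch_add(&av->refcnt, 1);
}

/* Drop a reference; returns 1 if this call freed the object. */
int rc_anon_vma_put(struct rc_anon_vma *av)
{
        /* atomic_fetch_sub returns the value before the decrement. */
        if (atomic_fetch_sub(&av->refcnt, 1) == 1) {
                free(av);
                return 1;
        }
        return 0;
}

/*
 * Models "the last VMA is unlinked while migration holds a reference".
 * Returns 1 if the object survived the unlink and was freed only when
 * the migration reference was dropped, 0 otherwise, -1 on allocation
 * failure.
 */
int migration_ref_outlives_unlink(void)
{
        struct rc_anon_vma *av = rc_anon_vma_alloc();
        int freed_by_unlink, freed_by_migration;

        if (!av)
                return -1;
        rc_anon_vma_get(av);                            /* migration's ref */
        freed_by_unlink = rc_anon_vma_put(av);          /* unlink drops ref */
        freed_by_migration = rc_anon_vma_put(av);       /* migration done */
        return !freed_by_unlink && freed_by_migration;
}
```

The point of the sketch is only that whoever drops the last reference
does the free, so the order of unlink and migration no longer matters.
The cost is one atomic operation per get/put, which is the overhead
concern mentioned above.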


>
> Thanks,
> -Kame



--
Kind regards,
Minchan Kim