Re: [PATCH] lazy freeing of memory through MADV_FREE

From: Nick Piggin
Date: Mon Apr 23 2007 - 06:36:13 EST


Rik van Riel wrote:
Nick Piggin wrote:

It looks like the tlb flushes (and IPIs) from zap_pte_range()
could have been the problem. They're gone now.


I guess it is a good idea to batch these things. But can you
do that on all architectures? What happens if your tlb flush
happens after another thread already accesses it again, or
after it subsequently gets removed from the address space via
another CPU?


I have thought about this a lot tonight, and have come to the conclusion
that they are ok.

The reason is simple:

1) we do the TLB flush before we return from the
madvise(MADV_FREE) syscall.

2) anything that accessess the pages between the start
and end of the MADV_FREE procedure does not know in
which order we go through the pages, so it could hit
a page either before or after we get to processing
it

3) because of this, we can treat any such accesses as
happening simultaneously with the MADV_FREE and
as illegal, aka undefined behaviour territory and
we do not need to worry about them

Yes, but I'm wondering if it is legal in all architectures.


4) because we flush the tlb before releasing the page
table lock, other CPUs cannot remove this page from
the address space - they will block on the page
table lock before looking at this pte

We don't when the ptl is split.

What the tlb flush used to be able to assume is that the page
has been removed from the pagetables when they are put in the
tlb flush batch.

I'm not saying there is any bugs, but just suggesting there
might be.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/