Re: [PATCH 2/4] Add replace_page(), change the mapping of pte fromone page into another

From: Andrea Arcangeli
Date: Wed Nov 12 2008 - 21:01:25 EST


On Wed, Nov 12, 2008 at 05:09:03PM -0500, Lee Schermerhorn wrote:
> Maybe not so wild, given the complexity of these interactions...

Perhaps Christoph's right it's just wild ideas, but see below.

You both seem to agree the first theory of the tree_lock is bogus
as it's lockless for find_get_page.

The second theory of why it shouldn't happen thanks to the refcount
freezing is bogus too...

CPU0 migrate.c CPU1 filemap.c
------- ----------
find_get_page
radix_tree_lookup_slot returns the oldpage
page_count still = expected_count
freeze_ref (oldpage->count = 0)
radix_tree_replace (too late, other side already got the oldpage)
unfreeze_ref (oldpage->count = 2)
page_cache_get_speculative(old_page)
set count to 3 and succeeds

Admittedly I couldn't understand what the freeze_ref was about, I
thought it was something related to the radix tree internals (which I
didn't check as they weren't relevant at that point besides being
lockless) as there was nothing inside that freeze/unfreeze critical
section that could affect find_get_page, so I ignored it. If if was
meant to stop find_get_page to get the oldpage it clearly isn't
working.

Perhaps I'm still missing something...

If I'm right my suggested fix is to simply change the
remove_migration_ptes to set the pte to point to the swapcache,
instead of leaving the swapentry in it. That will make do_swap_page
bailout like every other path in memory.c in the pte_same check, and
additionally it'll avoid an unnecessary minor fault.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/