Re: [patch 0/4] [RFC] MMU Notifiers V1

From: Andrea Arcangeli
Date: Fri Jan 25 2008 - 06:42:59 EST


On Thu, Jan 24, 2008 at 09:56:06PM -0800, Christoph Lameter wrote:
> Andrea's mmu_notifier #4 -> RFC V1
>
> - Merge subsystem rmap based with Linux rmap based approach
> - Move Linux rmap based notifiers out of macro
> - Try to account for what locks are held while the notifiers are
> called.
> - Develop a patch sequence that separates out the different types of
> hooks so that it is easier to review their use.
> - Avoid adding #include to linux/mm_types.h
> - Integrate RCU logic suggested by Peter.

I'm glad you're converging on something a bit saner and much much
closer to my code, plus perfectly usable by KVM optimal rmap design
too. It would have preferred if you would have sent me patches like
Peter did for review and merging etc... that would have made review
especially easier. Anyway I'm used to that on lkml so it's ok, I just
need this patch to be included in mainline, everything else is
irrelevant to me.

On a technical merit this still partially makes me sick and I think
it's the last issue to debate.

@@ -971,6 +974,9 @@ int try_to_unmap(struct page *page, int
else
ret = try_to_unmap_file(page, migration);

+ if (unlikely(PageExternalRmap(page)))
+ mmu_rmap_notifier(invalidate_page, page);
+
if (!page_mapped(page))
ret = SWAP_SUCCESS;
return ret;

I find the above hard to accept, because the moment you work with
physical pages and not "mm+address" I think you couldn't possibly care
if page_mapped is true or false, and I think the above notifier should
be called _outside_ try_to_unmap. Infact I'd call
mmu_rmap_notifier(invalidate_page, page); only if page_unmapped is
false and the linux pte is gone already (practically just before the
page_count == 2 check and after try_to_unmap).

I also think it's still worth to debate the rmap based on virtual or
physical index. By supporting both secondary-rmap designs at the same
time you seem to agree current KVM lightweight rmap implementation is
a superior design at least for KVM. But by insisting on your rmap
based on physical for your usage, you're implicitly telling us that is
a superior design for you. But we know very little of why you can't
exactly build rmap on virtual like KVM does! (especially now that you
implicitly admitted KVM rmap design is superior at least for KVM it'd
be interesting to know why you can't do the same exactly) You said
something on that, but I certainly don't have a clear picture of why
it can't work or why it would be less efficient.

Like you said by PM I'd also like comments from Hugh, Nick and others
about this issue.

Nevertheless I'm very glad we already fully converged on the
set_page_dirty, invalidate-page after ptep_clear_flush/young,
etc... and furthermore that you only made very minor modification to
my code to add a pair of hooks for the page-based rmap notifiers on
top of my patch. So from a functionality POV this is 100% workable
already from KVM side!

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/