Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)

From: Matthew Wilcox
Date: Sun Dec 19 2021 - 16:48:12 EST


On Sun, Dec 19, 2021 at 01:27:07PM -0800, Linus Torvalds wrote:
> On Sun, Dec 19, 2021 at 1:12 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > Can we get rid of ->mapcount altogether? Three states:
> > - Not mapped
> > - Mapped exactly once
> > - Possibly mapped more than once
>
> I don't think even that is useful. We should get rid of mapcount entirely.
>
> It doesn't actually help to know "mapped exactly once", exactly
> because even when that's true, there may be non-mapped references to
> the page.
>
> Mapped references really aren't that special in general.
>
> One case where it *can* be special is on virtually indexed cache
> architectures, where "is this mapped anywhere else" can be an issue
> for cache flushing.
>
> There the page_mapcount() can actually really matter, but it's such an
> odd case that I'm not convinced it should be something the kernel VM
> code should bend over backwards for.
>
> And the count could be useful for 'rmap' operations, where you can
> stop walking the rmap once you've found all mapped cases (page
> migration being one case of this). But again, I'm not convinced the
> serialization is worth it, or that it's a noticeable win.
>
> However, I'm not 100% convinced it's worth it even there, and I'm not
> sure we necessarily use it there.
>
> So in general, I think page_mapcount() can be useful as a count for
> those things that are _literally_ about "where is this page mapped".
> Page migration, virtual cache coherency, things like that can
> literally be about "how many different virtual mappings can we find".
>
> It's just that pages can have a number of non-mapped users too, so
> mapcount isn't all that meaningful in general.
>
> And you can look it up with rmap too, and so I do think it would be
> worth discussing whether we really should strive to maintain
> 'mapcount' at all.

Yes, agreed, I was thinking that we could use "not mapped at all"
as an optimisation to avoid doing rmap walks, e.g. in __unmap_and_move().

Perhaps more interestingly in truncate_cleanup_page():
	if (page_mapped(page))
		unmap_mapping_page(page);
where we can skip the i_mmap rbtree walk if we know the page isn't
mapped. I'd be willing to give up that optimisation if we had "this
page was never mapped" (i.e. if page_mapped() was allowed to return
false positives).
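
To make the "false positives allowed" idea concrete, here is a rough
user-space model, not kernel code. Every name in it (model_page,
ever_mapped, model_maybe_mapped() and so on) is hypothetical and made up
for illustration; the walk counter stands in for the cost of the i_mmap
walk. The only point is that a sticky "was ever mapped" bit still lets
truncate skip the walk for pages that were never mapped, at the price of
an unnecessary walk for pages that were mapped and later unmapped.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_page {
	bool ever_mapped;	/* sticky: set on first map, never cleared on unmap */
	int nr_mappings;	/* ground truth, kept only so the model can be checked */
};

/* Counts how many (expensive) rmap-style walks the model has performed. */
static int rmap_walks;

static void model_map(struct model_page *p)
{
	p->ever_mapped = true;
	p->nr_mappings++;
}

static void model_unmap(struct model_page *p)
{
	if (p->nr_mappings > 0)
		p->nr_mappings--;
	/* ever_mapped is deliberately left set: no exact count is maintained */
}

/*
 * May return true for a page that is no longer mapped (false positive),
 * but never returns false for a page that is currently mapped.
 */
static bool model_maybe_mapped(const struct model_page *p)
{
	return p->ever_mapped;
}

/* Stand-in for the walk that tears down all mappings of the page. */
static void model_unmap_all(struct model_page *p)
{
	rmap_walks++;
	p->nr_mappings = 0;
}

/* Mirrors the shape of the truncate check quoted above. */
static void model_truncate_cleanup(struct model_page *p)
{
	if (model_maybe_mapped(p))
		model_unmap_all(p);
}
```

In this model a never-mapped page costs no walk, a mapped page costs one
walk as it does today, and a mapped-then-unmapped page costs one walk
that an exact page_mapped() would have avoided.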