Re: [RFC PATCH] s390: mm: rmap: Transfer storage key to struct page under the page lock

From: Mel Gorman
Date: Tue Apr 17 2012 - 08:29:26 EST


On Mon, Apr 16, 2012 at 05:50:40PM +0200, Martin Schwidefsky wrote:
> On Mon, 16 Apr 2012 15:14:23 +0100
> Mel Gorman <mgorman@xxxxxxx> wrote:
>
> > This patch is horribly ugly and there has to be a better way of doing
> > it. I'm looking for suggestions on what s390 can do here that is not
> > painful or broken.
> >
> > However, s390 needs a better way of guarding against
> > PageSwapCache pages being removed from the radix tree while set_page_dirty()
> > is being called. The patch would be marginally better if in the PageSwapCache
> > case we simply tried to lock once and in the contended case just fail to
> > propagate the storage key. I lack familiarity with the s390 architecture
> > to be certain if this is safe or not. Suggestions on a better fix?
>
> One thought that crossed my mind is that maybe a better approach would be
> to move the page_test_and_clear_dirty check out of page_remove_rmap.
> What we need to look out for are code sequences of the form:
>
> 	if (pte_dirty(pte))
> 		set_page_dirty(page);
> 	...
> 	page_remove_rmap(page);
>
> There are four of those as far as I can see: in try_to_unmap_one,
> try_to_unmap_cluster, zap_pte, and zap_pte_range.
>
> A valid implementation for s390 would be to test and clear the changed
> bit in the storage key for each of those pte_dirty() calls.
>
> 	if (pte_dirty(pte) || page_test_and_clear_dirty(page))
> 		set_page_dirty(page);
> 	...
> 	page_remove_rmap(page); /* w/o page_test_and_clear_dirty */
>

In the zap_pte_range() case at least, pte_dirty() is only checked for
!PageAnon pages, so with this approach we would miss PageSwapCache
pages. If we added the check for anonymous pages as well, we would hit
the same problem again and s390 would need additional logic there to
drop the PTL, take the page lock and retry the operation. It'd still be
ugly :(
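
To make the concern concrete, here is a rough user-space sketch, not kernel
code. The toy_* names and the struct are invented for illustration and only
model the control flow as I read it: the dirty test moves next to
pte_dirty(), but that branch is never taken for anonymous pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the real kernel layout. */
struct toy_page {
	bool anon;		/* models PageAnon(page) */
	bool dirty;		/* models the result of set_page_dirty(page) */
	bool skey_changed;	/* models the s390 storage key changed bit */
};

/* Models page_test_and_clear_dirty(): report the changed bit and clear it. */
static bool toy_test_and_clear_dirty(struct toy_page *page)
{
	bool was_changed = page->skey_changed;

	page->skey_changed = false;
	return was_changed;
}

/*
 * Models the zap path with the proposed change applied. The dirty check
 * sits inside the !anon branch, so an anonymous (possibly swap cache)
 * page never has its storage key transferred here.
 */
static void toy_zap_one(struct toy_page *page, bool pte_is_dirty)
{
	assert(page != NULL);

	if (!page->anon) {
		if (pte_is_dirty || toy_test_and_clear_dirty(page))
			page->dirty = true;
	}
	/* page_remove_rmap(page) would follow, without the storage key test. */
}
```

In this model a file-backed page whose changed bit is set ends up dirty,
while an anonymous page in the same state leaves toy_zap_one() with dirty
still false and the changed bit still set, which is the case that would be
missed.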

--
Mel Gorman
SUSE Labs