Re: [PATCH] drop_caches: Allow unmapping pages

From: Matthew Wilcox
Date: Mon Jan 07 2019 - 09:15:50 EST


On Mon, Jan 07, 2019 at 02:02:39PM +0100, Vincent Whitchurch wrote:
> +++ b/Documentation/sysctl/vm.txt
> @@ -222,6 +222,10 @@ To increase the number of objects freed by this operation, the user may run
> number of dirty objects on the system and create more candidates to be
> dropped.
>
> +By default, pages which are currently mapped are not dropped from the
> +pagecache. If you want to unmap and drop these pages too, echo 9 or 11 instead
> +of 1 or 3 respectively (set bit 4).

Typically we number bits from 0, so this would be bit 3, not 4. I do see
elsewhere in this file somebody else got this wrong:

: with your system. To disable them, echo 4 (bit 3) into drop_caches.

but that should also be fixed.
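
To make the numbering concrete, here is a small userspace sketch of the values involved. The enum names are hypothetical and chosen only for illustration; they are not the kernel's identifiers. The only assumption is the usual convention that bit n has the value 1 << n.

```c
#include <stdio.h>

/* Value written to drop_caches when only bit n (counted from 0) is set. */
static unsigned int bit_value(unsigned int n)
{
	return 1u << n;
}

/* Hypothetical names for illustration only; not the kernel's identifiers. */
enum {
	TOY_DROP_BIT0 = 1 << 0,	/* echo 1 */
	TOY_DROP_BIT1 = 1 << 1,	/* echo 2 */
	TOY_DROP_BIT2 = 1 << 2,	/* echo 4: the value the existing text calls "bit 3" */
	TOY_DROP_BIT3 = 1 << 3,	/* echo 8: the new unmap flag in the quoted patch */
};

/* Print the value and zero-based bit number for each flag. */
static void print_bits(void)
{
	for (unsigned int n = 0; n < 4; n++)
		printf("bit %u -> echo %u\n", n, bit_value(n));
}
```

With this numbering, 9 is 1 | 8 and 11 is 3 | 8, so the new flag is bit 3, and the existing "echo 4" text refers to bit 2.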

> +static int __invalidate_inode_page(struct page *page, bool unmap)
> +{
> +	struct address_space *mapping = page_mapping(page);
> +	if (!mapping)
> +		return 0;
> +	if (PageDirty(page) || PageWriteback(page))
> +		return 0;
> +	if (page_mapped(page)) {
> +		if (!unmap)
> +			return 0;
> +		if (!try_to_unmap(page, TTU_IGNORE_ACCESS))
> +			return 0;

You're going to get data corruption doing this. try_to_unmap_one() does:

	/* Move the dirty bit to the page. Now the pte is gone. */
	if (pte_dirty(pteval))
		set_page_dirty(page);

so PageDirty() can be false above, but made true by calling try_to_unmap().
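
The ordering problem can be sketched with a toy userspace model. Everything here is an assumption made for illustration: struct toy_page and the toy_* functions are hypothetical stand-ins, not the kernel's struct page, try_to_unmap() or invalidate path. The model only captures the one behaviour described above, that unmapping moves a dirty bit from the pte to the page.

```c
#include <stdbool.h>

/* Toy model; NOT the kernel's struct page or pte. */
struct toy_page {
	bool dirty;	/* stands in for PageDirty() */
	bool mapped;	/* stands in for page_mapped() */
	bool pte_dirty;	/* dirty bit still held only in the pte */
};

/* Stands in for try_to_unmap(): the pte's dirty bit moves to the page. */
static void toy_unmap(struct toy_page *p)
{
	if (p->pte_dirty)
		p->dirty = true;
	p->pte_dirty = false;
	p->mapped = false;
}

/*
 * Order used in the quoted patch: dirty check first, unmap second.
 * Returns true if the page would be dropped.
 */
static bool toy_check_then_unmap(struct toy_page *p)
{
	if (p->dirty)
		return false;
	if (p->mapped)
		toy_unmap(p);
	return true;	/* dropped even if toy_unmap() just dirtied it */
}

/* Reversed order: unmap first, then test the dirty flag. */
static bool toy_unmap_then_check(struct toy_page *p)
{
	if (p->mapped)
		toy_unmap(p);
	return !p->dirty;
}
```

For a page that is clean but mapped through a dirty pte, toy_check_then_unmap() reports that the page is dropped while leaving it dirty, which is the corruption case. toy_unmap_then_check() refuses to drop the same page. The second function is there only to show that the ordering matters; it is not a proposed fix.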

I also think the way you've done this is expedient, but it costs
efficiency and introduces layering violations. I think you should first
tear down the mappings of userspace processes (which will also reclaim a
lot of pages allocated to page tables); then you won't need to touch the
invalidate_inode_pages paths at all.