Re: [PATCH v2] mm: do not drop unused pages when userfaultfd is running

From: Andrew Morton
Date: Mon Jul 02 2018 - 17:06:46 EST


On Mon, 2 Jul 2018 09:50:49 +0200 Christian Borntraeger <borntraeger@xxxxxxxxxx> wrote:

> KVM guests on s390 can notify the host of unused pages. This can result
> in pte_unused() returning true for KVM guest memory.
>
> If a page is unused (checked with pte_unused) we might drop this page
> instead of paging it out. This can have side effects on userfaultfd when
> the page in question was already migrated:
>
> The next access to that page will trigger a fault and a userfault
> instead of faulting in a new, empty zero page. As QEMU does not
> expect a userfault on an already migrated page, this migration will fail.
>
> The most straightforward solution is to ignore the pte_unused hint if a
> userfault context is active for this VMA.
>
> ...
>
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -64,6 +64,7 @@
> #include <linux/backing-dev.h>
> #include <linux/page_idle.h>
> #include <linux/memremap.h>
> +#include <linux/userfaultfd_k.h>
>
> #include <asm/tlbflush.h>
>
> @@ -1481,7 +1482,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> set_pte_at(mm, address, pvmw.pte, pteval);
> }
>
> - } else if (pte_unused(pteval)) {
> + } else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
> /*
> * The guest indicated that the page content is of no
> * interest anymore. Simply discard the pte, vmscan

A reader of this code will wonder why we're checking
userfaultfd_armed(). So the writer of this code should add a comment
which explains this to them ;) Please.
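
Something along these lines might do. This is only a rough userspace
sketch of the patched condition with the kind of comment being asked
for. The stub types and should_discard_unused_pte() are hypothetical
stand-ins made up for illustration, not kernel API, and the comment
wording is just a suggestion based on the changelog above:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
struct vma_stub {
	bool uffd_armed;	/* models userfaultfd_armed(vma) */
};

struct pte_stub {
	bool unused;		/* models pte_unused(pteval) */
};

/*
 * Hypothetical helper mirroring the patched branch condition in
 * try_to_unmap_one().
 *
 * The guest indicated that the page content is of no interest
 * anymore, so normally the pte can simply be discarded and a later
 * access faults in a new zero page.
 *
 * If a userfaultfd context is active for this VMA we must not do
 * that: the next access would raise a userfault, and (per the
 * changelog) QEMU does not expect a userfault on a page that was
 * already migrated, so the migration would fail.
 */
bool should_discard_unused_pte(struct pte_stub pte, struct vma_stub vma)
{
	return pte.unused && !vma.uffd_armed;
}
```

In the real patch the same text would sit in the existing comment block
under the "else if", next to the userfaultfd_armed() check.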