Re: [PATCH v10 3/3] mm: fix double page fault on arm64 if PTE_AF is cleared

From: Will Deacon
Date: Tue Oct 01 2019 - 08:54:22 EST


On Mon, Sep 30, 2019 at 09:57:40AM +0800, Jia He wrote:
> When we tested pmdk unit test [1] vmmalloc_fork TEST1 in arm64 guest, there
> will be a double page fault in __copy_from_user_inatomic of cow_user_page.
>
> Below call trace is from arm64 do_page_fault for debugging purpose
> [ 110.016195] Call trace:
> [ 110.016826] do_page_fault+0x5a4/0x690
> [ 110.017812] do_mem_abort+0x50/0xb0
> [ 110.018726] el1_da+0x20/0xc4
> [ 110.019492] __arch_copy_from_user+0x180/0x280
> [ 110.020646] do_wp_page+0xb0/0x860
> [ 110.021517] __handle_mm_fault+0x994/0x1338
> [ 110.022606] handle_mm_fault+0xe8/0x180
> [ 110.023584] do_page_fault+0x240/0x690
> [ 110.024535] do_mem_abort+0x50/0xb0
> [ 110.025423] el0_da+0x20/0x24
>
> The pte info before __copy_from_user_inatomic is (PTE_AF is cleared):
> [ffff9b007000] pgd=000000023d4f8003, pud=000000023da9b003, pmd=000000023d4b3003, pte=360000298607bd3
>
> As told by Catalin: "On arm64 without hardware Access Flag, copying from
> user will fail because the pte is old and cannot be marked young. So we
> always end up with zeroed page after fork() + CoW for pfn mappings. we
> don't always have a hardware-managed access flag on arm64."
>
> This patch fix it by calling pte_mkyoung. Also, the parameter is
> changed because vmf should be passed to cow_user_page()
>
> Add a WARN_ON_ONCE when __copy_from_user_inatomic() returns error
> in case there can be some obscure use-case.(by Kirill)
>
> [1] https://github.com/pmem/pmdk/tree/master/src/test/vmmalloc_fork
>
> Signed-off-by: Jia He <justin.he@xxxxxxx>
> Reported-by: Yibo Cai <Yibo.Cai@xxxxxxx>
> Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> ---
> mm/memory.c | 99 +++++++++++++++++++++++++++++++++++++++++++++--------
> 1 file changed, 84 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index b1ca51a079f2..1f56b0118ef5 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -118,6 +118,13 @@ int randomize_va_space __read_mostly =
> 2;
> #endif
>
> +#ifndef arch_faults_on_old_pte
> +static inline bool arch_faults_on_old_pte(void)
> +{
> + return false;
> +}
> +#endif

Kirill has acked this, so I'm happy to take the patch as-is, however isn't
it the case that /most/ architectures will want to return true for
arch_faults_on_old_pte()? In which case, wouldn't it make more sense for
that to be the default, and have x86 and arm64 provide an override? For
example, aren't most architectures still going to hit the double fault
scenario even with your patch applied?

Will