Re: [PATCH 3/3] mm, thp: Do not loose dirty bit in __split_huge_pmd_locked()

From: Martin Schwidefsky
Date: Wed Jun 14 2017 - 10:19:16 EST


On Wed, 14 Jun 2017 16:51:43 +0300
"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:

> Until pmdp_invalidate() pmd entry is present and CPU can update it,
> setting dirty. Currently, we transfer dirty bit to page too early and
> there is window when we can miss dirty bit.
>
> Let's call SetPageDirty() after pmdp_invalidate().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> ...
> @@ -2046,6 +2043,14 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  	 * pmd_populate.
>  	 */
>  	pmdp_invalidate(vma, haddr, pmd);
> +
> +	/*
> +	 * Transfer dirty bit to page after pmd invalidated, so CPU would not
> +	 * be able to set it under us.
> +	 */
> +	if (pmd_dirty(*pmd))
> +		SetPageDirty(page);
> +
>  	pmd_populate(mm, pmd, pgtable);
>
>  	if (freeze) {

That won't work on s390. After pmdp_invalidate() the pmd entry is gone:
it has been replaced with _SEGMENT_ENTRY_EMPTY, so the dirty and
referenced bits are lost as well. The old scheme is:

	entry = *pmd;
	pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...

Could we change pmdp_invalidate() to make it return the old pmd entry?
The pmdp_xchg_direct() function already returns it, so for s390 that
would be an easy change. The above code snippet would change like this:

	entry = pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...
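
To illustrate why returning the old entry closes the window, here is a
minimal userspace sketch using C11 atomics. It is only a model, not
kernel code: the model_* names and the bit layout are hypothetical, and
ENTRY_EMPTY merely stands in for _SEGMENT_ENTRY_EMPTY. The point is that
a single atomic exchange both invalidates the entry and captures
whatever dirty state was set up to that instant.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; no real architecture. */
#define ENTRY_PRESENT 0x1u
#define ENTRY_DIRTY   0x2u
#define ENTRY_EMPTY   0x0u	/* stand-in for _SEGMENT_ENTRY_EMPTY */

typedef _Atomic uint32_t model_pmd_t;

/*
 * Models the CPU setting the dirty bit on a write access. It can only
 * do so while the entry is still present.
 */
static void model_hw_write(model_pmd_t *pmd)
{
	uint32_t old = atomic_load(pmd);

	/* On failure the CAS reloads 'old', so the present check is redone. */
	while ((old & ENTRY_PRESENT) &&
	       !atomic_compare_exchange_weak(pmd, &old, old | ENTRY_DIRTY))
		;
}

/*
 * Models the proposed pmdp_invalidate() behaviour: replace the entry
 * with the empty value and return what was there before, in one step.
 */
static uint32_t model_invalidate(model_pmd_t *pmd)
{
	return atomic_exchange(pmd, ENTRY_EMPTY);
}

/* Models pmd_dirty() on a saved copy of the entry. */
static bool model_entry_dirty(uint32_t entry)
{
	return (entry & ENTRY_DIRTY) != 0;
}
```

In this model, checking model_entry_dirty() on the value returned by
model_invalidate() sees any dirty bit set before the exchange, while
re-reading the entry afterwards sees only the empty value, which is the
s390 problem described above.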

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.