Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page

From: Alistair Popple
Date: Tue Aug 16 2022 - 21:54:47 EST



Peter Xu <peterx@xxxxxxxxxx> writes:

> On Tue, Aug 16, 2022 at 04:10:29PM +0800, huang ying wrote:
>> > @@ -193,11 +194,10 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>> > bool anon_exclusive;
>> > pte_t swp_pte;
>> >
>> > + flush_cache_page(vma, addr, pte_pfn(*ptep));
>> > + pte = ptep_clear_flush(vma, addr, ptep);
>>
>> Although I think it's possible to batch the TLB flushing just before
>> unlocking PTL. The current code looks correct.
>
> If we're with unconditionally ptep_clear_flush(), does it mean we should
> probably drop the "unmapped" and the last flush_tlb_range() already since
> they'll be redundant?

This patch does that, unless I missed something?

> If that'll need to be dropped, it looks indeed better to still keep the
> batch to me but just move it earlier (before unlock iiuc then it'll be
> safe), then we can keep using ptep_get_and_clear() afaiu but keep "pte"
> updated.

I think we would also need to check should_defer_flush(). Looking at
try_to_unmap_one() there is this comment:

if (should_defer_flush(mm, flags) && !anon_exclusive) {
/*
* We clear the PTE but do not flush so potentially
* a remote CPU could still be writing to the folio.
* If the entry was previously clean then the
* architecture must guarantee that a clear->dirty
* transition on a cached TLB entry is written through
* and traps if the PTE is unmapped.
*/

And as I understand it we'd need the same guarantee here. Given
try_to_migrate_one() doesn't do batched TLB flushes either I'd rather
keep the code as consistent as possible between
migrate_vma_collect_pmd() and try_to_migrate_one(). I could look at
introducing TLB flushing for both in some future patch series.

- Alistair

> Thanks,