Re: [PATCH 2/3] mm: drop VMA lock before waiting for migration

From: Alistair Popple
Date: Wed May 03 2023 - 09:05:55 EST



Suren Baghdasaryan <surenb@xxxxxxxxxx> writes:

> On Tue, May 2, 2023 at 6:26 AM 'Alistair Popple' via kernel-team
> <kernel-team@xxxxxxxxxxx> wrote:
>>
>>
>> Suren Baghdasaryan <surenb@xxxxxxxxxx> writes:
>>
>> [...]
>>
>> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> > index 306a3d1a0fa6..b3b57c6da0e1 100644
>> > --- a/include/linux/mm_types.h
>> > +++ b/include/linux/mm_types.h
>> > @@ -1030,6 +1030,7 @@ typedef __bitwise unsigned int vm_fault_t;
>> > * fsync() to complete (for synchronous page faults
>> > * in DAX)
>> > * @VM_FAULT_COMPLETED: ->fault completed, meanwhile mmap lock released
>> > + * @VM_FAULT_VMA_UNLOCKED: VMA lock was released
>>
>> A note here saying vmf->vma should no longer be accessed would be nice.
>
> Good idea. Will add in the next version. Thanks!
>
>>
>> > * @VM_FAULT_HINDEX_MASK: mask HINDEX value
>> > *
>> > */
>> > @@ -1047,6 +1048,7 @@ enum vm_fault_reason {
>> > VM_FAULT_DONE_COW = (__force vm_fault_t)0x001000,
>> > VM_FAULT_NEEDDSYNC = (__force vm_fault_t)0x002000,
>> > VM_FAULT_COMPLETED = (__force vm_fault_t)0x004000,
>> > + VM_FAULT_VMA_UNLOCKED = (__force vm_fault_t)0x008000,
>> > VM_FAULT_HINDEX_MASK = (__force vm_fault_t)0x0f0000,
>> > };
>> >
>> > @@ -1070,7 +1072,9 @@ enum vm_fault_reason {
>> > { VM_FAULT_RETRY, "RETRY" }, \
>> > { VM_FAULT_FALLBACK, "FALLBACK" }, \
>> > { VM_FAULT_DONE_COW, "DONE_COW" }, \
>> > - { VM_FAULT_NEEDDSYNC, "NEEDDSYNC" }
>> > + { VM_FAULT_NEEDDSYNC, "NEEDDSYNC" }, \
>> > + { VM_FAULT_COMPLETED, "COMPLETED" }, \
>>
>> VM_FAULT_COMPLETED isn't used in this patch, guessing that's snuck in
>> from one of the other patches in the series?
>
> I noticed that an entry for VM_FAULT_COMPLETED was missing and wanted
> to fix that... Should I drop that?

Oh ok. It would certainly be good to add, but really it should be its
own patch.
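
For illustration, here is a rough userspace sketch of why a missing
entry in a flag/name table like the one in this hunk matters: a bit
without an entry simply never shows up in the decoded string. The mask
values are taken from the enum quoted above; struct flag_name,
fault_names[] and decode_flags() are hypothetical names made up for this
example and are not the kernel's tracing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a flag/name table. */
struct flag_name {
	unsigned int mask;
	const char *name;
};

/* Mask values copied from the enum in the quoted hunk. */
static const struct flag_name fault_names[] = {
	{ 0x002000u, "NEEDDSYNC" },
	{ 0x004000u, "COMPLETED" },
	{ 0x008000u, "VMA_UNLOCKED" },
};

/* Write the names of all set bits that have a table entry, '|'-separated. */
static void decode_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(fault_names) / sizeof(fault_names[0]); i++) {
		if (!(flags & fault_names[i].mask))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, fault_names[i].name, len - strlen(buf) - 1);
	}
}
```

With the COMPLETED entry removed from the table, decoding
0x004000 | 0x008000 would yield only "VMA_UNLOCKED", which is the kind
of gap the extra entry closes.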

>>
>> > + { VM_FAULT_VMA_UNLOCKED, "VMA_UNLOCKED" }
>> >
>> > struct vm_special_mapping {
>> > const char *name; /* The name, e.g. "[vdso]". */
>> > diff --git a/mm/memory.c b/mm/memory.c
>> > index 41f45819a923..8222acf74fd3 100644
>> > --- a/mm/memory.c
>> > +++ b/mm/memory.c
>> > @@ -3714,8 +3714,16 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>> > entry = pte_to_swp_entry(vmf->orig_pte);
>> > if (unlikely(non_swap_entry(entry))) {
>> > if (is_migration_entry(entry)) {
>> > - migration_entry_wait(vma->vm_mm, vmf->pmd,
>> > - vmf->address);
>> > + /* Save mm in case VMA lock is dropped */
>> > + struct mm_struct *mm = vma->vm_mm;
>> > +
>> > + if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
>> > + /* No need to hold VMA lock for migration */
>> > + vma_end_read(vma);
>> > + /* CAUTION! VMA can't be used after this */
>> > + ret |= VM_FAULT_VMA_UNLOCKED;
>> > + }
>> > + migration_entry_wait(mm, vmf->pmd, vmf->address);
>> > } else if (is_device_exclusive_entry(entry)) {
>> > vmf->page = pfn_swap_entry_to_page(entry);
>> > ret = remove_device_exclusive_entry(vmf);
>>