Re: [PATCH v2 04/10] mm/hugetlb: Move swap entry handling into vma lock when faulted

From: Peter Xu
Date: Wed Dec 07 2022 - 17:44:16 EST


Hi, John,

Firstly, thanks for taking a look at the whole set.

On Wed, Dec 07, 2022 at 02:36:21PM -0800, John Hubbard wrote:
> > @@ -5886,8 +5866,22 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
> > * properly handle it.
> > */
> > - if (!pte_present(entry))
> > + if (!pte_present(entry)) {
> > + if (unlikely(is_hugetlb_entry_migration(entry))) {
> > + /*
> > + * Release fault lock first because the vma lock is
> > + * needed to guard the huge_pte_lockptr() later in
> > + * migration_entry_wait_huge(). The vma lock will
> > + * be released there.
> > + */
> > + mutex_unlock(&hugetlb_fault_mutex_table[hash]);
> > + migration_entry_wait_huge(vma, ptep);
> > + return 0;
>
> Oh, but now (and also one other, pre-existing case, above)
> hugetlb_fault() is returning with the vma lock held.

Note that here migration_entry_wait_huge() will release it.

Sorry, it's definitely not as straightforward as it could be, but I
didn't come up with a better solution either: we need the vma lock to
protect the pgtable spinlock, and that spinlock is taken deep inside the
migration code path.

That's also why I added the rich comment above; the sentence "The vma
lock will be released there" is there exactly to call this out.
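
To make the handover more concrete, below is a rough sketch of how I
expect the lock to flow on the migration side.  This is simplified and
written from my understanding of the series, not copied from the v2
patch, so helper names and details may differ slightly:

  /* mm/migrate.c (sketch, simplified) */
  void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep)
  {
          /*
           * Looking up the ptl is only safe because the caller
           * (hugetlb_fault) still holds the vma lock, which pins the
           * pgtable page that embeds the spinlock.
           */
          spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma),
                                             vma->vm_mm, ptep);
          pte_t pte;

          spin_lock(ptl);
          pte = huge_ptep_get(ptep);

          if (unlikely(!is_hugetlb_entry_migration(pte))) {
                  /* Raced with the migration finishing; just retry */
                  spin_unlock(ptl);
                  hugetlb_vma_unlock_read(vma);
          } else {
                  /*
                   * Once the ptl is held, the pgtable page cannot be
                   * freed under us, so the vma lock can be dropped
                   * before sleeping on the migration entry.
                   */
                  hugetlb_vma_unlock_read(vma);
                  migration_entry_wait_on_locked(pte_to_swp_entry(pte),
                                                 NULL, ptl);
          }
  }

So the vma lock is always released on this path, either on the race
branch or right before we wait; it just happens away from the fault
handler itself.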

> This is in contrast
> with most of the rest of the function, which takes great care to release
> locks before returning.
>
> Which makes this new case really quite irregular and makes the overall
> locking harder to follow. It would be ideal to avoid doing this! But at
> the very least, there should be a little note above hugetlb_fault(),
> explaining this deviation and how it fits in with the locking rules.
>
> Do we really have to structure it this way, though?

--
Peter Xu