Re: [PATCH v2 4/4] hugetlb: Do early cow when page pinned on src mm

From: Mike Kravetz
Date: Fri Feb 05 2021 - 00:12:43 EST


On 2/4/21 5:43 PM, Peter Xu wrote:
> On Thu, Feb 04, 2021 at 03:25:37PM -0800, Mike Kravetz wrote:
>> On 2/4/21 6:50 AM, Peter Xu wrote:
>>> This is the last missing piece of the COW-during-fork effort when there're
>>> pinned pages found. One can reference 70e806e4e645 ("mm: Do early cow for
>>> pinned pages during fork() for ptes", 2020-09-27) for more information, since
>>> we do similar things here rather than pte this time, but just for hugetlb.
>>>
>>> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
>>> ---
>>> mm/hugetlb.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++-----
>>> 1 file changed, 56 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 9e6ea96bf33b..5793936e00ef 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> + __SetPageUptodate(new_page);
>>> + ClearPagePrivate(new_page);
>>> + set_page_huge_active(new_page);
>>
>> Code to replace the above ClearPagePrivate and set_page_huge_active is
>> in Andrew's tree. With changes in Andrew's tree, this would be:
>>
>> ClearHPageRestoreReserve(new_page);
>> SetHPageMigratable(new_page);
>
> Indeed these names are much better than using the default ones. In the
> meantime I'll rebase to linux-next/akpm. Sorry, it's never easy for me to
> find the right branch...

No worries. I only know because I recently changed these.

...
>>> @@ -3787,7 +3803,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>>> dst_entry = huge_ptep_get(dst_pte);
>>> if ((dst_pte == src_pte) || !huge_pte_none(dst_entry))
>>> continue;
>>> -
>>> +again:
>>> dst_ptl = huge_pte_lock(h, dst, dst_pte);
>>> src_ptl = huge_pte_lockptr(h, src, src_pte);
>>> spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>
> Side question: Mike, do you know why we need this lock_nested()? Could the src
> lock be taken due to any reason already? It confused me when I read the chunk.

I see that it was added with commit 4647875819aa. That was when huge pages
used the single per-mm ptl. Lockdep seemed to complain about taking
&mm->page_table_lock twice. Certainly, source and destination mm cannot
be the same. Right? I do not have the full history, but it 'looks' like
lockdep might have been confused and this was added to keep it quiet.
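
To make that guess concrete, here is a toy userspace model of the idea. It is
only a sketch and NOT the kernel's lockdep code: every toy_* name is made up
for illustration, and the one assumption it encodes is that a checker tracks
locks by class rather than by instance, so two different locks of the same
class look like a double acquisition unless the second one is taken with a
different subclass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
#define TOY_MAX_HELD 8

struct toy_lock {
	int class_id;	/* locks set up at the same init site share a class */
};

struct toy_held {
	int class_id;
	int subclass;
};

struct toy_task {
	struct toy_held held[TOY_MAX_HELD];
	size_t depth;	/* number of locks currently recorded as held */
};

/*
 * Record an acquisition of @l at nesting level @subclass.
 * Returns false (the toy "complaint") if the task already holds a lock
 * with the same class and subclass, even when it is a different lock
 * instance. Returns true and records the lock otherwise.
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l,
			int subclass)
{
	for (size_t i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == l->class_id &&
		    t->held[i].subclass == subclass)
			return false;
	}

	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth].class_id = l->class_id;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}
```

In this model, taking a dst lock at subclass 0 and then a distinct src lock
of the same class at subclass 0 is flagged, while taking the src lock at
subclass 1 (the value of SINGLE_DEPTH_NESTING) is accepted.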

BTW - Copy page range for 'normal' pages has the same spin_lock_nested().
--
Mike Kravetz