Re: [PATCH 10/12] hugetlb: batch PMD split for bulk vmemmap dedup

From: Muchun Song
Date: Wed Aug 30 2023 - 23:55:10 EST

> On Aug 31, 2023, at 00:03, Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>
> On 30/08/2023 12:13, Joao Martins wrote:
>> On 30/08/2023 09:09, Muchun Song wrote:
>>> On 2023/8/26 03:04, Mike Kravetz wrote:
>>>> +
>>>> + /*
>>>> + * We are only splitting, not remapping the hugetlb vmemmap
>>>> + * pages.
>>>> + */
>>>> + if (bulk)
>>>> + continue;
>>>
>>> Actually, we do not need a flag to detect this situation; you could
>>> use "!@walk->remap_pte" to determine whether we should go into the
>>> next level traversal of the page table. ->remap_pte is used to traverse
>>> the pte entry, so it makes sense to continue to the next pmd entry if
>>> it is NULL.
>>>
>>
>> Yeap, great suggestion.
>>
>>>> +
>>>> vmemmap_pte_range(pmd, addr, next, walk);
>>>> } while (pmd++, addr = next, addr != end);
>>>> @@ -197,7 +211,8 @@ static int vmemmap_remap_range(unsigned long start,
>>>> unsigned long end,
>>>> return ret;
>>>> } while (pgd++, addr = next, addr != end);
>>>> - flush_tlb_kernel_range(start, end);
>>>> + if (!(walk->flags & VMEMMAP_REMAP_ONLY_SPLIT))
>>>> + flush_tlb_kernel_range(start, end);
>>>
>>> This could be:
>>>
>>> if (walk->remap_pte)
>>> flush_tlb_kernel_range(start, end);
>>>
>> Yeap.
>>
>
> Quick correction: This stays as is, except with a flag rename. That is because
> this is the actual flush that we intend to batch in the next patch. And while the
> PMD split could just use !walk->remap_pte, the next patch would just need to
> test the NO_TLB_FLUSH flag. Meaning we end up just testing for this
> to-be-consolidated flag anyway.

I think this really should be "if (walk->remap_pte && !(flag & VMEMMAP_NO_TLB_FLUSH))"
in your next patch. The TLB flush only makes sense when @walk->remap_pte is set.
I know the "if (!(flag & VMEMMAP_NO_TLB_FLUSH))" check is sufficient for your
use case, but what if a user (none exists now, but one may in the future)
passes a NULL @walk->remap_pte without specifying VMEMMAP_NO_TLB_FLUSH? Then we
would do a useless TLB flush. This is why I suggest changing this to
"if (walk->remap_pte)" in this patch and then to
"if (walk->remap_pte && !(flag & VMEMMAP_NO_TLB_FLUSH))" in the next patch.
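
To make the two steps concrete, here is a rough userspace sketch of the
two conditions. It is not the kernel code: the struct, the helper names
and the flag value are stand-ins I made up for illustration, and only the
->remap_pte / VMEMMAP_NO_TLB_FLUSH logic mirrors what is discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the real flag value may differ; this is only for the sketch. */
#define VMEMMAP_NO_TLB_FLUSH	(1UL << 0)

/* Hypothetical, trimmed-down stand-in for the walk structure. */
struct walk_sketch {
	void (*remap_pte)(void);	/* NULL: split only, no PTE is remapped */
	unsigned long flags;
};

/* This patch: flush only if PTEs were actually remapped. */
static bool need_flush_this_patch(const struct walk_sketch *walk)
{
	return walk->remap_pte != NULL;
}

/* Next patch: same, but the caller may also defer the flush for batching. */
static bool need_flush_next_patch(const struct walk_sketch *walk)
{
	return walk->remap_pte && !(walk->flags & VMEMMAP_NO_TLB_FLUSH);
}

/* Placeholder callback so that a non-NULL ->remap_pte can be modelled. */
static void dummy_remap(void)
{
}
```

With this shape, a split-only walk (NULL ->remap_pte) never flushes, whether
or not the caller remembered to pass VMEMMAP_NO_TLB_FLUSH.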

Thanks.