Re: [PATCH v2 14/14] arm64/mm: Add ptep_get_and_clear_full() to optimize process teardown

From: Ryan Roberts
Date: Tue Nov 28 2023 - 06:15:28 EST


On 28/11/2023 07:32, Barry Song wrote:
>> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
>> +static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
>> + unsigned long addr, pte_t *ptep, int full)
>> +{
>> + pte_t orig_pte = __ptep_get(ptep);
>> +
>> + if (!pte_valid_cont(orig_pte) || !full) {
>> + contpte_try_unfold(mm, addr, ptep, orig_pte);
>> + return __ptep_get_and_clear(mm, addr, ptep);
>> + } else
>> + return contpte_ptep_get_and_clear_full(mm, addr, ptep);
>> +}
>> +
>
> Hi Ryan,
>
> I feel quite hard to understand the code. when !pte_valid_cont(orig_pte),
> we will call contpte_try_unfold(mm, addr, ptep, orig_pte);
>
> but in contpte_try_unfold(), we call unfold only if pte_valid_cont()
> is true:
> static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, pte_t pte)
> {
> if (contpte_is_enabled(mm) && pte_valid_cont(pte))
> __contpte_try_unfold(mm, addr, ptep, pte);
> }
>
> so do you mean the below?
>
> if (!pte_valid_cont(orig_pte))
> return __ptep_get_and_clear(mm, addr, ptep);
>
> if (!full) {
> contpte_try_unfold(mm, addr, ptep, orig_pte);
> return __ptep_get_and_clear(mm, addr, ptep);
> } else {
> return contpte_ptep_get_and_clear_full(mm, addr, ptep);
> }

Yes, this is equivalent. In general, I was trying not to spray `if
(pte_valid_cont(orig_pte))` checks everywhere to guard contpte_try_unfold() and
instead put the checks into contpte_try_unfold() (hence the 'try'). I figured
just calling it unconditionally and letting the compiler optimize as it sees fit
was the cleanest approach.

But in this instance I can see this is confusing. I'll modify as you suggest.
Thanks!

>
> Thanks
> Barry
>
>