Re: [PATCH v4 04/12] riscv: Only send remote fences when some other CPU is online

From: Samuel Holland
Date: Thu Jan 04 2024 - 10:33:11 EST


Hi Alex,

On 2024-01-04 6:33 AM, Alexandre Ghiti wrote:
> On Tue, Jan 2, 2024 at 11:01 PM Samuel Holland
> <samuel.holland@xxxxxxxxxx> wrote:
>>
>> If no other CPU is online, a local cache or TLB flush is sufficient.
>> These checks can be constant-folded when SMP is disabled.
>>
>> Signed-off-by: Samuel Holland <samuel.holland@xxxxxxxxxx>
>> ---
>>
>> Changes in v4:
>> - New patch for v4
>>
>> arch/riscv/mm/cacheflush.c | 4 +++-
>> arch/riscv/mm/tlbflush.c | 4 +++-
>> 2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>> index 47c485bc7df0..f7933ae88a55 100644
>> --- a/arch/riscv/mm/cacheflush.c
>> +++ b/arch/riscv/mm/cacheflush.c
>> @@ -21,7 +21,9 @@ void flush_icache_all(void)
>>  {
>>  	local_flush_icache_all();
>>
>> -	if (riscv_use_sbi_for_rfence())
>> +	if (num_online_cpus() < 2)
>> +		return;
>> +	else if (riscv_use_sbi_for_rfence())
>>  		sbi_remote_fence_i(NULL);
>>  	else
>>  		on_each_cpu(ipi_remote_fence_i, NULL, 1);
>> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
>> index 2f18fe6fc4f3..37b3c93e3c30 100644
>> --- a/arch/riscv/mm/tlbflush.c
>> +++ b/arch/riscv/mm/tlbflush.c
>> @@ -73,7 +73,9 @@ static void __ipi_flush_tlb_all(void *info)
>>
>>  void flush_tlb_all(void)
>>  {
>> -	if (riscv_use_sbi_for_rfence())
>> +	if (num_online_cpus() < 2)
>> +		local_flush_tlb_all();
>> +	else if (riscv_use_sbi_for_rfence())
>>  		sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
>>  	else
>>  		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
>> --
>> 2.42.0
>>
>
> on_each_cpu() already deals correctly with a single online cpu, the
> only thing to optimize here is the SBI rfence. So I'd move this new
> test in sbi_remote_sfence_vma_asid() and sbi_remote_fence_i() to avoid
> the superfluous M-mode entry when only one cpu is online by checking
> the cpumask. And since sbi_remote_fence_i() is used in another

What specific cpumask check are you suggesting? In sbi_remote_sfence_vma_asid()
I don't think we can assume the local cpu is always included in the cpumask
(which we _can_ assume here), so it would need to construct/compare the whole
bitmap. That's much more expensive than the atomic load here.
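
As a rough illustration of the cost difference, here is a user-space sketch,
not kernel code. All model_* names are hypothetical stand-ins:
model_online_cpus plays the role of the counter behind num_online_cpus(), and
struct model_cpumask is a simplified bitmap. It assumes the callee cannot rely
on the local CPU being set in the mask, so it has to scan every word:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_NR_CPUS		256
#define MODEL_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define MODEL_MASK_LONGS \
	((MODEL_NR_CPUS + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

struct model_cpumask {
	unsigned long bits[MODEL_MASK_LONGS];
};

/* Stand-in for the counter behind num_online_cpus(). */
static atomic_uint model_online_cpus = 1;

/*
 * Caller-side check: a single atomic load. It is only valid where the
 * caller already knows the local CPU is part of the flush.
 */
static bool model_no_remote_cpus(void)
{
	return atomic_load_explicit(&model_online_cpus,
				    memory_order_relaxed) < 2;
}

static void model_cpumask_set(struct model_cpumask *mask, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	mask->bits[cpu / MODEL_BITS_PER_LONG] |=
		1UL << (cpu % MODEL_BITS_PER_LONG);
}

/*
 * Callee-side check: the mask may name any set of CPUs and need not
 * include the local one, so every word has to be inspected.
 */
static bool model_mask_has_remote(const struct model_cpumask *mask,
				  unsigned int local_cpu)
{
	for (size_t i = 0; i < MODEL_MASK_LONGS; i++) {
		unsigned long word = mask->bits[i];

		/* Ignore the local CPU's own bit in its word. */
		if (i == local_cpu / MODEL_BITS_PER_LONG)
			word &= ~(1UL << (local_cpu % MODEL_BITS_PER_LONG));
		if (word)
			return true;
	}
	return false;
}
```

The caller-side check stays a single load no matter how many CPUs the mask can
describe, while the scan grows with the size of the mask.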

> function (flush_icache_mm()), we could also take advantage of this
> optimization when only the local cpu must be flushed.

flush_icache_mm() already has a "local" variable which it uses to skip the call
to sbi_remote_fence_i(). Same with the broadcast check in __flush_tlb_range().
So no additional check is needed there. Those two functions, plus the two
changed in this patch, are the only call sites of the sbi_*() functions. I think
it makes more sense to optimize the four call sites than to add another check
inside the sbi_*() functions.
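
The ordering this patch gives flush_tlb_all() can be modeled as below. This is
only a sketch: the enum and model_flush_tlb_all_path() are hypothetical names,
and the online CPU count and the rfence choice are passed in as plain
parameters rather than read from kernel state.

```c
#include <stdbool.h>

/* Hypothetical model of the dispatch order; not kernel code. */
enum model_flush_path {
	MODEL_FLUSH_LOCAL,	/* local flush only, no SBI call or IPI */
	MODEL_FLUSH_SBI,	/* remote fence through the SBI */
	MODEL_FLUSH_IPI,	/* remote fence through IPIs */
};

static enum model_flush_path
model_flush_tlb_all_path(unsigned int online_cpus, bool use_sbi_for_rfence)
{
	/* Checked first, so a lone CPU never enters M-mode for the fence. */
	if (online_cpus < 2)
		return MODEL_FLUSH_LOCAL;
	if (use_sbi_for_rfence)
		return MODEL_FLUSH_SBI;
	return MODEL_FLUSH_IPI;
}
```

With one CPU online the result is the local path whichever rfence mechanism is
configured, which is the case the patch is meant to short-circuit.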

Regards,
Samuel