Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs

From: Peter Zijlstra
Date: Mon Jul 03 2023 - 09:58:26 EST


On Tue, Jun 20, 2023 at 05:46:18PM +0300, Yair Podemsky wrote:
> @@ -191,7 +191,13 @@ static void tlb_remove_table_smp_sync(void *arg)
> /* Simply deliver the interrupt */
> }
>
> -void tlb_remove_table_sync_one(void)
> +#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
> +#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
> +#else
> +#define REMOVE_TABLE_IPI_MASK cpu_online_mask
> +#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
> +
> +void tlb_remove_table_sync_one(struct mm_struct *mm)
> {
> /*
> * This isn't an RCU grace period and hence the page-tables cannot be
> @@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
> * It is however sufficient for software page-table walkers that rely on
> * IRQ disabling.
> */
> - smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
> + on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> + NULL, true);

Aside from what Dave said about the REMOVE_TABLE_IPI_MASK thing, this
isn't right.

on_each_cpu_mask() includes the current cpu, while smp_call_function()
explicitly does not.

Yes, they all end up in smp_call_function_many_cond(), but the
on_each_cpu*() family will have SCF_RUN_LOCAL set, while the
smp_call_function*() family will not.