Re: [PATCH 0/5] Generic smp_call_function(), improvements, and smp_call_function_single()

From: Nick Piggin
Date: Tue Mar 25 2008 - 04:00:49 EST

Next message: Thomas Gleixner: "Re: 2.6.25-rc5-git6: Reported regressions from 2.6.24"
Previous message: Andi Kleen: "Re: [PATCH prototype] [0/8] Predictive bitmaps for ELF executables"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Saturday 22 March 2008 00:15, Jens Axboe wrote:
> On Fri, Mar 21 2008, Ingo Molnar wrote:
> > * Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > > The patch series is also available in the 'generic-ipi' branch from
> > >
> > > git://git.kernel.dk/linux-2.6-block.git
> > >
> > > and the 'io-cpu-affinity' branch is directly based on this.
> >
> > i'm still wondering about the following fundamental bit: why not use
> > per CPU kernel threads? That way you get a fast (lockless) IPI "for
> > free" as SMP wakeups already do this.
>
> The kernel thread variant wont be any more lockless than the
> smp_call_function_single() approach, they both have to grab the
> destination queue lock. If you recall, I pushed forward on the kernel
> thread variant and even still have it online here:
>
> http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=io-cpu-affinity-kt
>hread
>
> which is pretty much identical to io-cpu-affinity, except it uses kernel
> threads for completion.
>
> The reason why I dropped the kthread approach is that it was slower.
> Time from signal to run was about 33% faster with IPI than with
> wake_up_process(). Doing benchmark runs, and the IPI approach won hands
> down in cache misses as well.

It's obviously going to be faster. The kthread approach needs an IPI
*and* a context switch on the destination CPU.

For broadcast situations, it is also going to be faster, because it
can use platform specific broadcast IPIs

The only weird atomicity requirements of smp_call_function is due to
its terribly unscalable design. Once you have the producer provide its
own data (or accept -ENOMEM failures), then there is no problem. It's
then even possible to use it from interrupt context with not too much
more work.

> > smp_call_function() is quirky and has deep limitations on atomicity,
> > etc., so we are moving away from it and should not base more
> > functionality on it.
>
> The patchset does not build on smp_call_function(), it merely cleans
> that stuff up instead of having essentially the same code in each arch.
> As more archs are converted, it'll remove lots more code.

Also there is nothing wrong with improving the speed of the existing
implementation. The current smp_call_function is more or less unusable
for any real work because it is totally serialised.

Thanks for taking this patchset under your wing, Jens.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Thomas Gleixner: "Re: 2.6.25-rc5-git6: Reported regressions from 2.6.24"
Previous message: Andi Kleen: "Re: [PATCH prototype] [0/8] Predictive bitmaps for ELF executables"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]