Re: x86: Is there still value in having a special tlb flush IPI vector?

From: Nick Piggin
Date: Tue Jul 29 2008 - 05:48:22 EST


On Tuesday 29 July 2008 16:19, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > It definitely is not a clear win. They do not have the same
> > characteristics. So numbers will be needed.
>
> > smp_call_function is now properly scalable in smp_call_function_single
> > form. The more general case of multiple targets is not so easy and it
> > still takes a global lock and touches global cachelines.
> >
> > I don't think it is a good use of time, honestly. Do you have a good
> > reason?
>
> Code cleanup, unification. It took about 20 minutes to do. It probably

OK, so nothing terribly important.


> won't take too much longer to unify kernel/tlb.c. It seems that if
> there's any performance loss in making the transition, then we can make
> it up again by tuning smp_call_function_mask, benefiting all users.

No, I don't think that is the right way to go for such important
functionality. There are no ifs about it: smp_call_function does touch
global cachelines and locks.

smp_call_function has very few callers, which should be obvious from the
fact that it was allowed to languish with such horrible performance for
so long.

But if you improve smp_call_function_mask performance at the same time,
then there is less to argue about, I guess (although it will necessarily
always be more complex than plain tlb flushing).
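For context, the structural difference being pointed at can be sketched
roughly as follows. This is a paraphrase from memory of the 2008-era
kernel/smp.c rework, not actual kernel source; names and details are
approximate:

```
/* Rough sketch, NOT real kernel code.  The multi-target path
 * serializes every caller on one global lock and shares one global
 * queue, whose cachelines bounce between all CPUs that use it: */
static DEFINE_SPINLOCK(call_function_lock);   /* global */
static LIST_HEAD(call_function_queue);        /* global */

int smp_call_function_mask(/* mask, func, info, wait */)
{
        spin_lock(&call_function_lock);       /* all callers contend here */
        list_add_tail(&data->csd.list, &call_function_queue);
        spin_unlock(&call_function_lock);
        /* ... send IPIs to the target mask ... */
}

/* The single-target path only touches per-CPU data, so callers on
 * unrelated CPUs never contend with each other: */
static DEFINE_PER_CPU(struct call_single_queue, call_single_queue);
```

The dedicated tlb flush IPI path avoids even the per-CPU queueing, which
is why replacing it with smp_call_function_mask is not a free trade.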


> But, truth be told, the real reason is that I think there may be some
> correctness issue around smp_call_function* - I've seen occasional
> inexplicable crashes, all within generic_smp_call_function() - and I
> just can't exercise that code enough to get a solid reproducing case.
> But if it gets used for tlb flushes, then any bug is going to become
> pretty obvious. Regardless of whether these patches get accepted, I can
> use it as a test vehicle.

That's fair enough. Better still might be a test harness specifically
to exercise it.


> > No. The rewrite makes it now very good at synchronously sending a
> > function to a single other CPU.
> >
> > Sending asynchronously requires a slab allocation and then a remote slab
> > free (which is nasty for slab) at the other end, and bouncing of locks
> > and cachelines. No way you want to do that in the reschedule IPI.
> >
> > Not to mention the minor problem that it still deadlocks when called with
> > interrupts disabled ;)
>
> In the async case? Or because it can become spontaneously sync if
> there's an allocation failure?

In both the sync and async cases, yes.