Re: [PATCH] x86: Reduce the default HZ value

From: Paul E. McKenney
Date: Fri May 08 2009 - 11:12:57 EST

Next message: George Spelvin: "Re: [RFC][PATCH] tsc_khz= boot option to avoid TSC calibration variance"
Previous message: Davide Libenzi: "Re: [KVM PATCH v4 2/2] kvm: add support for irqfd via eventfd-notification interface"
In reply to: Christoph Lameter: "Re: [PATCH] x86: Reduce the default HZ value"
Next in thread: devzero: "Re: [PATCH] x86: Reduce the default HZ value"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, May 08, 2009 at 10:16:10AM -0400, Christoph Lameter wrote:
> On Fri, 8 May 2009, Paul E. McKenney wrote:
>
> > > Can't you simply enter idle state after a grace period completes and
> > > finds no pending callbacks for the next period. And leave idle state at
> > > the next call_rcu()?
> >
> > If there were no RCU callbacks -globally- across all CPUs, yes. But
> > the check at the end of rcu_irq_exit() is testing only on the current
> > CPU. Checking across all CPUs is expensive and racy.
> >
> > So what happens instead is that there is rcu_needs_cpu(), which gates
> > entry into dynticks-idle mode. This function returns 1 if there are
> > callbacks on the current CPU. So, if no CPU has an RCU callback, then
> > all CPUs can enter dynticks-idle mode so that the entire system is
> > quiescent from an RCU viewpoint -- no RCU processing at all.
>
> Did not follow RCU developments. But wasn't there a time when RCU periods
> were processor specific and a global RCU period was done when all the
> processors went through their rcu periods?

For non-realtime RCU implementations, after a given grace period starts,
once each CPU goes through a "quiescent state", then that grace period
can end. For realtime (AKA "preemptable") RCU, the focus is on tasks
rather than CPUs, but the same general principle applies, give or take
some implementation details: after a given grace period starts, once
each task goes through a quiescent state, then that grace period can end.
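
A minimal userspace sketch of the non-realtime rule above, not the
kernel's implementation: a grace period records which CPUs still owe a
quiescent state and can end once that set is empty. The names toy_gp,
toy_gp_start(), toy_note_qs() and toy_gp_done() are hypothetical, and the
sketch ignores concurrency, memory ordering, and dynticks-idle CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only (hypothetical names): one bit per CPU that has not yet
 * passed through a quiescent state since the grace period started. */
struct toy_gp {
	uint64_t pending;	/* bit n set => CPU n still owes a quiescent state */
};

/* Start a grace period that must wait for CPUs 0..ncpus-1. */
static void toy_gp_start(struct toy_gp *gp, unsigned int ncpus)
{
	assert(ncpus <= 64);	/* the toy bitmask holds at most 64 CPUs */
	gp->pending = (ncpus == 64) ? ~UINT64_C(0)
				    : ((UINT64_C(1) << ncpus) - 1);
}

/* Record that the given CPU has passed through a quiescent state. */
static void toy_note_qs(struct toy_gp *gp, unsigned int cpu)
{
	assert(cpu < 64);
	gp->pending &= ~(UINT64_C(1) << cpu);
}

/* The grace period can end once no CPU still owes a quiescent state. */
static int toy_gp_done(const struct toy_gp *gp)
{
	return gp->pending == 0;
}
```

For the realtime case the same shape would apply with the bits standing
for tasks rather than CPUs, give or take the implementation details.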

> Cpu cache hotness may not be relevant to RCU since rcu involves long time
> periods in which cachelines cool down. Can the RCU callbacks all be done
> on processor 0 (or a so designated processor)? That would avoid
> disturbances of the other processors.

This approach -might- be OK for a specially configured and protected HPC
machine. But it is a non-starter for general-purpose machines. For an
example of why, consider a denial-of-service attack that continually
changes routing tables: the resulting flood of RCU callbacks could
saturate CPU 0, which would then fall behind on callback processing,
eventually OOMing the machine.

But if you would like to experiment with this, make call_rcu() be a
wrapper that causes an underlying call_rcu_cpu_0() to be executed on
CPU 0. That won't get exactly the cache-warmth effects that you are
after, but it will let you see whether the machine would gracefully
handle various events that might dump large numbers of callbacks.
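
A rough userspace sketch of the shape of that experiment, under loud
assumptions: every toy_* name below is hypothetical, struct toy_rcu_head
only stands in for the kernel's struct rcu_head, and the sketch models
queueing and draining only (no grace periods, no locking, no cross-CPU
signalling). It shows the wrapper ignoring the calling CPU and funnelling
everything onto CPU 0's queue, whose length is the thing to watch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

/* Toy stand-in for an RCU callback descriptor (hypothetical layout). */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* One singly linked callback queue per CPU, with tail pointer and count. */
struct toy_cb_queue {
	struct toy_rcu_head *head;
	struct toy_rcu_head **tail;
	unsigned long len;
};

static struct toy_cb_queue toy_queues[TOY_NR_CPUS];
static unsigned long toy_invoked;	/* callbacks run so far */

/* Example callback: just counts invocations. */
static void toy_count_cb(struct toy_rcu_head *head)
{
	(void)head;
	toy_invoked++;
}

/* Hypothetical call_rcu_cpu_0(): always append to CPU 0's queue. */
static void toy_call_rcu_cpu_0(struct toy_rcu_head *rhp,
			       void (*func)(struct toy_rcu_head *))
{
	struct toy_cb_queue *q = &toy_queues[0];

	if (!q->tail)		/* lazily initialize an empty queue */
		q->tail = &q->head;
	rhp->next = NULL;
	rhp->func = func;
	*q->tail = rhp;
	q->tail = &rhp->next;
	q->len++;
}

/* Wrapper with the shape of call_rcu(): the calling CPU is ignored. */
static void toy_call_rcu(struct toy_rcu_head *rhp,
			 void (*func)(struct toy_rcu_head *))
{
	toy_call_rcu_cpu_0(rhp, func);
}

/* Invoke up to 'limit' callbacks queued on 'cpu'; return how many ran. */
static unsigned long toy_invoke_callbacks(unsigned int cpu,
					  unsigned long limit)
{
	struct toy_cb_queue *q;
	unsigned long n = 0;

	assert(cpu < TOY_NR_CPUS);
	q = &toy_queues[cpu];
	while (q->head && n < limit) {
		struct toy_rcu_head *rhp = q->head;

		q->head = rhp->next;
		if (!q->head)
			q->tail = &q->head;	/* queue is empty again */
		q->len--;
		rhp->func(rhp);
		n++;
	}
	return n;
}

/* Number of callbacks currently queued on 'cpu'. */
static unsigned long toy_queue_len(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	return toy_queues[cpu].len;
}
```

If callbacks are queued faster than toy_invoke_callbacks(0, limit) can
drain them, toy_queue_len(0) grows without bound while the other queues
stay empty, which is the falling-behind behavior to look for.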

Thanx, Paul
