Re: RCU latency regression in 2.6.16-rc1

From: Dipankar Sarma
Date: Sat Jan 28 2006 - 12:01:48 EST


On Fri, Jan 27, 2006 at 01:55:22PM -0500, Lee Revell wrote:
> On Thu, 2006-01-26 at 11:18 -0800, Paul E. McKenney wrote:
> > > Xorg-2154 0d.s. 213us : call_rcu_bh (rt_run_flush)
> > > Xorg-2154 0d.s. 215us : local_bh_enable (rt_run_flush)
> > > Xorg-2154 0d.s. 222us : local_bh_enable (rt_run_flush)
> > > Xorg-2154 0d.s. 223us : call_rcu_bh (rt_run_flush)
> > >
> > > [ zillions of these deleted ]
> > >
> > > Xorg-2154 0d.s. 7335us : local_bh_enable (rt_run_flush)
> >
> > Dipankar's latest patch should hopefully address this problem.
> >
> > Could you please give it a spin when you get a chance?
>
> Nope, no improvement at all, furthermore, the machine locked up once
> under heavy disk activity.
>
> I just got an 8ms+ latency from rt_run_flush that looks basically
> identical to the above. It's still flushing routes in huge batches:

I am not supprised that the earlier patch doesn't help your
test. Once you reach the high watermark, the "desperation mode"
latency can be fairly bad since the RCU batch size is pretty
big.

How about trying out the patch included below ? It doesn't reduce
amount of work done from softirq context, but decreases the
*number of RCUs* generated during rt_run_flush() by using
one RCU per hash chain. Can you check if this makes any
difference ?

Thanks
Dipankar


Reduce the number of RCU callbacks by flushing one hash chain
at a time. This is intended to reduce RCU overhead during
frequent flushing.

Signed-off-by: Dipankar Sarma <dipankar@xxxxxxxxxx>
---


net/ipv4/route.c | 18 +++++++++++++-----
1 files changed, 13 insertions(+), 5 deletions(-)

diff -puN net/ipv4/route.c~rcu-rt-flush-list net/ipv4/route.c
--- linux-2.6.16-rc1-rcu/net/ipv4/route.c~rcu-rt-flush-list 2006-01-28 20:34:10.000000000 +0530
+++ linux-2.6.16-rc1-rcu-dipankar/net/ipv4/route.c 2006-01-28 21:30:27.000000000 +0530
@@ -670,13 +670,23 @@ static void rt_check_expire(unsigned lon
mod_timer(&rt_periodic_timer, jiffies + ip_rt_gc_interval);
}

+static void rt_list_free(struct rcu_head *head)
+{
+ struct rtable *next, *rth;
+ rth = container_of(head, struct rtable, u.dst.rcu_head);
+ for (; rth; rth = next) {
+ next = rth->u.rt_next;
+ dst_free(&rth->u.dst);
+ }
+}
+
/* This can run from both BH and non-BH contexts, the latter
* in the case of a forced flush event.
*/
static void rt_run_flush(unsigned long dummy)
{
int i;
- struct rtable *rth, *next;
+ struct rtable *rth;

rt_deadline = 0;

@@ -689,10 +699,8 @@ static void rt_run_flush(unsigned long d
rt_hash_table[i].chain = NULL;
spin_unlock_bh(rt_hash_lock_addr(i));

- for (; rth; rth = next) {
- next = rth->u.rt_next;
- rt_free(rth);
- }
+ if (rth)
+ call_rcu_bh(&rth->u.dst.rcu_head, rt_list_free);
}
}


_
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/