Re: [patch] latency tracer, 2.6.15-rc7

From: Linus Torvalds
Date: Fri Dec 30 2005 - 20:39:56 EST




On Fri, 30 Dec 2005, Lee Revell wrote:
>
> No there are no large jumps, it really seems that this was the network
> code causing an RCU callback to drop ~2K routes at once. Specifically
> RCU invokes dst_rcu_free 2085 times in a single batch
> (call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free) is only called from
> rt_free() and rt_drop()).

Ok. This was likely hidden by the RCU batch size limit, but that limit was
in turn effectively turned off because it caused out-of-memory situations:
with a small batch size, the RCU queues could grow without bounds (noticed
when we started freeing the dentries from RCU).
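
To make the "grow without bounds" point concrete, here is a toy user-space
model, not kernel code. It assumes a fixed number of callbacks is queued per
tick and that at most batch_limit of them are run per tick; backlog_after()
and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel code): every tick queues `arrivals`
 * callbacks and then runs at most `batch_limit` of them.  Returns the
 * number of callbacks still queued after `ticks` iterations.
 */
static size_t backlog_after(size_t ticks, size_t arrivals, size_t batch_limit)
{
	size_t queued = 0;

	assert(batch_limit > 0);
	for (size_t t = 0; t < ticks; t++) {
		queued += arrivals;

		/* Run one batch, capped at batch_limit callbacks. */
		size_t run = queued < batch_limit ? queued : batch_limit;
		queued -= run;
	}
	return queued;
}
```

Whenever arrivals exceeds batch_limit, the backlog grows by the difference
on every tick, so a small limit only helps latency as long as callbacks
arrive slowly.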

We fixed that for "regular" RCU callbacks by noticing when the RCU queue
got long, and encouraging an RCU event when that happened. However, that
doesn't happen for the "call_rcu_bh()" case, so I'm not surprised that the
network queues can grow fairly long.

I've added Eric Dumazet, Dipankar and Paul to the Cc: list, and appended a
totally untested (and probably horribly buggy) possible patch as a
starting point for discussion. It just sets "need_resched()" in the hope
that we'll go through an RCU idle point and do the RCU callbacks. Whether
it helps your case or not, I have no clue.
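
The idea, as a rough user-space sketch rather than kernel code: count the
callbacks as they are queued, and raise a "please reschedule" hint once the
queue gets long. Everything here (struct cb_queue, QLEN_THRESHOLD,
resched_hint, count_free) is a hypothetical stand-in for illustration; the
real code has per-CPU data and disables interrupts around the update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QLEN_THRESHOLD 100	/* hypothetical; mirrors the 100 in the patch */

struct cb {
	struct cb *next;
	void (*func)(struct cb *);
};

struct cb_queue {
	struct cb *head;
	struct cb **tail;	/* points at the last "next" pointer */
	size_t count;
	bool resched_hint;	/* stand-in for the need_resched flag */
};

static void cbq_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->count = 0;
	q->resched_hint = false;
}

/* Append a callback; ask for a reschedule once the queue is long. */
static void cbq_enqueue(struct cb_queue *q, struct cb *cb,
			void (*func)(struct cb *))
{
	assert(func != NULL);
	cb->func = func;
	cb->next = NULL;
	*q->tail = cb;
	q->tail = &cb->next;
	if (++q->count > QLEN_THRESHOLD)
		q->resched_hint = true;
}

/* Run and remove every queued callback; returns how many were run. */
static size_t cbq_drain(struct cb_queue *q)
{
	struct cb *list = q->head;
	size_t ran = 0;

	cbq_init(q);
	while (list) {
		struct cb *next = list->next;	/* func may free the node */

		list->func(list);
		list = next;
		ran++;
	}
	return ran;
}

/* Example callback: just counts how many times it was invoked. */
static size_t freed;

static void count_free(struct cb *cb)
{
	(void)cb;
	freed++;
}
```

Note that the hint does not run anything by itself. It only helps if
whoever sees it actually reaches a point where cbq_drain() gets called,
which is the same caveat as for the patch.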

Linus
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 48d3bce..b107562 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -149,11 +149,10 @@ void fastcall call_rcu_bh(struct rcu_hea
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
 	rdp->count++;
-/*
- *  Should we directly call rcu_do_batch() here ?
- *  if (unlikely(rdp->count > 10000))
- *      rcu_do_batch(rdp);
- */
+
+	if (unlikely(rdp->count > 100))
+		set_need_resched();
+
 	local_irq_restore(flags);
 }
