[patch 2/3] net: sanitize hrtimer usage in sched_cbq

From: Thomas Gleixner
Date: Thu Jul 09 2009 - 18:00:28 EST


The usage of hrtimer in cbq_ovl_delay() is less than obvious and in
two aspects wrong.

The intention of the code is to arm an hrtimer with the expiry time
X. If the timer is already armed the code needs to check whether the
expiry time X is earlier than the expiry time of the armed timer and
either keep the active timer or rearm it to X. If the timer is not
armed it needs to schedule it to X. That's not what the code does.

It calls hrtimer_try_to_cancel() unconditionally and checks the return
value. If the return value is non zero then it compares the expiry
time of the timer with the new expiry time and picks the earlier
one. The return value check does not take into account that the timer
might run its callback (expressed by a return value of -1). In that
case the expiry time of the timer is probably earlier than the new
expiry time so it rearms the already expired timer with the expiry
value in the past.

If the timer is not active (hrtimer_try_to_cancel() returns 0) it does
not set the new expiry time X but instead restarts the timer with the
expiry time which was active when the timer fired last. That's in the
past as well.

Change the code to check whether the timer is enqueued. If it is
enqueued then the expiry time of the timer is checked against the new
expiry time and it only calls hrtimer_start when the new expiry time
is earlier than the already armed timer. If the timer is not active
then arm it unconditionally with the new expiry time.

Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Patrick McHardy <kaber@xxxxxxxxx>
---
net/sched/sch_cbq.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)

Index: linux-2.6/net/sched/sch_cbq.c
===================================================================
--- linux-2.6.orig/net/sched/sch_cbq.c
+++ linux-2.6/net/sched/sch_cbq.c
@@ -494,7 +494,6 @@ static void cbq_ovl_delay(struct cbq_cla

if (!cl->delayed) {
psched_time_t sched = q->now;
- ktime_t expires;

delay += cl->offtime;
if (cl->avgidle < 0)
@@ -504,6 +503,7 @@ static void cbq_ovl_delay(struct cbq_cla
cl->undertime = q->now + delay;

if (delay > 0) {
+ ktime_t expires, actexp;
unsigned long flags;

spin_lock_irqsave(&q->lock, flags);
@@ -514,12 +514,19 @@ static void cbq_ovl_delay(struct cbq_cla

expires = ktime_set(0, 0);
expires = ktime_add_ns(expires, PSCHED_TICKS2NS(sched));
- if (hrtimer_try_to_cancel(&q->delay_timer) &&
- ktime_to_ns(ktime_sub(
- hrtimer_get_expires(&q->delay_timer),
- expires)) > 0)
- hrtimer_set_expires(&q->delay_timer, expires);
- hrtimer_restart(&q->delay_timer);
+ /*
+ * If the timer is queued check whether the
+ * new expiry time is earlier than the current
+ * one.
+ */
+ if (hrtimer_is_queued(&q->delay_timer)) {
+ actexp = hrtimer_get_expires(&q->delay_timer);
+ if (expires.tv64 >= actexp.tv64)
+ expires.tv64 = 0;
+ }
+ if (expires.tv64)
+ hrtimer_start(&q->delay_timer, expires,
+ HRTIMER_MODE_ABS);
cl->delayed = 1;
cl->xstats.overactions++;
spin_unlock_irqrestore(&q->lock, flags);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/