Re: deadlocks if use htb

From: Badalian Vyacheslav
Date: Thu Dec 18 2008 - 06:23:54 EST


Thanks for everything, Jarek!

Vyacheslav Badalian

> On Thu, Dec 18, 2008 at 09:43:51AM +0300, Badalian Vyacheslav wrote:
>
>> Hello
>> Result: with patches #2 and #3, uptime is 7 days without crashes.
>> May I revert the patches and try the single new patch?
>>
>
> Here is my current opinion on this bug:
>
> 1) I'm almost sure it's not an htb bug, but an hrtimers bug (some race),
>
> 2) the htb patches you've tested are not "the proper" way of fixing
> it; I see substantial changes in the hrtimers code in the "-tip" tree
> (probably for 2.6.29), which the hrtimers maintainers will probably
> advise you to try, and I guess that's not easy on a production
> system,
>
> So, it's up to you:
>
> 1) since these patches work for you, you can stop testing and
> stay on these patched kernels until 2.6.29 (in that case I can
> propose patch #2 as a temporary fix),
>
> 2) out of curiosity you could try patch #4 alone on one box first
> (after reverting at least patch #2), but again: if it works, it
> could only be treated as a temporary hack, and an alternative to #2.
>
> Thanks,
> Jarek P.
>
>
>>> On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
>>>
>>>
>>>> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
>>>>
>>>>
>>>>> Hello again! Sorry for being away so long.
>>>>>
>>>>>
>>>> Hi!
>>>>
>>>>
>>>>
>>>>> I was away from this work for a long time.
>>>>>
>>>>> May we return to this bug?
>>>>> The servers are on the latest stable kernel, 2.6.27.8:
>>>>> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
>>>>> Sorry, they are not patched; I did not update them. Do you have fresh
>>>>> patches or ideas for tests?
>>>>>
>>>>>
>>>> Not much, but I can have some if only you are willing to test them...
>>>> I attach below a patch which combines 2 patches I sent yesterday to
>>>> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
>>>>
>>>> You can still try the testing patch #2 I sent previously (quoted below)
>>>> with or without this new #3 patch.
>>>>
>>>>
>>>>
>>> Here is another idea worth checking (instead of patch #2).
>>>
>>> Jarek P.
>>>
>>> --- (testing patch #4)
>>>
>>> diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
>>> --- a2.6.27.7/net/sched/sch_htb.c 2008-12-11 08:16:16.000000000 +0000
>>> +++ b2.6.27.7/net/sched/sch_htb.c 2008-12-15 10:44:32.000000000 +0000
>>> @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
>>>  		}
>>>  	}
>>>  	sch->qstats.overlimits++;
>>> +	qdisc_watchdog_cancel(&q->watchdog);
>>>  	qdisc_watchdog_schedule(&q->watchdog, next_event);
>>>  fin:
>>>  	return skb;
>>>
>>>
>>>
>>>
>
>
>
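
The only change in testing patch #4 is a call to
qdisc_watchdog_cancel(&q->watchdog) placed immediately before
qdisc_watchdog_schedule(&q->watchdog, next_event) in htb_dequeue().
Below is a minimal user-space sketch of that cancel-then-re-arm ordering.
It is not kernel code: the toy_* names and the struct are hypothetical and
only model a one-shot timer, so it says nothing about the suspected
hrtimers race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the qdisc watchdog; not kernel code. */
struct toy_watchdog {
	bool armed;            /* is an expiry currently pending? */
	long expires;          /* pending expiry time, valid only when armed */
	unsigned int cancels;  /* how many pending expiries were dropped */
};

/* Drop any pending expiry; returns true if one was actually pending. */
static bool toy_watchdog_cancel(struct toy_watchdog *wd)
{
	bool was_armed;

	assert(wd != NULL);
	was_armed = wd->armed;
	wd->armed = false;
	if (was_armed)
		wd->cancels++;
	return was_armed;
}

/* Arm the timer for a new expiry time. */
static void toy_watchdog_schedule(struct toy_watchdog *wd, long expires)
{
	assert(wd != NULL);
	wd->expires = expires;
	wd->armed = true;
}

/* Same ordering as testing patch #4: cancel first, then re-arm. */
static void toy_rearm(struct toy_watchdog *wd, long next_event)
{
	toy_watchdog_cancel(wd);
	toy_watchdog_schedule(wd, next_event);
}
```

Calling toy_rearm(&wd, 100) and then toy_rearm(&wd, 250) on a
zero-initialized struct toy_watchdog leaves exactly one pending expiry, at
250, with a cancel count of 1: the first call has nothing to cancel, the
second drops the expiry at 100 before arming the new one.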
