Re: stop_machine lockup issue in 3.9.y.

From: Eric Dumazet
Date: Wed Jun 05 2013 - 23:46:54 EST


On Wed, 2013-06-05 at 20:41 -0700, Ben Greear wrote:
> On 06/05/2013 08:26 PM, Eric Dumazet wrote:
> > On Wed, 2013-06-05 at 20:14 -0700, Tejun Heo wrote:
> >
> >>
> >> Ah, so, that's why it's showing up now. We probably have had the same
> >> issue all along but it used to be masked by the softirq limiting. Do
> >> you care to revive the 10 iterations limit so that it's limited by
> >> both the count and timing? We do wanna find out why softirq is
> >> spinning indefinitely tho.
> >
> > Yes, no problem, I can do that.
>
> Limiting it to 5000 fixes my problem, so if you wanted it larger than 10, that would
> be fine by me.
>
> I can send a version of my patch easily enough if we can agree on the max number of
> loops (and if indeed my version of the patch is acceptable).

Well, 10 was the prior limit and seems really fine.

jiffies failing to update seems like quite an exceptional condition (I hope...)
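
To illustrate why limiting by both count and timing helps, here is a rough
user-space sketch, not the actual kernel code. Every name in it (run_bounded,
MAX_RESTART, MAX_TICKS, the clock and work callbacks) is hypothetical and
chosen only for this example; the value 10 mirrors the prior limit mentioned
above, and the time budget is an arbitrary assumption. The point is that if
the clock never advances, the iteration cap still ends the loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, for illustration only. */
#define MAX_RESTART 10  /* iteration cap (the prior limit discussed above) */
#define MAX_TICKS    2  /* time budget, in abstract clock ticks (assumed) */

typedef uint64_t (*clock_fn)(void);  /* returns the current tick count */
typedef bool (*work_fn)(void *ctx);  /* returns true if work is still pending */

/*
 * Call work() until it reports nothing pending, the iteration cap is hit,
 * or the time budget is used up. Returns the number of passes made and
 * sets *deferred to true when work was left over for later.
 */
int run_bounded(work_fn work, void *ctx, clock_fn now, bool *deferred)
{
    uint64_t deadline = now() + MAX_TICKS;
    int passes = 0;
    bool pending = true;

    while (pending) {
        pending = work(ctx);
        passes++;
        if (!pending)
            break;
        /* Either bound alone is enough to stop the loop. */
        if (passes >= MAX_RESTART || now() >= deadline)
            break;
    }
    *deferred = pending;
    return passes;
}

/* A clock that never advances, standing in for jiffies not being updated. */
uint64_t stuck_clock(void) { return 0; }

/* A clock that advances one tick per call. */
uint64_t tick;
uint64_t ticking_clock(void) { return tick++; }

/* Work source that always has more to do. */
bool always_pending(void *ctx) { (void)ctx; return true; }

/* Work source that finishes after *ctx passes. */
bool countdown(void *ctx) { int *left = ctx; return --(*left) > 0; }
```

With stuck_clock and always_pending the time check can never fire, so only
the count bound terminates the loop; with ticking_clock the time bound fires
first.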

At Google we use a patch that triggers a warning if a thread holds the CPU
for more than xx ms without honoring need_resched().


