Re: Revert "gro: Fix legacy path napi_complete crash"

From: Ingo Molnar
Date: Wed Mar 25 2009 - 03:34:27 EST



* David Miller <davem@xxxxxxxxxxxxx> wrote:

> From: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Date: Wed, 25 Mar 2009 08:23:03 +0800
>
> > On Tue, Mar 24, 2009 at 02:36:22PM -0700, David Miller wrote:
> > >
> > > I think the problem is that we need to do the GRO flush before the
> > > list delete and clearing the NAPI_STATE_SCHED bit.
> >
> > Well first of all GRO shouldn't even be on in Ingo's case, unless
> > he enabled it by hand with ethtool. Secondly the only thing that
> > touches the GRO state for the legacy path is process_backlog, and
> > since this is per-cpu, I can't see how another instance can run
> > while the first is still going.
>
> Right.
>
> I think the condition Ingo is running under is that both loopback
> (using legacy paths) and his NAPI based device (forcedeth) are
> processing a lot of packets at the same time.
>
> Another thing that seems to be critical is he can only trigger
> this on UP, which means that we don't have the damn APIC
> potentially moving the cpu target of the forcedeth interrupts
> around. And this means also that all the processing will be on
> one cpu's backlog queue only.

I tested the plain revert I sent in the original report overnight
(about 12 hours of combined testing time), and all systems held up
fine. The system that used to reproduce the bug within 10-20
iterations completed 210 successful iterations. The other systems
showed no problems either.
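
As a rough sanity check on how much 210 clean iterations say: assume
each iteration fails independently with a fixed probability somewhere
between 1/20 and 1/10. That range is an assumption read off "within
10-20 iterations"; the real per-iteration failure rate was not
measured, and the function name below is made up for illustration.
Under that assumption, the chance of 210 passes in a row by luck alone
can be sketched as:

```python
def chance_of_clean_run(p_fail: float, iterations: int) -> float:
    """Probability that `iterations` independent runs all pass.

    p_fail is an ASSUMED per-iteration failure probability; it is not
    a measured value from the test systems.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be within [0, 1]")
    return (1.0 - p_fail) ** iterations


if __name__ == "__main__":
    # Assumed bounds: one failure per 20 iterations and one per 10.
    for p_fail in (1 / 20, 1 / 10):
        chance = chance_of_clean_run(p_fail, 210)
        print(f"p_fail={p_fail:.2f}: chance of 210 clean runs = {chance:.1e}")
```

With either assumed rate the result is well below 1 in 10,000, so the
clean run is unlikely to be luck if the iterations really are
independent. It says nothing about what the actual cause of the bug is.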

So if there's no definitive resolution for the real cause of the
bug, the plain revert looks like an acceptable interim choice for
2.6.29.1, at least as far as my systems go.

Ingo