Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

From: Brian Haley
Date: Wed Mar 10 2010 - 21:10:38 EST


Michael Chan wrote:
> On Wed, 2010-03-10 at 15:09 -0800, Brian Haley wrote:
>> Brian Haley wrote:
>>> Hi Michael,
>>>
>>> Michael Chan wrote:
>>>> Do we have timers running in this environment? The timer in the bnx2
>>>> driver, bnx2_timer(), needs to run to provide a heart beat to the
>>>> firmware. In netpoll mode without timer interrupts, if we are regularly
>>>> calling the NAPI poll function, it should also be able to provide the
>>>> heartbeat. Without the heartbeat, the firmware will reset the chip and
>>>> result in the NETDEV WATCHDOG.
>>> We have also been seeing watchdog timeouts with bnx2, below is a
>>> stack trace with Benjamin's debug patch applied. Normally we were
>>> only seeing them under heavy load, but this one was at boot. We haven't
>>> tried the latest firmware/driver from 2.6.33 yet. You can contact me
>>> offline if you need more detailed info.
>> Following-up since I have more info on this issue.
>>
>> I'm able to cause a netdev_watchdog timeout by changing the coalesce
>> settings on my bnx2, I built a little test program for it:
>
> Do you run this program in a loop? How quickly do you see the NETDEV
> WATCHDOG?

It's run once, and we see the watchdog almost immediately after ETHTOOL_SCOALESCE.

>> ecoal.rx_coalesce_usecs = 0;
>> ecoal.rx_max_coalesced_frames = 1;
>> ecoal.rx_coalesce_usecs_irq = 0;
>> ecoal.rx_max_coalesced_frames_irq = 1;
>
> These rx settings should be ok. Did you change the tx settings? If the
> tx settings are all zeros, you won't get any TX interrupts and you can
> get a NETDEV WATCHDOG.

We did the read first, so the TX settings should be what they were originally.

> Run ethtool -c eth0 to see what the tx settings are. Thanks.

# ethtool -c eth0
Coalesce parameters for eth0:
Adaptive RX: off TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 0
rx-frames: 1
rx-usecs-irq: 0
rx-frames-irq: 1

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

If I run 'ethtool -c eth0' after the watchdog triggers, either the NIC
or the system hangs completely.

-Brian