RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang

From: Brandeburg, Jesse
Date: Tue Jan 15 2008 - 16:54:23 EST


slavon@xxxxxxxxxxxxx wrote:
> Quoting Frans Pop <elendil@xxxxxxxxx>:
>>> (Note this isn't the final correct patch we should apply. There is
>>> no reason why this revert back to the older ->poll() logic here
>>> should have any effect on the TX hang triggering...)
>>
>> s/no reason/no obvious reason/ ? ;-)

The tx code has an "early exit" that tries to limit the amount of tx
packets handled in a single poll loop and requires napi or interrupt
rescheduling based on the return value from e1000_clean_tx_irq.

see this code in e1000_clean_tx_irq

4005 #ifdef CONFIG_E1000_NAPI
4006 #define E1000_TX_WEIGHT 64
4007 > > /* weight of a sort for tx, to avoid endless
transmit cleanup */
4008 > > if (count++ == E1000_TX_WEIGHT) break;
4009 #endif

I think that is probably related. For a test you could apply the
original patch, and remove this "break" just by commenting out line
4008. This would guarantee all tx work is cleaned at every e1000_clean

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/