RE: Oops: 17 SMP ARM (v3.16-rc2)

From: Mattis Lorentzon
Date: Wed Aug 13 2014 - 09:39:35 EST


Fabio and Russell,

> In order to try to narrow down whether this is a board issue, could you try to
> run the same kernel on a mx6q development board, such as mx6qsabresd,
> cubox-i, wandboard, etc?

Indeed, we have a Sabrelite development board and have run the same kernel
configuration (please find attached). Russell's 30 FEC-related patches are applied.
We have also tried with and without the extended interrupts entry in the DT.

All our tests seem to behave the same way on the Sabrelite as on our own board.
Our working theory is that the switch (3Com Switch 4400) triggers a degradation
of the network stack from which Linux does not seem to recover, even if we later
bypass the switch and connect the board directly to the server machine.

Since the problem is stochastic in nature, we are not completely sure if we can
trigger the problem without the switch. It's the switch that allows us to run many
cards simultaneously and thus trigger the problem more easily. :-)

What are your thoughts?

Best regards,
Mattis Lorentzon


Attachment: config.gz
Description: config.gz