Re: tracing kernel panics

From: Bryan Donlan
Date: Sat Jun 18 2011 - 23:45:28 EST


On Sat, Jun 18, 2011 at 20:12, Shane
<software.research.development@xxxxxxxxx> wrote:
> Anyone offer advice on how I should go about tracking down this kernel
> panic? Apologies if I've got the wrong list, let me know.
>
> I'm developing a networking module, my own protocol over TCP. I load
> my module, I make a network connection, and then I close it. I do
> nothing more, then after about 1-2 minutes, it throws this panic
> output and none of it seems to come from my module code. I know my
> code is the problem, and if I don't run my modules, I never get
> panics. But nothing in the stack trace I recognize from my code and
> I'm having a very hard time finding where in my code I've gone wrong. I
> run my kernel modules in a VM as a guest OS which connects to another
> guest OS (currently using 2.6.36-r5).
>
> I've read how to analyze an OOPS, but ... here I can't even find the
> file that this might belong to so as to disassemble it, or understand
> what device this is. Any suggestions/pointers/advice much appreciated?
[...]
> [  330.829809]  [<ffffffff8147f763>] arp_process+0x1a8/0x4e9
> [  330.829809]  [<ffffffff8104b236>] ? pvclock_clocksource_read+0x4b/0xb4
> [  330.829809]  [<ffffffff8147f5bb>] ? arp_process+0x0/0x4e9

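First, try turning on frame pointers (CONFIG_FRAME_POINTER). Without
them the stack traces can be unreliable, as the entries marked '?' in
the trace above show: those are addresses found on the stack that the
unwinder could not confirm as part of the call chain.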
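
To make it easier to see which entries in a trace are suspect, here is
a rough sketch that splits trace lines into symbol, offset, function
size, and whether the entry carries the '?' marker. The regular
expression and the names parse_trace and SAMPLE are my own, written for
the line format quoted above; they are not part of any kernel tool.

```python
import re

# Matches entries such as:
#   [<ffffffff8104b236>] ? pvclock_clocksource_read+0x4b/0xb4
# The leading "[  330.829809]" timestamp is skipped because it lacks "[<".
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_trace(text):
    """Return (symbol, offset, size, reliable) for each trace entry found."""
    entries = []
    for line in text.splitlines():
        m = TRACE_RE.search(line)
        if m:
            entries.append((
                m.group("sym"),
                int(m.group("off"), 16),
                int(m.group("size"), 16),
                m.group("guess") is None,  # no '?' means the unwinder trusted it
            ))
    return entries


# The three lines quoted in this thread, used as sample input.
SAMPLE = """\
> [  330.829809]  [<ffffffff8147f763>] arp_process+0x1a8/0x4e9
> [  330.829809]  [<ffffffff8104b236>] ? pvclock_clocksource_read+0x4b/0xb4
> [  330.829809]  [<ffffffff8147f5bb>] ? arp_process+0x0/0x4e9
"""

for sym, off, size, reliable in parse_trace(SAMPLE):
    tag = "reliable" if reliable else "unverified (?)"
    print(f"{sym}+{off:#x}/{size:#x}  {tag}")
```

Only the first of the three quoted entries comes out as reliable; the
other two are the kind of noise that a frame-pointer build should
reduce.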

If you still need help after turning on frame pointers and getting a
clean trace, you may want to consider cross-posting to
linux-net@xxxxxxxxxxxxxxx and kernelnewbies@xxxxxxxxxxxxxxxxx as well.


Also, if you're just doing this as an exercise in learning kernel APIs,
ignore this. But if this is for a real application, please consider
implementing TCP-based protocols in userspace instead of the kernel;
for most use cases there's no real need to put them in the kernel
(with some exceptions, such as network filesystems).
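
As an illustration of how little is needed in userspace, here is a
minimal sketch of a message layer on top of a stream socket. The
framing (a 4-byte big-endian length followed by the payload) and the
names send_msg, recv_exact and recv_msg are hypothetical choices of
mine, not anything from your module.

```python
import socket
import struct

# Hypothetical framing: 4-byte unsigned big-endian length, then the payload.
HEADER = struct.Struct("!I")


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return fewer per recv()."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

The same functions work on a connected TCP socket or on a
socket.socketpair() for local testing, and a bug here kills one
process rather than panicking the machine.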