Re: boot failure, "DWARF2 unwinder stuck at 0xc0100199"

From: Jan Beulich
Date: Mon Aug 21 2006 - 02:45:27 EST

>>> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> 20.08.06 03:31 >>>
>As of 2.6.18-rc3, one of my test machines stopped booting. I'm not
>seeing the whole OOPS (I could probably set up a serial console if
>necessary), but it ends in something like:
>DWARF2 unwinder stuck at 0xc0100199
>Leftover inexact backtrace:
> =======================
> BUG: unable to handle kernel paging request at virtual address 0000b034
> printing eip:
>*pde = 00000000
>Recursive die() failure, output suppressed
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>Bisecting, it looks like this starts happening after c97d20a...,
>"[PATCH] i386: Do backtrace fallback too", though it's a little tricky
>since the compile is broken near there for a little while.
>Kernel config appended; let me know if anything else would be useful.

The 'stuck' unwinder issue at hand already has a fix, though it is
planned to be merged only for 2.6.19. The crash after switching to the
legacy stack trace code is bad, but it has little to do with the unwinder
additions/changes: the way that code reads the stack is simply
inappropriate in contexts where things must be expected to be broken.
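
To illustrate what a more cautious stack read could look like, here is a
minimal userspace sketch. It is not the kernel's code: struct stack_bounds,
word_in_bounds(), looks_like_text() and scan_stack() are hypothetical names
made up for this example, and the text range is passed in by the caller.
The only point is that every address is checked against known-good bounds
before it is dereferenced, so a corrupted stack pointer ends the scan
instead of causing a second fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the single memory range that is known to be safe to read. */
struct stack_bounds {
    uintptr_t low;   /* address of the lowest valid word */
    uintptr_t high;  /* address one past the highest valid word */
};

/* A word may be read only if it is aligned and lies fully inside the range. */
static int word_in_bounds(const struct stack_bounds *b, uintptr_t addr)
{
    if (addr % sizeof(uintptr_t) != 0)
        return 0;
    if (addr < b->low || addr >= b->high)
        return 0;
    return b->high - addr >= sizeof(uintptr_t);
}

/* Hypothetical stand-in for a "does this look like a code address" check. */
static int looks_like_text(uintptr_t v, uintptr_t text_start, uintptr_t text_end)
{
    return v >= text_start && v < text_end;
}

/*
 * Walk upwards from sp and collect every word that looks like a code
 * address. The scan stops as soon as sp leaves the known-good range,
 * so no address is dereferenced without having been validated first.
 * Returns the number of entries stored in out.
 */
static size_t scan_stack(const struct stack_bounds *b, const uintptr_t *sp,
                         uintptr_t text_start, uintptr_t text_end,
                         uintptr_t *out, size_t max_out)
{
    size_t n = 0;

    while (n < max_out && word_in_bounds(b, (uintptr_t)sp)) {
        uintptr_t v = *sp++;

        if (looks_like_text(v, text_start, text_end))
            out[n++] = v;
    }
    return n;
}
```

Called with a stack pointer outside the bounds (say, a small bogus value
like the 0000b034 above), scan_stack() returns 0 without touching memory.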

Finally, there is no visible correlation between the original problem (in
or from trace_hardirqs_on) and the unwinder. Once that problem is
fixed, you are not likely to see the recursive die() failure anymore either.

