Re: BUG-lockdep and freeze (was: Arrr! Linux 2.6.18)

From: Jan Beulich
Date: Wed Oct 04 2006 - 05:21:23 EST


>>> Linus Torvalds <torvalds@xxxxxxxx> 30.09.06 23:11 >>>
>It could - and _should_ dammit! - do some basic sanity tests like "is the
>thing even in the same stack page"? But nooo... It seems _designed_ to be
>fragile and broken.

Your opinion. Mine's different, as the unwind directives are specifically
meant to allow changing to a different stack page. Also, there shouldn't
be any potential for corrupt unwind data (once all annotations are
correct, the lack or incorrectness of which I'd rather attribute as bugs
of the old code or, in one case, the compiler) as long as it's being
properly write protected; corruption of the stack should be affecting
the quality of old and new style back traces in similar ways.

>Here's a simple test: if the next stack-slot isn't on the same page, the
>unwind information is bogus unless you had the IRQ stack-switch signature
>there. Does the code do that? No. It just assumes that unwind information
>is complete and perfect.
>
>That's not the kind of code we write in the kernel. In the kernel, we
>write code that _works_, regardless of the kind of horrible stuff people
>feed it. That's _doubly_ true for something like a stack frame debugger,
>which is invoced when there is trouble, and for all we know the stack
>itself MIGHT BE CORRUPT.
>
>In short, I think the stack unwinder is just _broken_. It has made all the
>wrong policy decisions - it only works when everything is perfect, yet
>it's actually meant to be _used_ when somethign bad happened. Doesn't that
>strike anybody else as a totally flawed design?

Again, a corrupt stack will not allow you getting reliable data out of the
old unwinder either. Even worse when you consider a stack overflow and
your request for range checks (or pointers into the stack) - you might not
get a stack trace then at all.

Jan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/