Re: BUG: KASAN: stack-out-of-bounds in unwind_next_frame+0x1df5/0x2650

From: Ivan Babrou
Date: Wed Feb 03 2021 - 18:31:57 EST


On Wed, Feb 3, 2021 at 3:28 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> On Wed, Feb 03, 2021 at 02:41:53PM -0800, Ivan Babrou wrote:
> > On Wed, Feb 3, 2021 at 11:05 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Feb 03, 2021 at 09:46:55AM -0800, Ivan Babrou wrote:
> > > > > Can you pretty please not line-wrap console output? It's unreadable.
> > > >
> > > > GMail doesn't make it easy, I'll send a link to a pastebin next time.
> > > > Let me know if you'd like me to regenerate the decoded stack.
> > > >
> > > > > > edfd9b7838ba5e47f19ad8466d0565aba5c59bf0 is the first bad commit
> > > > > > commit edfd9b7838ba5e47f19ad8466d0565aba5c59bf0
> > > > >
> > > > > Not sure what tree you're on, but that's not the upstream commit.
> > > >
> > > > I mentioned that it's a rebased core-static_call-2020-10-12 tag and
> > > > added a link to the upstream hash right below.
> > > >
> > > > > > Author: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> > > > > > Date: Tue Aug 18 15:57:52 2020 +0200
> > > > > >
> > > > > > tracepoint: Optimize using static_call()
> > > > > >
> > > > >
> > > > > There's a known issue with that patch, can you try:
> > > > >
> > > > > http://lkml.kernel.org/r/20210202220121.435051654@xxxxxxxxxxx
> > > >
> > > > I've tried it on top of core-static_call-2020-10-12 tag rebased on top
> > > > of v5.9 (to make it reproducible), and the patch did not help. Do I
> > > > need to apply the whole series or something else?
> > >
> > > Can you recreate with this patch, and add "unwind_debug" to the cmdline?
> > > It will spit out a bunch of stack data.
> >
> > Here's the tree I'm building:
> >
> > * https://github.com/bobrik/linux/tree/ivan/static-call-5.9
> >
> > It contains:
> >
> > * v5.9 tag as the base
> > * static_call-2020-10-12 tag
> > * dm-crypt patches to reproduce the issue with KASAN
> > * x86/unwind: Add 'unwind_debug' cmdline option
> > * tracepoint: Fix race between tracing and removing tracepoint
> >
> > The very same issue can be reproduced on 5.10.11 with no patches,
> > but I'm going with 5.9, since it boils down to static call changes.
> >
> > Here's the decoded stack from the kernel with unwind debug enabled:
> >
> > * https://gist.github.com/bobrik/ed052ac0ae44c880f3170299ad4af56b
> >
> > See my first email for the exact commands that trigger this.
>
> Thanks. Do you happen to have the original dmesg, before running it
> through the post-processing script?

Yes, here it is:

* https://gist.github.com/bobrik/8c13e6a02555fb21cadabb74cdd6f9ab

> I assume you're using decode_stacktrace.sh? It could use some
> improvement, it's stripping the function offset.
>
> Also spaces are getting inserted in odd places, messing up the alignment.
>
> [ 137.291837][ C0] ffff88809c409858: d7c4f3ce817a1700 (0xd7c4f3ce817a1700)
> [ 137.291837][ C0] ffff88809c409860: 0000000000000000 (0x0)
> [ 137.291839][ C0] ffff88809c409868: 00000000ffffffff (0xffffffff)
> [ 137.291841][ C0] ffff88809c409870: ffffffffa4f01a52 unwind_next_frame (arch/x86/kernel/unwind_orc.c:380 arch/x86/kernel/unwind_orc.c:553)
> [ 137.291843][ C0] ffff88809c409878: ffffffffa4f01a52 unwind_next_frame (arch/x86/kernel/unwind_orc.c:380 arch/x86/kernel/unwind_orc.c:553)
> [ 137.291844][ C0] ffff88809c409880: ffff88809c409ac8 (0xffff88809c409ac8)
> [ 137.291845][ C0] ffff88809c409888: 0000000000000086 (0x86)
>
> --
> Josh
>
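
For reading the raw dump without any post-processing, something like the
following might help. It is only a rough sketch, assuming the line layout
seen in the quoted excerpt above (timestamp, CPU tag, stack address, value,
annotation). The names DUMP_LINE, parse_dump_line and realign are made up
for illustration and are not part of decode_stacktrace.sh or the kernel
tree. It strips the dmesg prefix and re-emits each stack word with single
spaces, leaving the annotation untouched so nothing in it is dropped.

```python
import re

# Assumed layout, taken from the excerpt quoted in this thread:
#   [ <timestamp>][ <cpu>] <stack address>: <value> <annotation>
DUMP_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\[\s*(?P<cpu>[CT]\d+)\]\s+"
    r"(?P<addr>[0-9a-f]{16}):\s+(?P<value>[0-9a-f]{16})\s+(?P<note>.*)$"
)


def parse_dump_line(line):
    """Return (addr, value, note) for a stack dump line, or None if it does not match."""
    m = DUMP_LINE.match(line.strip())
    if m is None:
        return None
    return m.group("addr"), m.group("value"), m.group("note").strip()


def realign(lines):
    """Drop the dmesg prefix from stack dump lines; pass other lines through unchanged."""
    out = []
    for line in lines:
        parsed = parse_dump_line(line)
        if parsed is None:
            out.append(line.rstrip())
        else:
            addr, value, note = parsed
            # The annotation is kept verbatim, whatever the kernel printed there.
            out.append(f"{addr}: {value} {note}")
    return out


if __name__ == "__main__":
    import sys
    print("\n".join(realign(sys.stdin.read().splitlines())))
```

Usage would be along the lines of `python3 realign.py < dmesg.txt`, where
both file names are placeholders.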