Re: Suggestions on how to debug kernel crashes where printk and gdb both does not work

From: Pavel Skripkin
Date: Mon Jun 14 2021 - 10:25:22 EST


On Mon, 14 Jun 2021 22:19:10 +0800
Dongliang Mu <mudongliangabcd@xxxxxxxxx> wrote:

> On Mon, Jun 14, 2021 at 9:34 PM Pavel Skripkin <paskripkin@xxxxxxxxx>
> wrote:
> >
> > On Mon, 14 Jun 2021 21:22:43 +0800
> > Dongliang Mu <mudongliangabcd@xxxxxxxxx> wrote:
> >
> > > Dear kernel developers,
> > >
> > > I was recently trying to debug the crash "memory leak in
> > > hwsim_add_one" [1]. However, I ran into a frustrating issue: my
> > > breakpoints and printk/pr_alert calls in functions that are
> > > certain to be executed do not fire. The stack trace follows.
> > > I am writing to ask for suggestions on how to debug
> > > such cases.
> > >
> > > Thanks very much. Looking forward to your reply.
> > >
> >
> > Hi, Dongliang!
> >
> > This bug is not similar to the others on the dashboard. I spent some
> > time debugging it a week ago. The main problem here is that the
> > memory allocation happens at boot time:
> >
> > > [<ffffffff84359255>] kernel_init+0xc/0x1a7 init/main.c:1447
> >
>
> Oh, nice catch. No wonder my debugging does not work. :(
>
> > and the reproducer simply tries to
> > free this data. You can use ftrace to look at it. Something like this:
> >
> > $ echo 'hwsim_*' > $TRACE_DIR/set_ftrace_filter
>
> Thanks for your suggestion.
>
> Do you have any conclusions about this case? If you have found the
> root cause and started writing patches, I will turn my focus to other
> cases.

No, I had some busy days and I have nothing new on this bug for now.
I've just traced the reproducer execution and that's all :)

I guess some error handling paths are broken, but I'm not sure.
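
For reference, a rough sketch of that kind of tracing. It assumes tracefs
is mounted at /sys/kernel/tracing (older setups expose it at
/sys/kernel/debug/tracing), that you are root, and that the hwsim_*
functions are visible to the function tracer on your kernel. The helper
name trace_hwsim and the ./repro binary are hypothetical, not something
from the syzbot report:

```shell
#!/bin/sh
# Assumption: tracefs mount point; override TRACE_DIR if yours differs.
TRACE_DIR=${TRACE_DIR:-/sys/kernel/tracing}

# trace_hwsim CMD [ARGS...]
# Hypothetical helper: run CMD while the function tracer is limited
# to functions matching hwsim_*, then print the recorded trace.
trace_hwsim() {
    echo 'hwsim_*' > "$TRACE_DIR/set_ftrace_filter"  # restrict traced functions
    echo function  > "$TRACE_DIR/current_tracer"     # plain function tracer
    echo 1 > "$TRACE_DIR/tracing_on"                 # start recording
    "$@"                                             # run the reproducer
    echo 0 > "$TRACE_DIR/tracing_on"                 # stop recording
    cat "$TRACE_DIR/trace"                           # dump what was called
}
```

Usage would be something like `trace_hwsim ./repro`. Note that this only
shows calls made while the reproducer runs; the allocation done at boot
time will not appear in the trace.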


>
> BTW, I only found another possible memory leak after some manual code
> review [1]. However, it is not the root cause for this crash.
>
> [1] https://lkml.org/lkml/2021/6/10/1297
>
> >
> > would work.
> >
> >
> > With regards,
> > Pavel Skripkin




With regards,
Pavel Skripkin