Re: arm: TI BeagleBoard X15 : Unable to handle kernel NULL pointer dereference at virtual address 00000369 - Internal error: Oops: 5 [#1] SMP ARM

From: Arnd Bergmann
Date: Fri Nov 11 2022 - 03:45:50 EST


On Fri, Nov 11, 2022, at 07:28, Naresh Kamboju wrote:
> On Thu, 10 Nov 2022 at 03:33, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>>
>> One more idea I had is the unwinder: since this kernel is built
>> with the frame-pointer unwinder, I think the stack usage per
>> function is going to be slightly larger than with the arm unwinder.
>>
>> Naresh, how hard is it to reproduce this bug intentionally?
>> Can you check whether it still happens if you change the .config
>> to use these?
>>
>> # CONFIG_FUNCTION_GRAPH_TRACER is not set
>> # CONFIG_UNWINDER_FRAME_POINTER is not set
>> CONFIG_UNWINDER_ARM=y
>
> I ran this experiment, and the reported crash did not reproduce
> in eight rounds of testing [1].
>
> [1] https://lkft.validation.linaro.org/scheduler/job/5835922#L1993

Ok, good to hear. In this case, I see three possible ways forward
to prevent this from coming back on your system:

a) Use asynchronous probing for one or more of the drivers as
Dmitry suggested. This means fixing it upstream first and then
backporting the fix to all stable kernels. We should probably
do this anyway, but this will need more testing on your side.

b) Change your kernel config permanently with the options above,
if LKFT does not actually rely on CONFIG_FUNCTION_GRAPH_TRACER.
I don't know if it does.

c) Backport commit 41918ec82eb6 ("ARM: ftrace: enable the graph
tracer with the EABI unwinder") from 5.17. This was part of
a longer series from Ard, and while the patch itself looks
simple enough to be backported, I suspect we'd have to
backport the entire series, which is probably not going to
be realistic. Ard, any comments on this?
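
For option b), a quick way to confirm that a given build really carries
the three settings from the experiment is to check its .config text.
The following is only a minimal sketch: the helper names parse_config()
and mismatches() are made up for illustration and are not part of any
LKFT or kernel tooling, and it assumes an option missing from the file
counts as disabled.

```python
import re

# Settings from the experiment quoted above:
# True means "=y", False means "is not set".
WANTED = {
    "CONFIG_FUNCTION_GRAPH_TRACER": False,
    "CONFIG_UNWINDER_FRAME_POINTER": False,
    "CONFIG_UNWINDER_ARM": True,
}


def parse_config(text):
    """Map option name to True ('=y') or False ('# ... is not set')."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        enabled = re.fullmatch(r"(CONFIG_\w+)=y", line)
        if enabled:
            state[enabled.group(1)] = True
            continue
        disabled = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if disabled:
            state[disabled.group(1)] = False
    return state


def mismatches(text, wanted=WANTED):
    """Return the sorted names of options that differ from `wanted`.

    Assumption: an option that does not appear at all is treated as
    not set.
    """
    state = parse_config(text)
    return sorted(
        name for name, want in wanted.items()
        if state.get(name, False) != want
    )
```

Calling mismatches(open(".config").read()) on the build's config should
return an empty list when all three settings are in place.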

Arnd