Re: radeon.ko/i586: BUG: kernel NULL pointer dereference,address:00000004

From: Linux regression tracking (Thorsten Leemhuis)
Date: Tue Aug 29 2023 - 08:09:03 EST


Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

I still have this issue on my list of tracked regressions.

Was this fixed in between? Doesn't look like it from here, but I might
be missing something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 23.07.23 16:32, Steven Rostedt wrote:
> On Sun, 23 Jul 2023 20:55:06 +0900
> <kkabe@xxxxxxxxxxx> wrote:
>
>> So I tried to trap NULL and return:
>>
>> ================ patch-drm_vblank_cancel_pending_works-printk-NULL-ret.patch
>> diff -up ./drivers/gpu/drm/drm_vblank_work.c.pk2 ./drivers/gpu/drm/drm_vblank_work.c
>> --- ./drivers/gpu/drm/drm_vblank_work.c.pk2 2023-06-06 20:50:40.000000000 +0900
>> +++ ./drivers/gpu/drm/drm_vblank_work.c 2023-07-23 14:29:56.383093673 +0900
>> @@ -71,6 +71,10 @@ void drm_vblank_cancel_pending_works(str
>>  {
>>  	struct drm_vblank_work *work, *next;
>>
>> +	if (!vblank->dev) {
>> +		printk(KERN_WARNING "%s: vblank->dev == NULL? returning\n", __func__);
>> +		return;
>> +	}
>>  	assert_spin_locked(&vblank->dev->event_lock);
>>
>>  	list_for_each_entry_safe(work, next, &vblank->pending_work, node) {
>> ================
>>
>> This time, the printk trap does not happen!! and radeon.ko works.
>> (The NULL check for vblank->worker is still firing, though.)
>>
>> Now this is puzzling.
>> Is this a timing issue?
>
> It could very well be. And the ftrace patch could possibly not be the
> cause at all. But the thread that is created to do the work is causing
> the race window to be opened up, which is why you see it with the patch
> and don't without it. It may not be the problem, it may just tickle the
> timings enough to trigger the bug, and is causing you to go on a wild
> goose chase in the wrong direction.
>
> -- Steve
>
>
>> Is systemd-udevd doing something not favorable to the kernel?
>> Is drm vblank code running without enough initialization?
>>
>> What is puzzling is that purely userland activity
>> (logging in on tty1 before radeon.ko loads)
>> affects whether the kernel panics or not.
>
>
>
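
For readers following along: a fault address of 00000004 is what you would
expect when a field 4 bytes into a structure is read through a NULL pointer
on a 32-bit machine. The userspace sketch below only illustrates that
arithmetic and the shape of the early-return guard from the quoted patch.
The structures fake_dev and fake_vblank and both function names are
hypothetical stand-ins, not the real DRM types or their layout, and the
thread does not establish which field is actually involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; NOT the real DRM structures or their layout. */
struct fake_dev {
	uint32_t flags;      /* offset 0 */
	uint32_t event_lock; /* offset 4 */
};

struct fake_vblank {
	struct fake_dev *dev; /* assumed to be NULL until setup has finished */
};

/* Address that a read of dev->event_lock would touch if dev were NULL. */
size_t fault_address_for_null_dev(void)
{
	return offsetof(struct fake_dev, event_lock);
}

/*
 * Same shape as the debugging patch: return early instead of
 * dereferencing a NULL dev. Returns -1 on early return, 0 otherwise.
 */
int cancel_pending_guarded(const struct fake_vblank *vblank)
{
	if (!vblank->dev) {
		fprintf(stderr, "%s: vblank->dev == NULL? returning\n", __func__);
		return -1;
	}
	(void)vblank->dev->event_lock; /* safe: dev is known to be non-NULL here */
	return 0;
}
```

Note that if the pointer can be set or cleared concurrently, a check like
this only narrows the window rather than closing it, which fits the point
above that an unrelated change in timing can make the problem appear or
disappear.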