Re: Re: [linus:master] [sched/eevdf] 2227a957e1: BUG:kernel_NULL_pointer_dereference,address

From: Abel Wu
Date: Wed Jan 31 2024 - 07:28:47 EST


On 1/31/24 8:10 PM, Tiwei Bie wrote:
On 1/30/24 6:13 PM, Abel Wu wrote:
On 1/30/24 3:24 PM, kernel test robot wrote:

[  512.079810][ T8305] BUG: kernel NULL pointer dereference, address: 0000002c
[  512.080897][ T8305] #PF: supervisor read access in kernel mode
[  512.081636][ T8305] #PF: error_code(0x0000) - not-present page
[  512.082337][ T8305] *pde = 00000000
[  512.082829][ T8305] Oops: 0000 [#1] PREEMPT SMP
[  512.083407][ T8305] CPU: 1 PID: 8305 Comm: watchdog Tainted: G        W        N 6.7.0-rc1-00006-g2227a957e1d5 #1 819e6d1a8b887f5f97adb4aed77d98b15504c836
[  512.084986][ T8305] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  512.086203][ T8305] EIP: set_next_entity (fair.c:?)

There was actually a NULL test in pick_eevdf() before this commit,
but I removed it intentionally, as I found the result could not be NULL
after examining 'all' the cases.

Also cc Tiwei who once proposed to add this check back.
https://lore.kernel.org/all/20231208112100.18141-1-tiwei.btw@xxxxxxxxxxxx/

Thanks for cc'ing me. That's the case I worried about and why I thought
it might be worthwhile to add the sanity check back. I just sent out a
new version of the above patch with updated commit log and error message.

I assume the real problem is why it *can* be NULL in the first place.
IMHO the NULL check with a fallback selection doesn't solve that, but
it does avoid a kernel panic, which is absolutely important.
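
To make the "NULL check with a fallback selection" idea concrete, here is a
minimal userspace sketch. It is not the kernel code and not the patch itself:
struct entity, pick_eligible(), pick_leftmost() and pick_with_fallback() are
hypothetical names, eligibility is a precomputed flag rather than derived from
lag, and falling back to the smallest vruntime is an assumption made for
illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a scheduling entity (not struct sched_entity). */
struct entity {
	const char *name;
	long vruntime;
	long deadline;
	int eligible;	/* assumption: precomputed flag for this sketch */
};

/* Earliest-deadline pick among eligible entities; NULL if none is eligible. */
static struct entity *pick_eligible(struct entity *q, size_t n)
{
	struct entity *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!q[i].eligible)
			continue;
		if (!best || q[i].deadline < best->deadline)
			best = &q[i];
	}
	return best;
}

/* Fallback candidate: the entity with the smallest vruntime, or NULL if empty. */
static struct entity *pick_leftmost(struct entity *q, size_t n)
{
	struct entity *left = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!left || q[i].vruntime < left->vruntime)
			left = &q[i];
	}
	return left;
}

/* Sanity check: never hand NULL to the caller while the queue is non-empty. */
static struct entity *pick_with_fallback(struct entity *q, size_t n)
{
	struct entity *se = pick_eligible(q, n);

	if (!se) {
		se = pick_leftmost(q, n);
		if (se)
			fprintf(stderr, "no eligible entity, falling back to %s\n",
				se->name);
	}
	return se;
}
```

The fallback only hides the symptom: the warning still fires whenever the
primary pick comes back empty, so the question of why that can happen stays
visible in the log.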