Re: [linus:master] [sched/eevdf] 2227a957e1: BUG:kernel_NULL_pointer_dereference,address

From: Honglei Wang
Date: Wed Jan 31 2024 - 21:53:00 EST

On 2024/2/1 09:54, Oliver Sang wrote:
hi, Honglei,

On Thu, Feb 01, 2024 at 09:29:30AM +0800, Honglei Wang wrote:


On 2024/1/30 22:09, Oliver Sang wrote:
hi, Abel,

On Tue, Jan 30, 2024 at 06:13:32PM +0800, Abel Wu wrote:
On 1/30/24 3:24 PM, kernel test robot wrote:


Hello,

(besides a previous performance report),
kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: 2227a957e1d5b1941be4e4207879ec74f4bb37f8 ("sched/eevdf: Sort the rbtree by virtual deadline")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master 3a5879d495b226d0404098e3564462d5f1daa33b]
[test failed on linux-next/master 01af33cc9894b4489fb68fa35c40e9fe85df63dc]

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429

Hi Oliver,

I'm a bit curious: did the problem happen on i386 only? Did you hit it on
x86_64 or any other platform with the same trinity testcases?

we have not observed the same issue on x86_64 so far.

we can run performance tests with this commit on x86_64 (compiled with gcc-12).
FYI, we sent out a performance report before this crash one.

https://lore.kernel.org/all/202401292151.829b01b0-oliver.sang@xxxxxxxxx/


Thanks for the feedback. The performance improvement is as expected. I assume the panic was not introduced by 2227a957e1. The "EEVDF scheduling fail, picking leftmost" messages were already appearing before this patch, and we still don't know where they come from.

It would be great if we could find a way to reproduce the problem. It seems worth trying on a VM with fewer CPUs and a somewhat heavier syscall workload, such as the trinity testcase.


Thanks,
Honglei