Re: [PATCH 5.15 000/159] 5.15.116-rc1 review

From: Guenter Roeck
Date: Sat Jun 10 2023 - 17:14:13 EST


Hi,

On 6/10/23 12:23, Pavel Machek wrote:
Hi!

Build results:
total: 155 pass: 155 fail: 0
Qemu test results:
total: 499 pass: 498 fail: 1
Failed tests:
arm:kudo-bmc:multi_v7_defconfig:npcm:usb0.1:nuvoton-npcm730-kudo:rootfs

The test failure is spurious and not new. I observe it randomly on
multi_v7_defconfig builds, primarily on npcm platforms. There is no error
message, just a stalled boot. I have been trying to bisect for a while,
but I have not been successful so far. No immediate concern; I just wanted
to mention it in case someone else hits the same or a similar problem.


I managed to revise my bisect script sufficiently to get reliable
results. It looks like the culprit is commit 503e554782c9 ("debugobject:
Ensure pool refill (again)"); see bisect log below. Bisects on four
different systems all gave the same result. After reverting this patch,
I do not see the problem anymore (again, confirmed on four different
systems). If anyone has an idea how to debug this, please let me know.
I'll be happy to give it a try.
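
For anyone trying to reproduce this: a random stall like this one only
bisects reliably if each commit is booted many times before it is called
good. The following is a minimal sketch of such a helper, not the actual
script used here. The function names, the attempt count, and the timeout
are all assumptions; the only fixed facts are the exit codes that
"git bisect run" interprets (0 = good, 1 = bad, 125 = skip).

```python
import subprocess
import sys

# Exit codes interpreted by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def boot_once(cmd, timeout):
    """Return True if cmd exits with status 0 within `timeout` seconds.

    A timeout is treated as a failure, since the symptom is a stalled
    boot without any error message.
    """
    try:
        return subprocess.run(cmd, timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        return False


def bisect_verdict(boot, attempts):
    """Call boot() up to `attempts` times.

    A single failure marks the commit as bad. The commit is only
    reported as good if every attempt succeeds.
    """
    for _ in range(attempts):
        if not boot():
            return BAD
    return GOOD
```

A wrapper would pass its own boot command (hypothetical here), for example
sys.exit(bisect_verdict(lambda: boot_once(cmd, 300), 20)), and return SKIP
for commits that fail to build. The number of attempts has to be high
enough that a bad commit is unlikely to pass every run by chance.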

You may want to comment out debug_objects_fill_pool() in
debug_object_activate() or debug_object_assert_init() to see which one is
causing the failure...

CONFIG_PREEMPT_RT is disabled for you, right? (Should 5.15 even have
that option?)


CONFIG_PREEMPT_RT is disabled (it depends on ARCH_SUPPORTS_RT which is not
enabled by any architecture in v5.15.y).

The added call in debug_object_activate() triggers the problem.
Any idea what to do about it or how to debug it further?

Thanks,
Guenter