Re: [PATCH v5 6/9] slub: Delay freezing of partial slabs

From: Mark Brown
Date: Tue Nov 21 2023 - 13:21:49 EST


On Tue, Nov 21, 2023 at 11:47:26PM +0800, Chengming Zhou wrote:

> Ah yes, there is no NMI on ARM, so CPU 3 may be running somewhere with
> interrupts disabled. I searched the full log, but still don't have a clue.
> And there isn't any WARNING or BUG related to SLUB in the log.

Yeah, nor anything else particularly. I tried turning on some debug
options:

CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_WQ_WATCHDOG=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

https://validation.linaro.org/scheduler/job/4017828

which has some additional warnings related to clock changes but AFAICT
those come from today's -next rather than the debug stuff:

https://validation.linaro.org/scheduler/job/4017823

so that's not super helpful.
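
For anyone wanting to apply the same set of options locally, here is a
minimal sketch using a config fragment. It assumes it is run from the top
of a kernel tree that already has a .config; the file name debug.config is
made up for illustration, and the option names are copied verbatim from
the list above.

```shell
#!/bin/sh
# Write the debug options listed above into a fragment file.
# "debug.config" is a hypothetical name chosen for this sketch.
cat > debug.config <<'EOF'
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_WQ_WATCHDOG=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
EOF

# Merge the fragment into an existing .config, but only when run from the
# top of a kernel tree. ARCH=arm is an assumption matching the 32-bit
# multi_v7_defconfig build mentioned in this thread.
if [ -x scripts/kconfig/merge_config.sh ] && [ -f .config ]; then
    ARCH=arm scripts/kconfig/merge_config.sh .config debug.config
else
    echo "not in a configured kernel tree; wrote debug.config only"
fi
```

merge_config.sh reports any requested value that does not end up in the
final .config, which is a quick way to spot an option that is unknown to
Kconfig or has unmet dependencies.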

> I wonder how to reproduce it locally with a Qemu VM since I don't have
> the ARM machine.

There are sample qemu jobs available from, for example, KernelCI:

https://storage.kernelci.org/next/master/next-20231120/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-qemu_arm-virt-gicv3.html

(includes the command line, though it's not using Debian testing like my
test was). Note that I'm testing a bunch of platforms with the same
kernel/rootfs combination and it was only the Raspberry Pi 3 which blew
up. It is a bit tight for memory which might have some influence?
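
A rough sketch of a local qemu run along those lines follows. This is not
the command line from the KernelCI job: the kernel and initrd paths are
placeholders, and the -smp 4 / -m 1G values are assumptions chosen to
roughly match a Raspberry Pi 3 (four cores, 1 GB of RAM) so that memory is
similarly tight.

```shell
#!/bin/sh
# Placeholder paths; override via the environment to point at real files.
KERNEL=${KERNEL:-arch/arm/boot/zImage}
INITRD=${INITRD:-rootfs.cpio.gz}
# Assumption: 1G approximates the Raspberry Pi 3; lower it to add pressure.
MEM=${MEM:-1G}

# Build the command for a 32-bit ARM "virt" machine with a GICv3,
# similar in spirit to the qemu_arm-virt-gicv3 baseline job.
set -- qemu-system-arm \
    -machine virt,gic-version=3 -cpu cortex-a15 \
    -smp 4 -m "$MEM" -nographic \
    -kernel "$KERNEL" -initrd "$INITRD" \
    -append "console=ttyAMA0"

QEMU_CMD="$*"
echo "$QEMU_CMD"

# Only boot if qemu and both images are actually present.
if command -v qemu-system-arm >/dev/null 2>&1 \
        && [ -f "$KERNEL" ] && [ -f "$INITRD" ]; then
    "$@"
fi
```

Since only the Raspberry Pi 3 failed, a clean boot under qemu would not
rule much out, but a hang with a small -m value would at least point
towards memory pressure.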

I'm really suspecting this may have made some underlying platform bug
more obvious :/
