Re: [PATCH 5.19 000/101] 5.19.13-rc1 review

From: Feng Tang
Date: Wed Oct 05 2022 - 05:39:20 EST


On Tue, Oct 04, 2022 at 12:18:05PM +0530, Naresh Kamboju wrote:
> On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman
> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > This is the start of the stable review cycle for the 5.19.13 release.
> > There are 101 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Results from Linaro's test farm.
> No regressions on arm64, arm, x86_64, and i386.
>
> Tested-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
>
> NOTE:
> 1) Build warning
> 2) Boot warning on qemu-arm64 with KASAN and Kunit test
> Suspecting one of the recently commits causing this warning and
> need to bisect to confirm the commit id.
> mm/slab_common: fix possible double free of kmem_cache
> [ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]

Hi Naresh Kamboju,

Thanks for the report!

Could you try reverting the commit and re-test it to confirm?

Also could you provide the kernel dmesg of the failure and the
kernel config of the test?

I locally pulled the linux-stable source and used QEMU to test
it with kasan/kfence enabled, but could not reproduce it (I
only have x86 HW at hand).

> 2) Following kernel boot warning noticed on qemu-arm64 with KASAN and
> KUNIT enabled [1]
>
> [ 177.651182] ------------[ cut here ]------------
> [ 177.652217] kmem_cache_destroy test: Slab cache still has
> objects when called from test_exit+0x28/0x40
> [ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520
> kmem_cache_destroy+0x1e8/0x20c
> [ 177.666237] Modules linked in:
> [ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B
> 5.19.13-rc1 #1
> [ 177.668666] Hardware name: linux,dummy-virt (DT)
> [ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT
> -SSBS BTYPE=--)
> [ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c
> [ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c
> [ 177.691598] Call trace:
> [ 177.692165] kmem_cache_destroy+0x1e8/0x20c
> [ 177.693196] test_exit+0x28/0x40
> [ 177.694158] kunit_catch_run_case+0x5c/0x120
> [ 177.695177] kunit_try_catch_run+0x144/0x26c
> [ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0
> [ 177.697353] kunit_run_tests+0x374/0x750
> [ 177.698333] __kunit_test_suites_init+0x74/0xa0
> [ 177.699386] kunit_run_all_tests+0x160/0x380
> [ 177.700428] kernel_init_freeable+0x32c/0x388
> [ 177.701497] kernel_init+0x2c/0x150
> [ 177.702347] ret_from_fork+0x10/0x20
> [ 177.703308] ---[ end trace 0000000000000000 ]---
>
> [1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1SusUcnAfamULqzkdUA

I also tried the reproduce command from the above link:

tuxrun --runtime podman --device qemu-arm64 --kernel https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/Image.gz --modules https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/modules.tar.xz --rootfs https://storage.lkft.org/rootfs/oe-kirkstone/20220824-114729/juno/lkft-tux-image-juno-20220824120304.rootfs.ext4.gz --parameters SKIPFILE=skipfile-lkft.yaml --image docker.io/lavasoftware/lava-dispatcher:2022.06 --tests kunit --timeouts boot=30

That also didn't reproduce it, but it did hit some RCU stall problems
(which could be related to running on x86 HW):

[ 321.006279] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 321.007281] ffff0000074c2300: 00 07 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 321.009283] rcu: 0-...0: (1 GPs behind) idle=40f/1/0x4000000000000000 softirq=436/437 fqs=5

[ 321.024995] rcu: rcu_preempt kthread timer wakeup didn't happen for 4464 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 321.026343] rcu: Possible timer handling issue on cpu=1 timer-softirq=1426
[ 321.027340] rcu: rcu_preempt kthread starved for 4465 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 321.028517] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 321.029488] rcu: RCU grace-period kthread stack dump:
[ 321.030251] task:rcu_preempt state:I stack: 0 pid: 16 ppid: 2 flags:0x00000008
[ 321.031434] Call trace:
[ 321.031878] __switch_to+0x140/0x1e0
[ 321.032565] __schedule+0x4f4/0xc74
[ 321.033228] schedule+0x88/0x13c
[ 321.033915] schedule_timeout+0x104/0x2b0
[ 321.034646] rcu_gp_fqs_loop+0x1a0/0x784
[ 321.035119] rcu_gp_kthread+0x278/0x3a0
[ 321.035608] kthread+0x160/0x170
[ 339.882465] ret_from_fork+0x10/0x20
[ 339.883898] rcu: Stack dump where RCU GP kthread last ran:

The full .xz log is attached.

Thanks,
Feng

Attachment: stable-k519-kunit.log.xz
Description: application/xz