Re: [PATCH/RFC] debugobjects/slub: Print slab info and backtrace.

From: kernel test robot
Date: Thu Nov 09 2023 - 00:21:16 EST




Hello,

kernel test robot noticed "WARNING:possible_recursive_locking_detected" on:

commit: d132b3de19193c996402718d76e63f797d4259a9 ("[PATCH/RFC] debugobjects/slub: Print slab info and backtrace.")
url: https://github.com/intel-lab-lkp/linux/commits/greearb-candelatech-com/debugobjects-slub-Print-slab-info-and-backtrace/20231103-093945
base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
patch link: https://lore.kernel.org/all/20231103013704.1232723-1-greearb@xxxxxxxxxxxxxxx/
patch subject: [PATCH/RFC] debugobjects/slub: Print slab info and backtrace.

in testcase: boot

compiler: gcc-7
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


the issue doesn't always happen for this commit in our tests (shows 39 times
in 50 runs), but keeps clean on parent.

90f055df112162fd d132b3de19193c996402718d76e
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:51 76% 39:50 dmesg.WARNING:possible_recursive_locking_detected
:51 76% 39:50 dmesg.WARNING:possible_recursive_locking_detected_swapper_is_trying_to_acquire_lock:at:__debug_check_no_obj_freed_but_task_is_already_holding_lock:at:__debug_object_init/0x



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202311091331.f2eaf4f7-oliver.sang@xxxxxxxxx


[ 2.997328][ T1] WARNING: possible recursive locking detected
[ 2.997927][ T1] 6.6.0-rc4-00006-gd132b3de1919 #1 Tainted: G T
[ 2.997927][ T1] --------------------------------------------
[ 2.997927][ T1] swapper/1 is trying to acquire lock:
[ 2.997927][ T1] ffffffff85425790 (&obj_hash[i].lock){..-.}-{2:2}, at: __debug_check_no_obj_freed (lib/debugobjects.c:1057)
[ 2.997927][ T1]
[ 2.997927][ T1] but task is already holding lock:
[ 2.997927][ T1] ffffffff8531abc0 (&obj_hash[i].lock){..-.}-{2:2}, at: __debug_object_init (lib/debugobjects.c:528 lib/debugobjects.c:667)
[ 2.997927][ T1]
[ 2.997927][ T1] other info that might help us debug this:
[ 2.997927][ T1] Possible unsafe locking scenario:
[ 2.997927][ T1]
[ 2.997927][ T1] CPU0
[ 2.997927][ T1] ----
[ 2.997927][ T1] lock(&obj_hash[i].lock);
[ 2.997927][ T1]
[ 2.997927][ T1] *** DEADLOCK ***
[ 2.997927][ T1]
[ 2.997927][ T1] May be due to missing lock nesting notation
[ 2.997927][ T1]
[ 2.997927][ T1] 1 lock held by swapper/1:
[ 2.997927][ T1] #0: ffffffff8531abc0 (&obj_hash[i].lock){..-.}-{2:2}, at: __debug_object_init (lib/debugobjects.c:528 lib/debugobjects.c:667)
[ 2.997927][ T1]
[ 2.997927][ T1] stack backtrace:
[ 2.997927][ T1] CPU: 0 PID: 1 Comm: swapper Tainted: G T 6.6.0-rc4-00006-gd132b3de1919 #1 2ef50b864013db82ef063ed2d8014756ac6c0996
[ 2.997927][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 2.997927][ T1] Call Trace:
[ 2.997927][ T1] <TASK>
[ 2.997927][ T1] __lock_acquire (kernel/locking/lockdep.c:5137)
[ 2.997927][ T1] ? local_clock (arch/x86/include/asm/preempt.h:85 kernel/sched/clock.c:316)
[ 2.997927][ T1] ? lock_release (kernel/locking/lockdep.c:223 kernel/locking/lockdep.c:352 kernel/locking/lockdep.c:5435 kernel/locking/lockdep.c:5773)
[ 2.997927][ T1] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5755)
[ 2.997927][ T1] ? __debug_check_no_obj_freed (lib/debugobjects.c:1057)
[ 2.997927][ T1] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[ 2.997927][ T1] ? __debug_check_no_obj_freed (lib/debugobjects.c:1057)
[ 2.997927][ T1] __debug_check_no_obj_freed (lib/debugobjects.c:1057)
[ 2.997927][ T1] ? local_clock (arch/x86/include/asm/preempt.h:85 kernel/sched/clock.c:316)
[ 2.997927][ T1] ? lock_release (kernel/locking/lockdep.c:223 kernel/locking/lockdep.c:352 kernel/locking/lockdep.c:5435 kernel/locking/lockdep.c:5773)
[ 2.997927][ T1] free_unref_page_prepare (mm/page_alloc.c:1146 mm/page_alloc.c:2312)
[ 2.997927][ T1] ? find_held_lock (kernel/locking/lockdep.c:5243)
[ 2.997927][ T1] free_unref_page (mm/page_alloc.c:2405)
[ 2.997927][ T1] __stack_depot_save (lib/stackdepot.c:443)
[ 2.997927][ T1] set_track_prepare (mm/kmemleak.c:620)
[ 2.997927][ T1] ? pm_runtime_init (drivers/base/power/runtime.c:1731)
[ 2.997927][ T1] ? device_initialize (drivers/base/core.c:3102)
[ 2.997927][ T1] ? device_register (drivers/base/core.c:3706)
[ 2.997927][ T1] ? subsys_register (drivers/base/bus.c:1230)
[ 2.997927][ T1] ? container_dev_init (drivers/base/container.c:39)
[ 2.997927][ T1] ? kernel_init_freeable (init/main.c:1327 init/main.c:1547)
[ 2.997927][ T1] ? kernel_init (init/main.c:1439)
[ 2.997927][ T1] ? ret_from_fork (arch/x86/kernel/process.c:153)
[ 2.997927][ T1] ? ret_from_fork_asm (arch/x86/entry/entry_64.S:312)
[ 2.997927][ T1] lookup_object_or_alloc (lib/debugobjects.c:308 lib/debugobjects.c:624)
[ 2.997927][ T1] __debug_object_init (lib/debugobjects.c:672)
[ 2.997927][ T1] pm_runtime_init (drivers/base/power/runtime.c:1731)
[ 2.997927][ T1] device_initialize (drivers/base/core.c:3102)
[ 2.997927][ T1] device_register (drivers/base/core.c:3706)
[ 2.997927][ T1] subsys_register (drivers/base/bus.c:1230)
[ 2.997927][ T1] container_dev_init (drivers/base/container.c:39)
[ 2.997927][ T1] kernel_init_freeable (init/main.c:1327 init/main.c:1547)
[ 2.997927][ T1] ? wait_for_completion (kernel/sched/completion.c:149)
[ 2.997927][ T1] ? rest_init (init/main.c:1429)
[ 2.997927][ T1] kernel_init (init/main.c:1439)
[ 2.997927][ T1] ret_from_fork (arch/x86/kernel/process.c:153)
[ 2.997927][ T1] ? rest_init (init/main.c:1429)
[ 2.997927][ T1] ret_from_fork_asm (arch/x86/entry/entry_64.S:312)
[ 2.997927][ T1] </TASK>
[ 3.000434][ T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 3.001966][ T1] futex hash table entries: 256 (order: 2, 24576 bytes, linear)



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231109/202311091331.f2eaf4f7-oliver.sang@xxxxxxxxx



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki