Re: [PATCH v12 00/25] Linux RISC-V AIA Support

From: Andrea Parri
Date: Thu Feb 08 2024 - 05:11:32 EST


Hi Anup,

I understand that some refactoring is in progress, but I don't see the
report below; adding it here hoping that it can be useful towards v13.
(Unfortunately, I didn't have enough time to debug this yet...)


> irqchip/sifive-plic: Convert PLIC driver into a platform driver

I'm seeing the following LOCKDEP warning with this series, bisected to
the commit above. This is a defconfig + PROVE_LOCKING=y build, booted
using -machine virt,aia=none.

[ 0.953473] ========================================================
[ 0.953704] WARNING: possible irq lock inversion dependency detected
[ 0.953955] 6.8.0-rc1-00039-gd9b9d6eb987f #1122 Not tainted
[ 0.954224] --------------------------------------------------------
[ 0.954444] swapper/0/0 just changed the state of lock:
[ 0.954664] ffffaf808109d0c8 (&irq_desc_lock_class){-...}-{2:2}, at: handle_fasteoi_irq+0x24/0x1da
[ 0.955699] but this lock took another, HARDIRQ-unsafe lock in the past:
[ 0.955942] (&handler->enable_lock){+.+.}-{2:2}
[ 0.955974]

and interrupts could create inverse lock ordering between them.

[ 0.956507]
other info that might help us debug this:
[ 0.956775] Possible interrupt unsafe locking scenario:

[ 0.956998] CPU0 CPU1
[ 0.957247] ---- ----
[ 0.957439] lock(&handler->enable_lock);
[ 0.957607] local_irq_disable();
[ 0.957793] lock(&irq_desc_lock_class);
[ 0.958021] lock(&handler->enable_lock);
[ 0.958246] <Interrupt>
[ 0.958342] lock(&irq_desc_lock_class);
[ 0.958501]
*** DEADLOCK ***

[ 0.958715] no locks held by swapper/0/0.
[ 0.958870]
the shortest dependencies between 2nd lock and 1st lock:
[ 0.959152] -> (&handler->enable_lock){+.+.}-{2:2} {
[ 0.959372] HARDIRQ-ON-W at:
[ 0.959522] __lock_acquire+0x884/0x1f5c
[ 0.959745] lock_acquire+0xf0/0x292
[ 0.959913] _raw_spin_lock+0x2c/0x40
[ 0.960090] plic_probe+0x322/0x65c
[ 0.960257] platform_probe+0x4e/0x92
[ 0.960432] really_probe+0x82/0x210
[ 0.960598] __driver_probe_device+0x5c/0xd0
[ 0.960784] driver_probe_device+0x2c/0xb0
[ 0.960964] __driver_attach+0x72/0x10a
[ 0.961151] bus_for_each_dev+0x60/0xac
[ 0.961330] driver_attach+0x1a/0x22
[ 0.961496] bus_add_driver+0xd4/0x19e
[ 0.961666] driver_register+0x3e/0xd8
[ 0.961835] __platform_driver_register+0x1c/0x24
[ 0.962030] plic_driver_init+0x1a/0x22
[ 0.962201] do_one_initcall+0x80/0x268
[ 0.962371] kernel_init_freeable+0x296/0x300
[ 0.962554] kernel_init+0x1e/0x10a
[ 0.962713] ret_from_fork+0xe/0x1c
[ 0.962884] SOFTIRQ-ON-W at:
[ 0.962994] __lock_acquire+0x89e/0x1f5c
[ 0.963169] lock_acquire+0xf0/0x292
[ 0.963336] _raw_spin_lock+0x2c/0x40
[ 0.963497] plic_probe+0x322/0x65c
[ 0.963664] platform_probe+0x4e/0x92
[ 0.963849] really_probe+0x82/0x210
[ 0.964054] __driver_probe_device+0x5c/0xd0
[ 0.964255] driver_probe_device+0x2c/0xb0
[ 0.964428] __driver_attach+0x72/0x10a
[ 0.964603] bus_for_each_dev+0x60/0xac
[ 0.964777] driver_attach+0x1a/0x22
[ 0.964943] bus_add_driver+0xd4/0x19e
[ 0.965343] driver_register+0x3e/0xd8
[ 0.965527] __platform_driver_register+0x1c/0x24
[ 0.965732] plic_driver_init+0x1a/0x22
[ 0.965908] do_one_initcall+0x80/0x268
[ 0.966078] kernel_init_freeable+0x296/0x300
[ 0.966268] kernel_init+0x1e/0x10a
[ 0.966436] ret_from_fork+0xe/0x1c
[ 0.966599] INITIAL USE at:
[ 0.966716] __lock_acquire+0x3fc/0x1f5c
[ 0.966891] lock_acquire+0xf0/0x292
[ 0.967048] _raw_spin_lock+0x2c/0x40
[ 0.967206] plic_probe+0x322/0x65c
[ 0.967360] platform_probe+0x4e/0x92
[ 0.967522] really_probe+0x82/0x210
[ 0.967678] __driver_probe_device+0x5c/0xd0
[ 0.967853] driver_probe_device+0x2c/0xb0
[ 0.968025] __driver_attach+0x72/0x10a
[ 0.968185] bus_for_each_dev+0x60/0xac
[ 0.968348] driver_attach+0x1a/0x22
[ 0.968513] bus_add_driver+0xd4/0x19e
[ 0.968678] driver_register+0x3e/0xd8
[ 0.968839] __platform_driver_register+0x1c/0x24
[ 0.969035] plic_driver_init+0x1a/0x22
[ 0.969239] do_one_initcall+0x80/0x268
[ 0.969431] kernel_init_freeable+0x296/0x300
[ 0.969610] kernel_init+0x1e/0x10a
[ 0.969766] ret_from_fork+0xe/0x1c
[ 0.969936] }
[ 0.970010] ... key at: [<ffffffff824f4138>] __key.2+0x0/0x10
[ 0.970224] ... acquired at:
[ 0.970353] lock_acquire+0xf0/0x292
[ 0.970482] _raw_spin_lock+0x2c/0x40
[ 0.970609] plic_irq_enable+0x7e/0x140
[ 0.970739] irq_enable+0x2c/0x60
[ 0.970882] __irq_startup+0x58/0x60
[ 0.971008] irq_startup+0x5e/0x13c
[ 0.971126] __setup_irq+0x4de/0x5da
[ 0.971248] request_threaded_irq+0xcc/0x12e
[ 0.971394] vm_find_vqs+0x62/0x50a
[ 0.971518] probe_common+0xfe/0x1d2
[ 0.971635] virtrng_probe+0xc/0x14
[ 0.971751] virtio_dev_probe+0x154/0x1fc
[ 0.971878] really_probe+0x82/0x210
[ 0.972008] __driver_probe_device+0x5c/0xd0
[ 0.972147] driver_probe_device+0x2c/0xb0
[ 0.972280] __driver_attach+0x72/0x10a
[ 0.972407] bus_for_each_dev+0x60/0xac
[ 0.972540] driver_attach+0x1a/0x22
[ 0.972656] bus_add_driver+0xd4/0x19e
[ 0.972777] driver_register+0x3e/0xd8
[ 0.972896] register_virtio_driver+0x1c/0x2a
[ 0.973049] virtio_rng_driver_init+0x18/0x20
[ 0.973236] do_one_initcall+0x80/0x268
[ 0.973399] kernel_init_freeable+0x296/0x300
[ 0.973540] kernel_init+0x1e/0x10a
[ 0.973658] ret_from_fork+0xe/0x1c

[ 0.973858] -> (&irq_desc_lock_class){-...}-{2:2} {
[ 0.974036] IN-HARDIRQ-W at:
[ 0.974142] __lock_acquire+0xa82/0x1f5c
[ 0.974309] lock_acquire+0xf0/0x292
[ 0.974467] _raw_spin_lock+0x2c/0x40
[ 0.974625] handle_fasteoi_irq+0x24/0x1da
[ 0.974794] generic_handle_domain_irq+0x1c/0x2a
[ 0.974982] plic_handle_irq+0x7e/0xf0
[ 0.975143] generic_handle_domain_irq+0x1c/0x2a
[ 0.975329] riscv_intc_irq+0x2e/0x46
[ 0.975488] handle_riscv_irq+0x4a/0x74
[ 0.975652] call_on_irq_stack+0x32/0x40
[ 0.975817] INITIAL USE at:
[ 0.975923] __lock_acquire+0x3fc/0x1f5c
[ 0.976087] lock_acquire+0xf0/0x292
[ 0.976244] _raw_spin_lock_irqsave+0x3a/0x64
[ 0.976423] __irq_get_desc_lock+0x5a/0x84
[ 0.976594] irq_modify_status+0x2a/0x106
[ 0.976764] irq_set_percpu_devid+0x62/0x78
[ 0.976939] riscv_intc_domain_map+0x1e/0x54
[ 0.977133] irq_domain_associate_locked+0x42/0xe4
[ 0.977363] irq_create_mapping_affinity+0x98/0xd4
[ 0.977570] sbi_ipi_init+0x70/0x142
[ 0.977744] init_IRQ+0xfe/0x11a
[ 0.977906] start_kernel+0x4ae/0x790
[ 0.978082] }
[ 0.978151] ... key at: [<ffffffff824bbee0>] irq_desc_lock_class+0x0/0x10
[ 0.978389] ... acquired at:
[ 0.978494] mark_lock+0x3fe/0x8c2
[ 0.978624] __lock_acquire+0xa82/0x1f5c
[ 0.978766] lock_acquire+0xf0/0x292
[ 0.978897] _raw_spin_lock+0x2c/0x40
[ 0.979029] handle_fasteoi_irq+0x24/0x1da
[ 0.979171] generic_handle_domain_irq+0x1c/0x2a
[ 0.979326] plic_handle_irq+0x7e/0xf0
[ 0.979460] generic_handle_domain_irq+0x1c/0x2a
[ 0.979618] riscv_intc_irq+0x2e/0x46
[ 0.979751] handle_riscv_irq+0x4a/0x74
[ 0.979888] call_on_irq_stack+0x32/0x40

[ 0.980110]
stack backtrace:
[ 0.980358] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0-rc1-00039-gd9b9d6eb987f #1122
[ 0.980662] Hardware name: riscv-virtio,qemu (DT)
[ 0.980913] Call Trace:
[ 0.981042] [<ffffffff80007198>] dump_backtrace+0x1c/0x24
[ 0.981246] [<ffffffff80ae020a>] show_stack+0x2c/0x38
[ 0.981456] [<ffffffff80aedac4>] dump_stack_lvl+0x5a/0x7c
[ 0.981648] [<ffffffff80aedafa>] dump_stack+0x14/0x1c
[ 0.981813] [<ffffffff80ae17a4>] print_irq_inversion_bug.part.0+0x162/0x176
[ 0.982031] [<ffffffff8007c6e6>] mark_lock+0x3fe/0x8c2
[ 0.982198] [<ffffffff8007d888>] __lock_acquire+0xa82/0x1f5c
[ 0.982377] [<ffffffff8007f59e>] lock_acquire+0xf0/0x292
[ 0.982549] [<ffffffff80af9962>] _raw_spin_lock+0x2c/0x40
[ 0.982721] [<ffffffff8008f3fe>] handle_fasteoi_irq+0x24/0x1da
[ 0.982904] [<ffffffff8008a4a4>] generic_handle_domain_irq+0x1c/0x2a
[ 0.983112] [<ffffffff80581dc0>] plic_handle_irq+0x7e/0xf0
[ 0.983293] [<ffffffff8008a4a4>] generic_handle_domain_irq+0x1c/0x2a
[ 0.983495] [<ffffffff8057fb1a>] riscv_intc_irq+0x2e/0x46
[ 0.983671] [<ffffffff80aedb4c>] handle_riscv_irq+0x4a/0x74
[ 0.983856] [<ffffffff80afa756>] call_on_irq_stack+0x32/0x40


When I switch to -machine virt,aia=aplic-imsic (same config as above), I
get:

[ 0.971406] ============================================
[ 0.971439] WARNING: possible recursive locking detected
[ 0.971497] 6.8.0-rc1-00039-gd9b9d6eb987f #1122 Not tainted
[ 0.971583] --------------------------------------------
[ 0.971612] swapper/0/1 is trying to acquire lock:
[ 0.971662] ffffaf83fefa8e78 (&lpriv->ids_lock){-...}-{2:2}, at: imsic_vector_move+0x92/0x146
[ 0.971927]
but task is already holding lock:
[ 0.971975] ffffaf83fef6ee78 (&lpriv->ids_lock){-...}-{2:2}, at: imsic_vector_move+0x86/0x146
[ 0.972045]
other info that might help us debug this:
[ 0.972085] Possible unsafe locking scenario:

[ 0.972114] CPU0
[ 0.972133] ----
[ 0.972153] lock(&lpriv->ids_lock);
[ 0.972191] lock(&lpriv->ids_lock);
[ 0.972228]
*** DEADLOCK ***

[ 0.972258] May be due to missing lock nesting notation

[ 0.972306] 6 locks held by swapper/0/1:
[ 0.972338] #0: ffffaf8081f65970 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x6a/0x10a
[ 0.972413] #1: ffffaf808217c240 (&desc->request_mutex){+.+.}-{3:3}, at: __setup_irq+0xa2/0x5da
[ 0.972492] #2: ffffaf808217c0c8 (&irq_desc_lock_class){....}-{2:2}, at: __setup_irq+0xbe/0x5da
[ 0.972555] #3: ffffffff81892ac0 (mask_lock){....}-{2:2}, at: irq_setup_affinity+0x38/0xc6
[ 0.972617] #4: ffffffff81892a80 (tmp_mask_lock){....}-{2:2}, at: irq_do_set_affinity+0x3a/0x164
[ 0.972681] #5: ffffaf83fef6ee78 (&lpriv->ids_lock){-...}-{2:2}, at: imsic_vector_move+0x86/0x146
[ 0.972753]
stack backtrace:
[ 0.972852] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc1-00039-gd9b9d6eb987f #1122
[ 0.972900] Hardware name: riscv-virtio,qemu (DT)
[ 0.972987] Call Trace:
[ 0.973019] [<ffffffff80007198>] dump_backtrace+0x1c/0x24
[ 0.973054] [<ffffffff80ae020a>] show_stack+0x2c/0x38
[ 0.973083] [<ffffffff80aedac4>] dump_stack_lvl+0x5a/0x7c
[ 0.973112] [<ffffffff80aedafa>] dump_stack+0x14/0x1c
[ 0.973139] [<ffffffff8007ad5e>] print_deadlock_bug+0x282/0x328
[ 0.973168] [<ffffffff8007e15c>] __lock_acquire+0x1356/0x1f5c
[ 0.973198] [<ffffffff8007f59e>] lock_acquire+0xf0/0x292
[ 0.973225] [<ffffffff80af9adc>] _raw_spin_lock_irqsave+0x3a/0x64
[ 0.973255] [<ffffffff80581210>] imsic_vector_move+0x92/0x146
[ 0.973285] [<ffffffff80581a04>] imsic_irq_set_affinity+0x8e/0xc6
[ 0.973315] [<ffffffff8008c86a>] irq_do_set_affinity+0x142/0x164
[ 0.973345] [<ffffffff8008cc22>] irq_setup_affinity+0x68/0xc6
[ 0.973374] [<ffffffff8008fa82>] irq_startup+0x72/0x13c
[ 0.973401] [<ffffffff8008d40c>] __setup_irq+0x4de/0x5da
[ 0.973430] [<ffffffff8008d5d4>] request_threaded_irq+0xcc/0x12e
[ 0.973460] [<ffffffff806346d8>] vp_find_vqs_msix+0x114/0x376
[ 0.973491] [<ffffffff80634970>] vp_find_vqs+0x36/0x136
[ 0.973518] [<ffffffff80633280>] vp_modern_find_vqs+0x16/0x4e
[ 0.973547] [<ffffffff80ab31f8>] p9_virtio_probe+0x8e/0x31c
[ 0.973576] [<ffffffff8062d982>] virtio_dev_probe+0x154/0x1fc
[ 0.973605] [<ffffffff80693738>] really_probe+0x82/0x210
[ 0.973632] [<ffffffff80693922>] __driver_probe_device+0x5c/0xd0
[ 0.973661] [<ffffffff806939c2>] driver_probe_device+0x2c/0xb0
[ 0.973690] [<ffffffff80693b46>] __driver_attach+0x72/0x10a
[ 0.973718] [<ffffffff8069191a>] bus_for_each_dev+0x60/0xac
[ 0.973746] [<ffffffff80693164>] driver_attach+0x1a/0x22
[ 0.973773] [<ffffffff80692ade>] bus_add_driver+0xd4/0x19e
[ 0.973801] [<ffffffff8069487e>] driver_register+0x3e/0xd8
[ 0.973829] [<ffffffff8062d1ce>] register_virtio_driver+0x1c/0x2a
[ 0.973858] [<ffffffff80c3da52>] p9_virtio_init+0x36/0x56
[ 0.973887] [<ffffffff800028fe>] do_one_initcall+0x80/0x268
[ 0.973915] [<ffffffff80c01144>] kernel_init_freeable+0x296/0x300
[ 0.973944] [<ffffffff80af05dc>] kernel_init+0x1e/0x10a
[ 0.973972] [<ffffffff80afa716>] ret_from_fork+0xe/0x1c


FWIW, the full Qemu command I used was as follows:

sudo /home/andrea/Downloads/qemu/build/qemu-system-riscv64 \
-append "root=/dev/root rw rootfstype=9p rootflags=version=9p2000.L,trans=virtio,cache=mmap,access=any raid=noautodetect security=none loglevel=7" \
-cpu rv64,sv57=off,svadu=off,svnapot=off \
-device virtio-net-device,netdev=net0 \
-device virtio-rng-device,rng=rng0 \
-device virtio-9p-pci,fsdev=root,mount_tag=/dev/root \
-fsdev local,id=root,path=/home/andrea/Downloads/jammy/,security_model=none \
-kernel /home/andrea/linux/arch/riscv/boot/Image \
-m 16G \
-machine virt,aia=<either "none" or "aplic-imsic"> \
-monitor telnet:127.0.0.1:55555,server,nowait \
-netdev user,id=net0,host=10.0.2.10,hostfwd=tcp::10022-:22 \
-nographic \
-object rng-random,filename=/dev/urandom,id=rng0 \
-serial mon:stdio \
-smp 5


Andrea