[PATCH] net: page_pool: Fix NULL pointer dereference in page_pool_unlist()

From: Bert Karwatzki
Date: Thu Nov 30 2023 - 14:26:01 EST


When the hlist_node pool->user.list is in an unhashed state, calling
hlist_del() leads to a NULL pointer dereference. This happens, e.g.,
when rmmod'ing the mt7921e (MediaTek wifi driver) kernel module.
Checking hlist_unhashed() before calling hlist_del() fixes the issue.

Fixes: 083772c9f972dc ("net: page_pool: record pools per netdev")
Signed-off-by: Bert Karwatzki <spasswolf@xxxxxx>
---
net/core/page_pool_user.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 1426434a7e15..47d7d32288ab 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -339,7 +339,8 @@ void page_pool_unlist(struct page_pool *pool)
 	mutex_lock(&page_pools_lock);
 	netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_DEL_NTF);
 	xa_erase(&page_pools, pool->user.id);
-	hlist_del(&pool->user.list);
+	if (!hlist_unhashed(&pool->user.list))
+		hlist_del(&pool->user.list);
 	mutex_unlock(&page_pools_lock);
 }

--
2.39.2

Since kernel version linux-next-20231129 I noticed that my MSI Alpha 15
laptop would hang on shutdown with a blinking capslock LED, indicating a
kernel panic. I bisected the error to commit 083772c9f972dcc24891 and
noticed that the error can also be triggered by rmmod'ing the mt7921e
(MediaTek wifi driver) kernel module, giving the following backtrace
(using the guess unwinder):

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 10 PID: 3053 Comm: modprobe Not tainted 6.7.0-rc3-next-20231130 #981
Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021
RIP: 0010:page_pool_unlist+0x41/0x80

RSP: 0018:ffffb9a9c5be3d78 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff9a1006944000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff9c8dc2c0
RBP: 0000000000000000 R08: 0000000000000006 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: ffff9a10185147e8 R14: ffff9a1018512528 R15: 0000000000000000
FS: 00007f507e775040(0000) GS:ffff9a12de880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001872ba000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
? __die+0x1e/0x60
? page_fault_oops+0x157/0x430
? srso_alias_return_thunk+0x5/0xfbef5
? iommu_completion_wait.part.0.isra.0+0x78/0xf0
? exc_page_fault+0x5f/0x90
? asm_exc_page_fault+0x26/0x30
? page_pool_unlist+0x41/0x80
? page_pool_unlist+0x33/0x80
? page_pool_release+0x18a/0x1e0
? page_pool_destroy+0x95/0x150
? mt76_dma_cleanup+0x118/0x1f0 [mt76]
? mt7921_pci_remove+0xbf/0x130 [mt7921e]
? pci_device_remove+0x35/0xa0
? device_release_driver_internal+0x19a/0x200
? driver_detach+0x43/0x90
? bus_remove_driver+0x68/0xf0
? pci_unregister_driver+0x3a/0x80
? __do_sys_delete_module+0x1a8/0x2e0
? srso_alias_return_thunk+0x5/0xfbef5
? __fput+0x119/0x2b0
? do_syscall_64+0x45/0xf0
? entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>

The given patch addresses the issue by adding an extra check to
page_pool_unlist().
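
For illustration, here is a minimal userspace sketch of why the check
helps. It is not the kernel code: the struct layout is modeled on
include/linux/list.h, and all sketch_* functions are hypothetical names
made up for this example. The assumption is that an unhashed node has
pprev == NULL, so the unlink step's store through pprev would be a write
to address 0, which matches the "supervisor write access" fault above.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's hlist types (assumption: simplified copy). */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Put a node into the "unhashed" state: not on any list. */
static void sketch_init_node(struct hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node is unhashed when nothing points at it, i.e. pprev is NULL. */
static int sketch_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

/* Insert a node at the front of the list. */
static void sketch_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Unlink a node only if it is actually on a list.
 * Without the sketch_unhashed() check, "*pprev = next" below would
 * store through a NULL pointer for a node that was never added.
 * Returns 1 if the node was unlinked, 0 if it was already unhashed.
 */
static int sketch_del_checked(struct hlist_node *n)
{
	struct hlist_node *next, **pprev;

	if (sketch_unhashed(n))
		return 0;

	next = n->next;
	pprev = n->pprev;
	*pprev = next;
	if (next)
		next->pprev = pprev;
	sketch_init_node(n);	/* the kernel's hlist_del() poisons instead */
	return 1;
}
```

Calling sketch_del_checked() on a node that was only initialized returns
0 and touches nothing; after sketch_add_head() it unlinks the node and
returns 1.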

Bert Karwatzki