[PATCH RESEND v2] dm: Fix use-after-free in dm_cleanup_zoned_dev()

From: Kirill Tkhai
Date: Thu Feb 17 2022 - 05:13:29 EST



dm_cleanup_zoned_dev() uses queue, so it must be called
before blk_cleanup_disk() starts its killing:

blk_cleanup_disk->blk_cleanup_queue()->kobject_put()->blk_release_queue()->
->...RCU...->blk_free_queue_rcu()->kmem_cache_free()

Otherwise, RCU callback may be executed first,
and dm_cleanup_zoned_dev() touches freed memory:

BUG: KASAN: use-after-free in dm_cleanup_zoned_dev+0x33/0xd0
Read of size 8 at addr ffff88805ac6e430 by task dmsetup/681

CPU: 4 PID: 681 Comm: dmsetup Not tainted 5.17.0-rc2+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
print_address_description.constprop.0+0x1f/0x150
? dm_cleanup_zoned_dev+0x33/0xd0
kasan_report.cold+0x7f/0x11b
? dm_cleanup_zoned_dev+0x33/0xd0
dm_cleanup_zoned_dev+0x33/0xd0
__dm_destroy+0x26a/0x400
? dm_blk_ioctl+0x230/0x230
? up_write+0xd8/0x270
dev_remove+0x156/0x1d0
ctl_ioctl+0x269/0x530
? table_clear+0x140/0x140
? lock_release+0xb2/0x750
? remove_all+0x40/0x40
? rcu_read_lock_sched_held+0x12/0x70
? lock_downgrade+0x3c0/0x3c0
? rcu_read_lock_sched_held+0x12/0x70
dm_ctl_ioctl+0xa/0x10
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fb6dfa95c27
Code: 00 00 00 48 8b 05 69 92 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 92 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007fff882c6c28 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb6dfb73a8e RCX: 00007fb6dfa95c27
RDX: 00007fb6e01d7ca0 RSI: 00000000c138fd04 RDI: 0000000000000003
RBP: 00007fff882c6ce0 R08: 00007fb6dfbc3558 R09: 00007fff882c6a90
R10: 00007fb6dfbc28a2 R11: 0000000000000206 R12: 00007fb6dfbc28a2
R13: 00007fb6dfbc28a2 R14: 00007fb6dfbc28a2 R15: 00007fb6dfbc28a2
</TASK>

Allocated by task 673:
kasan_save_stack+0x1e/0x40
__kasan_slab_alloc+0x66/0x80
kmem_cache_alloc_node+0x1ca/0x460
blk_alloc_queue+0x33/0x4e0
__blk_alloc_disk+0x1b/0x60
dm_create+0x368/0xa20
dev_create+0xb9/0x170
ctl_ioctl+0x269/0x530
dm_ctl_ioctl+0xa/0x10
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 0:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xfb/0x130
slab_free_freelist_hook+0x7d/0x150
kmem_cache_free+0x13c/0x340
rcu_do_batch+0x2d9/0x820
rcu_core+0x3b8/0x570
__do_softirq+0x1c4/0x63d

Last potentially related work creation:
kasan_save_stack+0x1e/0x40
__kasan_record_aux_stack+0x96/0xa0
call_rcu+0xc4/0x8f0
kobject_put+0xd9/0x270
disk_release+0xee/0x120
device_release+0x59/0xf0
kobject_put+0xd9/0x270
cleanup_mapped_device+0x12b/0x1b0
__dm_destroy+0x26a/0x400
dev_remove+0x156/0x1d0
ctl_ioctl+0x269/0x530
dm_ctl_ioctl+0xa/0x10
__x64_sys_ioctl+0xb9/0xf0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88805ac6e180
which belongs to the cache request_queue of size 2992
The buggy address is located 688 bytes inside of
2992-byte region [ffff88805ac6e180, ffff88805ac6ed30)
The buggy address belongs to the page:
page:000000000837df3c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5ac68
head:000000000837df3c order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0010200 0000000000000000 dead000000000122 ffff888001e58280
raw: 0000000000000000 00000000800a000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88805ac6e300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88805ac6e380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88805ac6e400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88805ac6e480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88805ac6e500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Fixes: bb37d77239af "dm: introduce zone append emulation"
Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
Reviewed-by: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx>
---
v2: Split long commit message line and delete [xxx] time prefix from kernel output.

drivers/md/dm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dcbd6d201619..d472fe5dbc1d 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1607,6 +1607,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
md->dax_dev = NULL;
}

+ dm_cleanup_zoned_dev(md);
if (md->disk) {
spin_lock(&_minor_lock);
md->disk->private_data = NULL;
@@ -1627,7 +1628,6 @@ static void cleanup_mapped_device(struct mapped_device *md)
mutex_destroy(&md->swap_bios_lock);

dm_mq_cleanup_mapped_device(md);
- dm_cleanup_zoned_dev(md);
}

/*