[PATCH] KASAN: use-after-free Read in rdma_listen

From: Tomas Bortoli
Date: Fri Jul 06 2018 - 21:41:43 EST


Hi,

I spent some time debugging the syzkaller-reported issue in the subject:

https://syzkaller.appspot.com/bug?id=b8febdb3c7c8c1f1b606fb903cee66b21b2fd02f

I've traced the UAF back to the fact that cma_listen_on_all() adds
"id_priv->list" to the global list "listen_any_list", but that element
is never removed in rdma_destroy_id(). (The call to cma_release_dev()
in rdma_destroy_id() looks like it should do the removal, but it does
not get executed.)

Therefore, if a program allocates a "struct rdma_cm_id" (through
ucma_open + ucma_create_id), executes cma_listen_on_all(), frees the
struct, and then repeats the sequence, the second execution of
cma_listen_on_all() makes the kernel update the list pointers of the
freed node, triggering the UAF. I was able to fix the UAF with this ugly
patch:

--- b/drivers/infiniband/core/cma.c	2018-07-07 02:28:03.214589868 +0200
+++ a/drivers/infiniband/core/cma.c	2018-07-07 03:35:44.325301216 +0200
@@ -1678,6 +1678,11 @@ void rdma_destroy_id(struct rdma_cm_id *
 	mutex_lock(&id_priv->handler_mutex);
 	mutex_unlock(&id_priv->handler_mutex);
 
+	mutex_lock(&lock);
+	if(id_priv->list.next!=0 && id_priv->list.prev!=0)
+		list_del(&id_priv->list);
+	mutex_unlock(&lock);
+
 	if (id_priv->cma_dev) {
 		rdma_restrack_del(&id_priv->res);
 		if (rdma_cap_ib_cm(id_priv->id.device, 1)) {

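To make the sequence described above concrete, here is a minimal
userspace sketch in C. It is not the cma.c code: struct model_id,
model_listen_on_all(), model_destroy() and run_model() are hypothetical
names, the list helpers are simplified re-implementations of the kernel
ones, and a "freed" flag stands in for kfree() so the model never
touches released memory. It only shows that a node left on a global
list after destruction stays reachable, and that unlinking it in the
destroy path avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;	/* writes into the previous tail node */
	head->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/* Hypothetical model of an id; 'freed' stands in for kfree(). */
struct model_id {
	struct list_head list;
	bool freed;
};

static struct list_head listen_any_list;

static void model_listen_on_all(struct model_id *id)
{
	list_add_tail(&id->list, &listen_any_list);
}

static void model_destroy(struct model_id *id, bool unlink)
{
	if (unlink)
		list_del(&id->list);
	id->freed = true;
}

/* Count list entries that still point at destroyed ids. */
static int count_stale(void)
{
	int stale = 0;
	struct list_head *p;

	for (p = listen_any_list.next; p != &listen_any_list; p = p->next) {
		struct model_id *id = (struct model_id *)
			((char *)p - offsetof(struct model_id, list));
		if (id->freed)
			stale++;
	}
	return stale;
}

/* Run create/listen/destroy twice and report the stale entries left. */
static int run_model(bool unlink)
{
	static struct model_id ids[2];
	int i;

	list_init(&listen_any_list);
	for (i = 0; i < 2; i++) {
		ids[i].freed = false;
		model_listen_on_all(&ids[i]);
		model_destroy(&ids[i], unlink);
	}
	return count_stale();
}
```

In this model, run_model(false) leaves both destroyed ids on the list,
and the second list_add_tail() writes into the first destroyed node,
which is the access that would hit freed memory in the kernel.
run_model(true) leaves the list empty.
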
Note: I only tested this patch against the shortest reproducer for this
issue (not any other use of rdma_cm):

https://syzkaller.appspot.com/text?tag=ReproC&x=1334f10f800000

I had to add that "if" to the patch because running the reproducer for
several iterations provoked a NULL dereference in the added list_del()
call: for some reason I haven't figured out yet, the next and prev
pointers of the list in question sometimes get zeroed (by what?).
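
As an illustration of why the guard matters, the sketch below models
the check in userspace C. It is an assumption-laden model, not the
driver code: the helpers are simplified re-implementations named after
the kernel's INIT_LIST_HEAD(), list_empty() and list_del_init(), and
guarded_del() is a hypothetical wrapper around the check from the
patch. One way a node can end up with NULL pointers is if it lives in
zeroed memory and was never linked anywhere; nothing above establishes
that this is what happens here. The sketch also shows that a node
initialized to point at itself can be unlinked unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and point it back at itself, so repeating is harmless. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;	/* dereferences n->prev: must not be NULL */
	n->next->prev = n->prev;	/* dereferences n->next: must not be NULL */
	init_list_head(n);
}

/* The check from the patch: skip nodes whose pointers are NULL. */
static bool guarded_del(struct list_head *n)
{
	if (n->next != NULL && n->prev != NULL) {
		list_del_init(n);
		return true;
	}
	return false;
}

/* A zeroed, never-linked node is skipped instead of dereferencing NULL. */
static bool zeroed_node_is_skipped(void)
{
	struct list_head node;

	memset(&node, 0, sizeof(node));
	return !guarded_del(&node);
}

/* A self-linked node can be unlinked without any guard, even twice. */
static bool self_linked_node_needs_no_guard(void)
{
	struct list_head head, node;

	init_list_head(&head);
	init_list_head(&node);
	list_del_init(&node);		/* never listed: no-op */
	list_add_tail(&node, &head);
	list_del_init(&node);		/* actually unlinks */
	list_del_init(&node);		/* already unlinked: no-op */
	return list_empty(&head) && list_empty(&node);
}
```

Whether the self-linked approach applies to id_priv->list depends on
how that field is initialized and on who zeroes it, which is still an
open question here.
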


Moreover, I noticed that running the reproducer for a long time exhausts
all the available memory. To spot the memory leaks I recompiled with:

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=10000

The reproducer apparently triggers two memory leaks, as reported by
kmemleak:

unreferenced object 0xffff880069f49d40 (size 512):
  comm "repro", pid 4263, jiffies 4294722196 (age 688.262s)
  hex dump (first 32 bytes):
    00 b8 13 5a 00 88 ff ff 40 9d f4 69 00 88 ff ff  ...Z....@..i....
    0a 00 98 a6 00 00 00 00 fe 80 00 00 00 00 00 00  ................
  backtrace:
    [<0000000075a2f334>] kmem_cache_alloc_trace+0x1b2/0x3d0
    [<0000000075fd9fea>] rdma_resolve_ip+0xc0/0x6b0
    [<0000000033592b0b>] rdma_resolve_addr+0x490/0x2580
    [<00000000d6f2cd9d>] ucma_resolve_ip+0x193/0x260
    [<0000000068f1c2b7>] ucma_write+0x2ec/0x3f0
    [<00000000015692cc>] __vfs_write+0x107/0x920
    [<000000009528b010>] vfs_write+0x189/0x510
    [<000000001a5d169b>] ksys_write+0xfa/0x240
    [<00000000b747746a>] __x64_sys_write+0x73/0xb0
    [<0000000071590ffb>] do_syscall_64+0x18c/0x760
    [<000000003c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<0000000059247e9d>] 0xffffffffffffffff


unreferenced object 0xffff88006c0c0bc0 (size 576):
  comm "repro", pid 4261, jiffies 4294722191 (age 688.261s)
  hex dump (first 32 bytes):
    00 02 00 00 00 00 00 00 80 b8 07 6c 00 88 ff ff  ...........l....
    b0 7d 2c 6b 00 88 ff ff d8 0b 0c 6c 00 88 ff ff  .},k.......l....
  backtrace:
    [<0000000039511ef2>] kmem_cache_alloc+0x1b2/0x3d0
    [<00000000106bf668>] radix_tree_node_alloc.constprop.18+0x5e/0x2e0
    [<000000005b2f026d>] idr_get_free+0x9f5/0x1000
    [<00000000445baa5a>] idr_alloc_u32+0x1bc/0x3d0
    [<000000007fd1b6f4>] idr_alloc+0xfd/0x190
    [<00000000d706389e>] cma_alloc_port+0xb0/0x170
    [<000000008f968f9e>] rdma_bind_addr+0x1252/0x1f00
    [<00000000e3361215>] rdma_resolve_addr+0x39e/0x2580
    [<00000000d6f2cd9d>] ucma_resolve_ip+0x193/0x260
    [<0000000068f1c2b7>] ucma_write+0x2ec/0x3f0
    [<00000000015692cc>] __vfs_write+0x107/0x920
    [<000000009528b010>] vfs_write+0x189/0x510
    [<000000001a5d169b>] ksys_write+0xfa/0x240
    [<00000000b747746a>] __x64_sys_write+0x73/0xb0
    [<0000000071590ffb>] do_syscall_64+0x18c/0x760
    [<000000003c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

I have no background in the usage or internals of this driver, but I
hope these clues will help in finding the proper fix.

Tomas