Re: [PATCH 02/21] binder: fix use-after-free in shrinker's callback

From: Liam R. Howlett
Date: Thu Nov 02 2023 - 15:21:33 EST


* Carlos Llamas <cmllamas@xxxxxxxxxx> [231102 15:00]:
> The mmap read lock is used during the shrinker's callback, which means
> that using alloc->vma pointer isn't safe as it can race with munmap().

I think you know my feelings about the safety of that pointer from
previous discussions.

> As of commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
> munmap") the mmap lock is downgraded after the vma has been isolated.
>
> I was able to reproduce this issue by manually adding some delays and
> triggering page reclaiming through the shrinker's debug sysfs. The
> following KASAN report confirms the UAF:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in zap_page_range_single+0x470/0x4b8
> Read of size 8 at addr ffff356ed50e50f0 by task bash/478
>
> CPU: 1 PID: 478 Comm: bash Not tainted 6.6.0-rc5-00055-g1c8b86a3799f-dirty #70
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> zap_page_range_single+0x470/0x4b8
> binder_alloc_free_page+0x608/0xadc
> __list_lru_walk_one+0x130/0x3b0
> list_lru_walk_node+0xc4/0x22c
> binder_shrink_scan+0x108/0x1dc
> shrinker_debugfs_scan_write+0x2b4/0x500
> full_proxy_write+0xd4/0x140
> vfs_write+0x1ac/0x758
> ksys_write+0xf0/0x1dc
> __arm64_sys_write+0x6c/0x9c
>
> Allocated by task 492:
> kmem_cache_alloc+0x130/0x368
> vm_area_alloc+0x2c/0x190
> mmap_region+0x258/0x18bc
> do_mmap+0x694/0xa60
> vm_mmap_pgoff+0x170/0x29c
> ksys_mmap_pgoff+0x290/0x3a0
> __arm64_sys_mmap+0xcc/0x144
>
> Freed by task 491:
> kmem_cache_free+0x17c/0x3c8
> vm_area_free_rcu_cb+0x74/0x98
> rcu_core+0xa38/0x26d4
> rcu_core_si+0x10/0x1c
> __do_softirq+0x2fc/0xd24
>
> Last potentially related work creation:
> __call_rcu_common.constprop.0+0x6c/0xba0
> call_rcu+0x10/0x1c
> vm_area_free+0x18/0x24
> remove_vma+0xe4/0x118
> do_vmi_align_munmap.isra.0+0x718/0xb5c
> do_vmi_munmap+0xdc/0x1fc
> __vm_munmap+0x10c/0x278
> __arm64_sys_munmap+0x58/0x7c
>
> Fix this issue by performing a vma_lookup() instead, which will fail to
> find the vma that was isolated before the mmap lock downgrade. Note that
> this option performs better than upgrading to an mmap write lock,
> which would increase contention. Plus, mmap_write_trylock() was
> recently removed anyway.
>
> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Liam Howlett <liam.howlett@xxxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Signed-off-by: Carlos Llamas <cmllamas@xxxxxxxxxx>
> ---
> drivers/android/binder_alloc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index e3db8297095a..c4d60d81221b 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -1005,7 +1005,9 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  		goto err_mmget;
>  	if (!mmap_read_trylock(mm))
>  		goto err_mmap_read_lock_failed;
> -	vma = binder_alloc_get_vma(alloc);
> +	vma = vma_lookup(mm, page_addr);
> +	if (vma && vma != binder_alloc_get_vma(alloc))
> +		goto err_invalid_vma;

Doesn't this need to be:
	if (!vma || vma != binder_alloc_get_vma(alloc))

That way we catch both a different vma and a NULL vma.

Or even just:
	if (vma != binder_alloc_get_vma(alloc))

if the alloc vma cannot be NULL?

>
>  	list_lru_isolate(lru, item);
>  	spin_unlock(lock);
> @@ -1031,6 +1033,8 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
>  	mutex_unlock(&alloc->mutex);
>  	return LRU_REMOVED_RETRY;
>
> +err_invalid_vma:
> +	mmap_read_unlock(mm);
>  err_mmap_read_lock_failed:
>  	mmput_async(mm);
> err_mmget:
> --
> 2.42.0.869.gea05f2083d-goog
>