Re: [syzbot] KASAN: use-after-free Read in rfkill_blocked

From: Krzysztof Kozlowski
Date: Mon Nov 28 2022 - 05:15:26 EST


On 28/11/2022 11:07, Dmitry Vyukov wrote:
> On Sun, 27 Nov 2022 at 20:59, Krzysztof Kozlowski
> <krzysztof.kozlowski@xxxxxxxxxx> wrote:
>>
>> On 25/11/2022 10:09, Johannes Berg wrote:
>>> Looks like an NFC issue to me, Krzysztof?
>>>
>>> I mean, rfkill got allocated by nfc_register_device(), freed by
>>> nfc_unregister_device(), and then used by nfc_dev_up(). Seems like the
>>> last bit shouldn't be possible after nfc_unregister_device()?
>>>
>>> johannes
>>>
>>> On Wed, 2022-11-23 at 22:24 -0800, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: 0966d385830d riscv: Fix auipc+jalr relocation range checks
>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11196d0d880000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=6295d67591064921
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=0299462c067009827b2a
>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>> userspace arch: riscv64
>>>>
>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+0299462c067009827b2a@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>>
>>>> ==================================================================
>>>> BUG: KASAN: use-after-free in __lock_acquire+0x8ee/0x333e kernel/locking/lockdep.c:4897
>>>> Read of size 8 at addr ffffaf8024249018 by task syz-executor.0/7946
>>>>
>>>> CPU: 0 PID: 7946 Comm: syz-executor.0 Not tainted 5.17.0-rc1-syzkaller-00002-g0966d385830d #0
>>>> Hardware name: riscv-virtio,qemu (DT)
>>>> Call Trace:
>>>> [<ffffffff8000a228>] dump_backtrace+0x2e/0x3c arch/riscv/kernel/stacktrace.c:113
>>>> [<ffffffff831668cc>] show_stack+0x34/0x40 arch/riscv/kernel/stacktrace.c:119
>>>> [<ffffffff831756ba>] __dump_stack lib/dump_stack.c:88 [inline]
>>>> [<ffffffff831756ba>] dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:106
>>>> [<ffffffff8047479e>] print_address_description.constprop.0+0x2a/0x330 mm/kasan/report.c:255
>>>> [<ffffffff80474d4c>] __kasan_report mm/kasan/report.c:442 [inline]
>>>> [<ffffffff80474d4c>] kasan_report+0x184/0x1e0 mm/kasan/report.c:459
>>>> [<ffffffff80475b20>] check_region_inline mm/kasan/generic.c:183 [inline]
>>>> [<ffffffff80475b20>] __asan_load8+0x6e/0x96 mm/kasan/generic.c:256
>>>> [<ffffffff80112b70>] __lock_acquire+0x8ee/0x333e kernel/locking/lockdep.c:4897
>>>> [<ffffffff80116582>] lock_acquire.part.0+0x1d0/0x424 kernel/locking/lockdep.c:5639
>>>> [<ffffffff8011682a>] lock_acquire+0x54/0x6a kernel/locking/lockdep.c:5612
>>>> [<ffffffff831afa2c>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>>>> [<ffffffff831afa2c>] _raw_spin_lock_irqsave+0x3e/0x62 kernel/locking/spinlock.c:162
>>>> [<ffffffff83034f0a>] rfkill_blocked+0x22/0x62 net/rfkill/core.c:941
>>>> [<ffffffff830b8862>] nfc_dev_up+0x8e/0x26c net/nfc/core.c:102
>>>> [<ffffffff830bb742>] nfc_genl_dev_up+0x5e/0x8a net/nfc/netlink.c:770
>>>> [<ffffffff8296f9ae>] genl_family_rcv_msg_doit+0x19a/0x23c net/netlink/genetlink.c:731
>>>> [<ffffffff82970420>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
>>>> [<ffffffff82970420>] genl_rcv_msg+0x236/0x3ba net/netlink/genetlink.c:792
>>>> [<ffffffff8296ded2>] netlink_rcv_skb+0xf8/0x2be net/netlink/af_netlink.c:2494
>>>> [<ffffffff8296ecb2>] genl_rcv+0x36/0x4c net/netlink/genetlink.c:803
>>>> [<ffffffff8296cbcc>] netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
>>>> [<ffffffff8296cbcc>] netlink_unicast+0x40e/0x5fe net/netlink/af_netlink.c:1343
>>>> [<ffffffff8296d29c>] netlink_sendmsg+0x4e0/0x994 net/netlink/af_netlink.c:1919
>>>> [<ffffffff826d264e>] sock_sendmsg_nosec net/socket.c:705 [inline]
>>>> [<ffffffff826d264e>] sock_sendmsg+0xa0/0xc4 net/socket.c:725
>>>> [<ffffffff826d4dd4>] ____sys_sendmsg+0x46e/0x484 net/socket.c:2413
>>>> [<ffffffff826d8bca>] ___sys_sendmsg+0x16c/0x1f6 net/socket.c:2467
>>>> [<ffffffff826d8e78>] __sys_sendmsg+0xba/0x150 net/socket.c:2496
>>>> [<ffffffff826d8f3a>] __do_sys_sendmsg net/socket.c:2505 [inline]
>>>> [<ffffffff826d8f3a>] sys_sendmsg+0x2c/0x3a net/socket.c:2503
>>>> [<ffffffff80005716>] ret_from_syscall+0x0/0x2
>>>>
>>>> Allocated by task 7946:
>>>> stack_trace_save+0xa6/0xd8 kernel/stacktrace.c:122
>>>> kasan_save_stack+0x2c/0x58 mm/kasan/common.c:38
>>>> kasan_set_track mm/kasan/common.c:45 [inline]
>>>> set_alloc_info mm/kasan/common.c:436 [inline]
>>>> ____kasan_kmalloc mm/kasan/common.c:515 [inline]
>>>> ____kasan_kmalloc mm/kasan/common.c:474 [inline]
>>>> __kasan_kmalloc+0x80/0xb2 mm/kasan/common.c:524
>>>> kasan_kmalloc include/linux/kasan.h:270 [inline]
>>>> __kmalloc+0x190/0x318 mm/slub.c:4424
>>>> kmalloc include/linux/slab.h:586 [inline]
>>>> kzalloc include/linux/slab.h:715 [inline]
>>>> rfkill_alloc+0x96/0x1aa net/rfkill/core.c:983
>>>> nfc_register_device+0xe4/0x29e net/nfc/core.c:1129
>>>> nci_register_device+0x538/0x612 net/nfc/nci/core.c:1252
>>>> virtual_ncidev_open+0x82/0x12c drivers/nfc/virtual_ncidev.c:143
>>>> misc_open+0x272/0x2c8 drivers/char/misc.c:141
>>>> chrdev_open+0x1d4/0x478 fs/char_dev.c:414
>>>> do_dentry_open+0x2a4/0x7d4 fs/open.c:824
>>>> vfs_open+0x52/0x5e fs/open.c:959
>>>> do_open fs/namei.c:3476 [inline]
>>>> path_openat+0x12b6/0x189e fs/namei.c:3609
>>>> do_filp_open+0x10e/0x22a fs/namei.c:3636
>>>> do_sys_openat2+0x174/0x31e fs/open.c:1214
>>>> do_sys_open fs/open.c:1230 [inline]
>>>> __do_sys_openat fs/open.c:1246 [inline]
>>>> sys_openat+0xdc/0x164 fs/open.c:1241
>>>> ret_from_syscall+0x0/0x2
>>>>
>>>> Freed by task 7944:
>>>> stack_trace_save+0xa6/0xd8 kernel/stacktrace.c:122
>>>> kasan_save_stack+0x2c/0x58 mm/kasan/common.c:38
>>>> kasan_set_track+0x1a/0x26 mm/kasan/common.c:45
>>>> kasan_set_free_info+0x1e/0x3a mm/kasan/generic.c:370
>>>> ____kasan_slab_free mm/kasan/common.c:366 [inline]
>>>> ____kasan_slab_free+0x15e/0x180 mm/kasan/common.c:328
>>>> __kasan_slab_free+0x10/0x18 mm/kasan/common.c:374
>>>> kasan_slab_free include/linux/kasan.h:236 [inline]
>>>> slab_free_hook mm/slub.c:1728 [inline]
>>>> slab_free_freelist_hook+0x8e/0x1cc mm/slub.c:1754
>>>> slab_free mm/slub.c:3509 [inline]
>>>> kfree+0xe0/0x3e4 mm/slub.c:4562
>>>> rfkill_release+0x20/0x2a net/rfkill/core.c:831
>>>> device_release+0x66/0x148 drivers/base/core.c:2229
>>>> kobject_cleanup lib/kobject.c:705 [inline]
>>>> kobject_release lib/kobject.c:736 [inline]
>>>> kref_put include/linux/kref.h:65 [inline]
>>>> kobject_put+0x1bc/0x38e lib/kobject.c:753
>>>> put_device+0x28/0x3a drivers/base/core.c:3512
>>>> rfkill_destroy+0x2a/0x3c net/rfkill/core.c:1142
>>>> nfc_unregister_device+0xac/0x232 net/nfc/core.c:1167
>>>> nci_unregister_device+0x168/0x182 net/nfc/nci/core.c:1298
>>>> virtual_ncidev_close+0x9c/0xbc drivers/nfc/virtual_ncidev.c:163
>>
>> There were several issues found recently in the virtual NCI driver, so this
>> might be one of them. There is no reproducer, though...
>
>
> Hi Krzysztof,
>
> Do you think it's related specifically to the virtual driver?

Both, although maybe not this particular issue. There were about five
separate reports in the last few days...

>
> I would assume it's a bug in the NCI core itself related to dynamic
> device destructions. This should affect e.g. USB devices as well.

Earlier this year there was a bigger fix for the unregister path in NFC -
see commits:
da5c0f119203 (nfc_unregister_device+nfc_fw_download)
ef27324e2c (nci_unregister_device+nci_cmd_work)
1b0e81416 (rfkill related)
These pointed out inherent issues in the locking/synchronization of the NFC
core modules. I don't think we fixed all of the core issues, only the
specific scenarios that were reported.

> Or is it an issue only in the virtual driver? That would mean that the virtual
> driver uses the NCI core incorrectly, not the way all real drivers use
> it. If so the question is: what is the difference? We need to fix it.
> It's not useful to have unrealistic test drivers -- we both get false
> positives and don't get true positives.
>
> I think the issue may be localized from the KASAN report itself w/o a
> reproducer.
> Is there proper synchronization between
> nfc_unregister_device/rfkill_destroy and nfc_dev_up/rfkill_blocked?
> Something that prevents rfkill_blocked from being called after
> rfkill_destroy? If not, then that's the issue.

The mentioned 1b0e81416a tried to do this, and at that time I had the
impression the fix was correct. However, it seems it is not... (or not enough)
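
To make the synchronization question concrete, here is a minimal userspace
sketch of the kind of guard being asked about: the teardown path sets a flag
and frees the object under a lock, and the "up" path checks that flag under
the same lock before touching the object. This is only a model using
pthreads, not kernel code and not a proposed patch. All model_* names, the
shutting_down flag and the EPERM stand-in for the kernel's ERFKILL are
assumptions made for this illustration.

```c
/* Userspace model of the unregister/up ordering. Not kernel code. */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the rfkill object that gets freed. */
struct model_rfkill {
	bool blocked;
};

/* Hypothetical stand-in for the NFC device. */
struct model_nfc_dev {
	pthread_mutex_t lock;        /* plays the role of the device lock */
	bool shutting_down;          /* set once unregistration has begun */
	struct model_rfkill *rfkill; /* freed during unregistration */
};

/* Register path: allocate the rfkill object and set up the lock. */
int model_dev_init(struct model_nfc_dev *dev)
{
	dev->rfkill = calloc(1, sizeof(*dev->rfkill));
	if (!dev->rfkill)
		return -ENOMEM;
	dev->shutting_down = false;
	if (pthread_mutex_init(&dev->lock, NULL) != 0) {
		free(dev->rfkill);
		dev->rfkill = NULL;
		return -ENOMEM;
	}
	return 0;
}

/*
 * Teardown path: the flag is set and the object is freed while holding
 * the same lock that the reader takes, so a reader sees either a live
 * object or shutting_down == true, never a dangling pointer.
 */
void model_dev_unregister(struct model_nfc_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->shutting_down = true;
	free(dev->rfkill);
	dev->rfkill = NULL;
	pthread_mutex_unlock(&dev->lock);
}

/* Reader path: bail out before dereferencing rfkill after teardown. */
int model_dev_up(struct model_nfc_dev *dev)
{
	int rc = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->shutting_down)
		rc = -ENODEV;
	else if (dev->rfkill && dev->rfkill->blocked)
		rc = -EPERM; /* stand-in for the kernel's -ERFKILL */
	pthread_mutex_unlock(&dev->lock);

	return rc;
}
```

In this model, a model_dev_up() call that loses the race with
model_dev_unregister() returns -ENODEV instead of reading freed memory. The
pattern only holds if every path that reaches the object takes the same lock
and checks the flag before the dereference.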

Best regards,
Krzysztof