Re: [selftests] a37ddddd86: BUG:KASAN:use-after-free_in_firmware_upload_unregister

From: Russ Weight
Date: Thu Sep 01 2022 - 12:20:25 EST


Thanks for the reply. I was able to reproduce it a couple of weeks ago and
I have already submitted a fix.

- Russ

On 8/31/22 22:43, Yujie Liu wrote:
> Hi Russ,
>
> On 8/2/2022 04:42, Russ Weight wrote:
>> Oliver,
>>
>> On 7/29/22 00:08, kernel test robot wrote:
>>>
>>> Greeting,
>>>
>>> FYI, we noticed the following commit (built with gcc-11):
>>>
>>> commit: a37ddddd86037c896c702b4df416bc4e51b2a5a0 ("selftests: firmware: Add firmware upload selftests")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>
>>> in testcase: kernel-selftests
>>> version: kernel-selftests-x86_64-4cb0bec3-1_20220724
>>> with following parameters:
>>>
>>>     group: firmware
>>>     ucode: 0xec
>>>
>>> test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
>>> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
>>>
>>>
>>> on test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz with 28G memory
>>>
>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>
>>>
>>>
>>> If you fix the issue, kindly add following tag
>>> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
>>>
>>>
>>> [ 103.520572][ T2443] BUG: KASAN: use-after-free in firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>>> [  103.520579][ T2443] Read of size 8 at addr ffff8881e186c808 by task fw_upload.sh/2443
>>> [  103.528481][  T395]
>>> [  103.534696][ T2443]
>>> [  103.534698][ T2443] CPU: 7 PID: 2443 Comm: fw_upload.sh Not tainted 5.18.0-rc2-00036-ga37ddddd8603 #1
>>> [  103.534701][ T2443] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
>>> [  103.534703][ T2443] Call Trace:
>>> [  103.534705][ T2443]  <TASK>
>>> [ 103.534707][ T2443] ? firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>> I believe I understand the problem, but I have been unable to reproduce the error to verify the fix:
>>
>> 394         device_unregister(&fw_sysfs->dev);
>> 395         module_put(fw_upload_priv->module);
>>
>> The device_unregister() call could result in the dev_release
>> function freeing the fw_upload_priv structure before it is
>> dereferenced on line 395. Copying fw_upload_priv->module to a
>> local variable for use when calling device_unregister()
>> should fix the problem.
>>
>>> [ 103.534713][ T2443] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
>>> [ 103.588011][ T2443] print_address_description+0x1f/0x200
>>> [ 103.594406][ T2443] ? firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>>> [ 103.600112][ T2443] print_report.cold (mm/kasan/report.c:430)
>>> [ 103.604782][ T2443] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115)
>>> [ 103.609624][ T2443] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493)
>>> [ 103.613861][ T2443] ? firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>>> [ 103.619561][ T2443] firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>>> [ 103.625091][ T2443] upload_unregister_store (lib/test_firmware.c:1060 lib/test_firmware.c:1321)
>>> [ 103.630377][ T2443] ? sysfs_file_ops (fs/sysfs/file.c:129)
>>> [ 103.635046][ T2443] kernfs_fop_write_iter (fs/kernfs/file.c:294)
>>> [ 103.640145][ T2443] new_sync_write (fs/read_write.c:505 (discriminator 1))
>>> [ 103.644642][ T2443] ? new_sync_read (fs/read_write.c:494)
>>> [ 103.649225][ T2443] ? ksys_write (fs/read_write.c:644)
>>> [ 103.653463][ T2443] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5))
>>> [ 103.658057][ T2443] ? lock_is_held_type (kernel/locking/lockdep.c:5382 kernel/locking/lockdep.c:5684)
>>> [ 103.662909][ T2443] vfs_write (fs/read_write.c:591)
>>> [ 103.666984][ T2443] ksys_write (fs/read_write.c:644)
>>> [ 103.671057][ T2443] ? __ia32_sys_read (fs/read_write.c:634)
>>> [ 103.675645][ T2443] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501)
>>> [ 103.682051][ T2443] ? syscall_enter_from_user_mode (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 kernel/entry/common.c:109)
>>> [ 103.687756][ T2443] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
>>> [ 103.692010][ T2443] ? pick_file (fs/file.c:660)
>>> [ 103.696165][ T2443] ? do_raw_spin_unlock (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-instrumented.h:28 include/asm-generic/qspinlock.h:28 kernel/locking/spinlock_debug.c:100 kernel/locking/spinlock_debug.c:140)
>>> [ 103.701094][ T2443] ? syscall_exit_to_user_mode (kernel/entry/common.c:129 kernel/entry/common.c:296)
>>> [ 103.706539][ T2443] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501)
>>> [ 103.712929][ T2443] ? do_syscall_64 (arch/x86/entry/common.c:87)
>>> [ 103.717343][ T2443] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501)
>>> [ 103.723747][ T2443] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
>>> [  103.729451][ T2443] RIP: 0033:0x7f1020308f33
>>> [ 103.733709][ T2443] Code: 8b 15 61 ef 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
>>> All code
>>> ========
>>>     0:    8b 15 61 ef 0c 00        mov    0xcef61(%rip),%edx        # 0xcef67
>>>     6:    f7 d8                    neg    %eax
>>>     8:    64 89 02                 mov    %eax,%fs:(%rdx)
>>>     b:    48 c7 c0 ff ff ff ff     mov    $0xffffffffffffffff,%rax
>>>    12:    eb b7                    jmp    0xffffffffffffffcb
>>>    14:    0f 1f 00                 nopl   (%rax)
>>>    17:    64 8b 04 25 18 00 00     mov    %fs:0x18,%eax
>>>    1e:    00
>>>    1f:    85 c0                    test   %eax,%eax
>>>    21:    75 14                    jne    0x37
>>>    23:    b8 01 00 00 00           mov    $0x1,%eax
>>>    28:    0f 05                    syscall
>>>    2a:*    48 3d 00 f0 ff ff        cmp    $0xfffffffffffff000,%rax        <-- trapping instruction
>>>    30:    77 55                    ja     0x87
>>>    32:    c3                       retq
>>>    33:    0f 1f 40 00              nopl   0x0(%rax)
>>>    37:    48 83 ec 28              sub    $0x28,%rsp
>>>    3b:    48 89 54 24 18           mov    %rdx,0x18(%rsp)
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>>     0:    48 3d 00 f0 ff ff        cmp    $0xfffffffffffff000,%rax
>>>     6:    77 55                    ja     0x5d
>>>     8:    c3                       retq
>>>     9:    0f 1f 40 00              nopl   0x0(%rax)
>>>     d:    48 83 ec 28              sub    $0x28,%rsp
>>>    11:    48 89 54 24 18           mov    %rdx,0x18(%rsp)
>>> [  103.753040][ T2443] RSP: 002b:00007fffe4075988 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>> [  103.761244][ T2443] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f1020308f33
>>> [  103.769013][ T2443] RDX: 0000000000000003 RSI: 00005582df845b80 RDI: 0000000000000001
>>> [  103.776791][ T2443] RBP: 00005582df845b80 R08: 00000000ffffffff R09: 0000000000000003
>>> [  103.784561][ T2443] R10: 00005582df833c80 R11: 0000000000000246 R12: 0000000000000003
>>> [  103.792328][ T2443] R13: 00007f10203d96a0 R14: 0000000000000003 R15: 00007f10203d98a0
>>> [  103.800100][ T2443]  </TASK>
>>> [  103.802957][ T2443]
>>> [  103.805125][ T2443] Allocated by task 2443:
>>> [ 103.809276][ T2443] kasan_save_stack (mm/kasan/common.c:39)
>>> [ 103.813781][ T2443] __kasan_kmalloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:515 mm/kasan/common.c:524)
>>> [ 103.818190][ T2443] firmware_upload_register (drivers/base/firmware_loader/sysfs_upload.c:160)
>>> [ 103.824150][ T2443] upload_register_store (lib/test_firmware.c:1279)
>>> [ 103.829250][ T2443] kernfs_fop_write_iter (fs/kernfs/file.c:294)
>>> [ 103.834350][ T2443] new_sync_write (fs/read_write.c:505 (discriminator 1))
>>> [ 103.838846][ T2443] vfs_write (fs/read_write.c:591)
>>> [ 103.842910][ T2443] ksys_write (fs/read_write.c:644)
>>> [ 103.846975][ T2443] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
>>> [ 103.851217][ T2443] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
>>> [  103.856932][ T2443]
>>> [  103.859100][ T2443] Freed by task 2443:
>>> [ 103.862907][ T2443] kasan_save_stack (mm/kasan/common.c:39)
>>> [ 103.867415][ T2443] kasan_set_track (mm/kasan/common.c:45)
>>> [ 103.871822][ T2443] kasan_set_free_info (mm/kasan/generic.c:372)
>>> [ 103.876579][ T2443] __kasan_slab_free (mm/kasan/common.c:368 mm/kasan/common.c:328 mm/kasan/common.c:374)
>>> [ 103.881331][ T2443] slab_free_freelist_hook (mm/slub.c:1754)
>>> [ 103.886517][ T2443] kfree (mm/slub.c:3510 mm/slub.c:4552)
>>> [ 103.890156][ T2443] fw_dev_release (drivers/base/firmware_loader/sysfs.c:102)
>>> [ 103.894483][ T2443] device_release (drivers/base/core.c:2235)
>>> [ 103.898902][ T2443] kobject_cleanup (lib/kobject.c:677)
>>> [ 103.903492][ T2443] firmware_upload_unregister (drivers/base/firmware_loader/sysfs_upload.c:395)
>>> [ 103.909034][ T2443] upload_unregister_store (lib/test_firmware.c:1060 lib/test_firmware.c:1321)
>>> [ 103.914311][ T2443] kernfs_fop_write_iter (fs/kernfs/file.c:294)
>>> [ 103.919429][ T2443] new_sync_write (fs/read_write.c:505 (discriminator 1))
>>> [ 103.923927][ T2443] vfs_write (fs/read_write.c:591)
>>> [ 103.927990][ T2443] ksys_write (fs/read_write.c:644)
>>> [ 103.932054][ T2443] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
>>> [ 103.936290][ T2443] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
>>> [  103.941992][ T2443]
>>> [  103.944159][ T2443] Last potentially related work creation:
>>> [ 103.949704][ T2443] kasan_save_stack (mm/kasan/common.c:39)
>>> [ 103.954216][ T2443] __kasan_record_aux_stack (mm/kasan/generic.c:348)
>>> [ 103.959400][ T2443] insert_work (include/linux/instrumented.h:71 include/asm-generic/bitops/instrumented-non-atomic.h:134 kernel/workqueue.c:635 kernel/workqueue.c:642 kernel/workqueue.c:1361)
>>> [ 103.963552][ T2443] __queue_work (kernel/workqueue.c:1520)
>>> [ 103.967888][ T2443] queue_work_on (kernel/workqueue.c:1546)
>>> [ 103.972141][ T2443] fw_upload_start (drivers/base/firmware_loader/sysfs_upload.c:263)
>>> [ 103.976723][ T2443] firmware_loading_store (drivers/base/firmware_loader/sysfs.c:213)
>>> [ 103.981910][ T2443] kernfs_fop_write_iter (fs/kernfs/file.c:294)
>>> [ 103.987022][ T2443] new_sync_write (fs/read_write.c:505 (discriminator 1))
>>> [ 103.991537][ T2443] vfs_write (fs/read_write.c:591)
>>> [ 103.995604][ T2443] ksys_write (fs/read_write.c:644)
>>> [ 103.999673][ T2443] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
>>> [ 104.003930][ T2443] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
>>> [  104.009631][ T2443]
>>> [  104.011800][ T2443] Second to last potentially related work creation:
>>> [ 104.018219][ T2443] kasan_save_stack (mm/kasan/common.c:39)
>>> [ 104.022727][ T2443] __kasan_record_aux_stack (mm/kasan/generic.c:348)
>>> [ 104.027938][ T2443] insert_work (include/linux/instrumented.h:71 include/asm-generic/bitops/instrumented-non-atomic.h:134 kernel/workqueue.c:635 kernel/workqueue.c:642 kernel/workqueue.c:1361)
>>> [ 104.032101][ T2443] __queue_work (kernel/workqueue.c:1520)
>>> [ 104.036423][ T2443] queue_work_on (kernel/workqueue.c:1546)
>>> [ 104.040658][ T2443] fw_upload_start (drivers/base/firmware_loader/sysfs_upload.c:263)
>>> [ 104.045240][ T2443] firmware_loading_store (drivers/base/firmware_loader/sysfs.c:213)
>>> [ 104.050423][ T2443] kernfs_fop_write_iter (fs/kernfs/file.c:294)
>>> [ 104.055522][ T2443] new_sync_write (fs/read_write.c:505 (discriminator 1))
>>> [ 104.060016][ T2443] vfs_write (fs/read_write.c:591)
>>> [ 104.064081][ T2443] ksys_write (fs/read_write.c:644)
>>> [ 104.068144][ T2443] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
>>> [ 104.072381][ T2443] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
>>> [  104.078083][ T2443]
>>> [  104.080249][ T2443] The buggy address belongs to the object at ffff8881e186c800
>>> [  104.080249][ T2443]  which belongs to the cache kmalloc-512 of size 512
>>> [  104.094049][ T2443] The buggy address is located 8 bytes inside of
>>> [  104.094049][ T2443]  512-byte region [ffff8881e186c800, ffff8881e186ca00)
>>> [  104.106914][ T2443]
>>> [  104.109084][ T2443] The buggy address belongs to the physical page:
>>> [  104.115315][ T2443] page:0000000037a5888d refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1e1868
>>> [  104.125336][ T2443] head:0000000037a5888d order:3 compound_mapcount:0 compound_pincount:0
>>> [  104.133454][ T2443] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
>>> [  104.141498][ T2443] raw: 0017ffffc0010200 ffffea0005491e00 dead000000000002 ffff888100042c80
>>> [  104.149885][ T2443] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
>>> [  104.158274][ T2443] page dumped because: kasan: bad access detected
>>> [  104.164492][ T2443]
>>>
>>>
>>> To reproduce:
>>>
>>>          git clone https://github.com/intel/lkp-tests.git
>>>          cd lkp-tests
>>>          sudo bin/lkp install job.yaml           # job file is attached in this email
>> I have tried these steps on Fedora35 and on CentOS Stream. In both
>> cases I have missing packages that I have not yet resolved:
>>
>> Error: Unable to find a match: arping lib32gcc-dev libc6-dev-i386 libc6-i386 libc6-x32 libhugetlbfs-dev libmnl-dev libmount-dev libpci3 libreadline-dev libx32asan5 libx32atomic1 libx32gcc1 libx32gcc-dev libx32gomp1 libx32itm1 libx32quadmath0 libx32ubsan1 linux-libc-dev-amd64-cross netcat-openbsd openvswitch-common openvswitch-switch sendip libpci-dev
>>
>> Simply running the fw_upload selftests in a loop is not sufficient to
>> trigger the error. Can you provide additional instructions for
>> running the lkp tests manually? Do I need a specific OS? How can I
>> access the missing packages?
>
> We can reproduce this error on bare metal by simply running fw_upload.sh.
>
> ~/linux/tools/testing/selftests/firmware# ./fw_upload.sh
> ./fw_upload.sh: firmware upload cancellation works
> ./fw_upload.sh: firmware upload error handling works
> ./fw_upload.sh: oversized firmware error handling works
> ./fw_upload.sh: firmware upload for fw1 works
> ./fw_upload.sh: firmware upload for fw2 works
> ./fw_upload.sh: firmware upload for fw3 works
>
> Message from syslogd@debian-x8664 at Sep  1 05:06:54 ...
>  kernel:[ 1090.872590][ T1293] Kernel panic - not syncing: Fatal exception
>
> dmesg read from serial:
>
> [ 1089.654274][ T1293] ==================================================================
> [ 1089.662192][ T1293] BUG: KASAN: use-after-free in firmware_upload_unregister+0x16e/0x1c0
> [ 1089.670282][ T1293] Read of size 8 at addr ffff88873a872008 by task fw_upload.sh/1293
> [ 1089.678107][ T1293]
> [ 1089.680291][ T1293] CPU: 4 PID: 1293 Comm: fw_upload.sh Not tainted 5.18.0-rc2-00036-ga37ddddd8603 #1
> [ 1089.689527][ T1293] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
> [ 1089.697612][ T1293] Call Trace:
> [ 1089.700751][ T1293]  <TASK>
> [ 1089.703557][ T1293]  ? firmware_upload_unregister+0x16e/0x1c0
> [ 1089.709313][ T1293]  dump_stack_lvl+0x45/0x59
> [ 1089.713669][ T1293]  print_address_description.constprop.0+0x1f/0x200
> [ 1089.720107][ T1293]  ? firmware_upload_unregister+0x16e/0x1c0
> [ 1089.725851][ T1293]  print_report.cold+0x55/0x22c
> [ 1089.730556][ T1293]  ? do_raw_spin_lock+0x12e/0x280
> [ 1089.735432][ T1293]  kasan_report+0xbe/0x1c0
>
>
> Could you please help check if the .config file used to compile the kernel
> matches the one we attached in the original report? Here we attach it again
> for your reference.
>
>
> About the issue of missing packages during setting up lkp test environment,
> we wish to support various OS and distributions, but sometimes our package
> dependencies are not updated in time, sorry for the inconvenience. We use
> debian OS in our test environment, and it can install the required packages
> successfully. We will fix the package issue on other OS soon. For this case,
> we still recommend to run fw_upload.sh to trigger the error, because it's
> much easier than setting up lkp tests environment.
>
> --
> Thanks,
> Yujie
>
>>
>> Thanks,
>> - Russ
>>>          bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
>>>          sudo bin/lkp run generated-yaml-file
>>>
>>>          # if come across any failure that blocks the test,
>>>          # please remove ~/.lkp and /lkp dir to run from a clean state.
>>>
>>>
>>>
>>