Re: Possible race at user mode helper in request_firmware

From: Ming Lei
Date: Tue Mar 15 2016 - 22:06:46 EST


On Tue, Mar 15, 2016 at 11:09 PM, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote:
> Hi,
>
> On Mon, Mar 14, 2016 at 8:00 PM, <clingutla@xxxxxxxxxxxxxx> wrote:
>> Hello,
>>
>> I see possible race in _request_firmware_load function, on which I wanted to
>> take your opinion.
>>
>> When system is going to low power mode, device_cache_fw_images() is called
>> from pm notifier which schedules asynchronous workers to cache devices
>> firmware.
>> If more than 2 async requests falls under user helper mode, then there is a
>> use after free of "firmware" directory kobject.
>
> If two requests are for getting same firmware image, only one can be sent
> out and be completed with user helper, please see fw_lookup_and_allocate_buf(),
> so the race shouldn't have been triggered in this case.
>
> So I guess your test must be involved with at least two different requests.
>
>>
>> The race will hit, when one asynchronous worker calls device_del() while
>> other asynchronous worker calls get_device_parent() in
>> _request_firmware_load().
>> After loading firmware, first worker calls device_del(), and "firmware"
>> directory ref count is 1, so cleanup_glue_dir() calls kobject_put() with
>> gdp_mutex lock held.
>> Meanwhile, second async worker calls device_add(), and get_device_parent()
>> is still able to find the firmware directory kobject in ksets after getting
>> gdp_mutex.
>> This kobject gets added as parent to the second device kobject.
>> By the time of accessing the parent kobject, its ref count was 0 and prints
>> below warning stack followed by crash.
>
> The trick here is just that we don't use 'struct device' in
> device_cache_fw_images(),
> then the firmware device's parent is null during calling device_add() from
> _request_firmware_load(), but this should be allowed by driver core.
>
> From firmware class view, both device_add()/device_del() is run with pair,
> and driver core still complains during get_device_parent(), so something
> inside driver core should be be wrong, IMO.

Thinking about the issue further, I believe it is a false positive which is
caused by CONFIG_DEBUG_KOBJECT_RELEASE, follows the story:

1) device_add() path

get_device_parent()
......
mutex_lock(&gdp_mutex);

/* find our class-directory at the parent and reference it */
spin_lock(&dev->class->p->glue_dirs.list_lock);
list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
if (k->parent == parent_kobj) {
kobj = kobject_get(k);
break;
}
spin_unlock(&dev->class->p->glue_dirs.list_lock);
if (kobj) {
mutex_unlock(&gdp_mutex);
return kobj;
}
k = class_dir_create_and_add(dev->class, parent_kobj);
mutex_unlock(&gdp_mutex);
......

2) device_del() path

cleanup_device_parent()
->cleanup_glue_dir()
->mutex_lock(&gdp_mutex);
->kobject_put(glue_dir);
->kobject_cleanup()
->kobject_del()
->kobj_kset_leave()
->mutex_unlock(&gdp_mutex);

So gdp_mutex is held wrt. both looking up/adding the parent kobject
and removing the parent kobject. Actually with this lock, both glur_dir
kobject's referece counting and joining/leaving kset are run atomically.

The race you found is triggered by CONFIG_DEBUG_KOBJECT_RELEASE,
which causes kobject_cleanup() to be run in workqueue context and the
lock of gdp_mutex can't be hold in that situation anymore. That is the root
cause for your race.

That also said DEBUG_KOBJECT_RELEASE might cause trouble in
some situations.

Thanks,
Ming

>
>>
>> [21147.455038] ------------[ cut here ]------------
>> [21147.458910] WARNING: at <...>/kernel/include/linux/kref.h:47
>> kobject_get+0x50/0x68()
>> [21147.481199] Modules linked in: core_ctl(PO) qdrbg_module(O)
>> qcrypto_module(O)
>> [21147.481256] CPU: 2 PID: 23935 Comm: kworker/u16:8 Tainted: P W O
>> 3.10.84-g0957845-00427-g56a05c2 #1
>> [21147.481284] Workqueue: events_unbound async_run_entry_fn
>> [21147.481296] Call trace:
>> [21147.481315] [<ffffffc000206b60>] dump_backtrace+0x0/0x270
>> [21147.481331] [<ffffffc000206de0>] show_stack+0x10/0x1c
>> [21147.481353] [<ffffffc000d569ec>] dump_stack+0x1c/0x28
>> [21147.481370] [<ffffffc00021cc18>] warn_slowpath_common+0x74/0x9c
>> [21147.481383] [<ffffffc00021ce4c>] warn_slowpath_null+0x14/0x20
>> [21147.481397] [<ffffffc000456b10>] kobject_get+0x4c/0x68
>> [21147.481416] [<ffffffc000609f64>] get_device_parent+0xa0/0x194
>> [21147.481430] [<ffffffc00060a3f4>] device_add+0x100/0x600
>> [21147.481446] [<ffffffc00061c390>] _request_firmware+0x8b4/0xa80
>> [21147.481459] [<ffffffc00061c58c>] request_firmware+0x30/0x3c
>> [21147.481473] [<ffffffc00061c5ec>] cache_firmware+0x54/0xb0
>> [21147.481490] [<ffffffc00061c65c>] __async_dev_cache_fw_image+0x14/0x54
>> [21147.481505] [<ffffffc000243794>] async_run_entry_fn+0x6c/0x12c
>> [21147.481521] [<ffffffc000237e88>] process_one_work+0x264/0x3dc
>> [21147.481535] [<ffffffc0002392c0>] worker_thread+0x1f0/0x340
>> [21147.481553] [<ffffffc00023e500>] kthread+0xac/0xb8
>> [21147.481563] ---[ end trace dabc98ea48b8ba59 ]---
>>
>> [21147.486436] Unable to handle kernel paging request at virtual address
>> 0x6b6b6b6b6b6b6bcb
>> [21147.493816] pgd = ffffffc0a982d000
>> [21147.498337] [6b6b6b6b6b6b6bcb] *pgd=0000000000000000
>> [21147.502287] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>> [21147.507826] Modules linked in: core_ctl(PO) qdrbg_module(O)
>> qcrypto_module(O)
>> [21147.514951] CPU: 2 PID: 23935 Comm: kworker/u16:8 Tainted: P W O
>> 3.10.84-g0957845-00427-g56a05c2 #1
>> [21147.524686] Workqueue: events_unbound async_run_entry_fn
>> [21147.529965] task: ffffffc008138000 ti: ffffffc007880000 task.ti:
>> ffffffc007880000
>> [21147.537438] PC is at sysfs_create_dir+0x34/0xd4
>> [21147.541949] LR is at kobject_add_internal+0x120/0x24c
>> [21147.546979] pc : [<ffffffc000361fa8>] lr : [<ffffffc000456f5c>] pstate:
>> 80000145
>> [21147.996295] Process kworker/u16:8 (pid: 23935, stack limit =
>> 0xffffffc007880058)
>> [21148.003673] Call trace:
>> [21148.006117] [<ffffffc000361fa8>] sysfs_create_dir+0x34/0xd4
>> [21148.011663] [<ffffffc000456f58>] kobject_add_internal+0x11c/0x24c
>> [21148.017737] [<ffffffc0004573ec>] kobject_add+0xc8/0xe4
>> [21148.022863] [<ffffffc00060a410>] device_add+0x11c/0x600
>> [21148.028069] [<ffffffc00061c390>] _request_firmware+0x8b4/0xa80
>> [21148.033886] [<ffffffc00061c58c>] request_firmware+0x30/0x3c
>> [21148.039439] [<ffffffc00061c5ec>] cache_firmware+0x54/0xb0
>> [21148.044822] [<ffffffc00061c65c>] __async_dev_cache_fw_image+0x14/0x54
>> [21148.051249] [<ffffffc000243794>] async_run_entry_fn+0x6c/0x12c
>> [21148.057064] [<ffffffc000237e88>] process_one_work+0x264/0x3dc
>> [21148.062792] [<ffffffc0002392c0>] worker_thread+0x1f0/0x340
>> [21148.068263] [<ffffffc00023e500>] kthread+0xac/0xb8
>> [21148.073039] Code: b5000094 14000026 d000ac54 9125a294 (7940c281)
>> [21148.080312] ---[ end trace dabc98ea48b8ba5a ]---
>>
>> Below is the callstack where the parent is freed from the slub_debug info.
>> 0xFFFFFFC000608D74 \\vmlinux\base/core\class_dir_release+0x0C
>> 0xFFFFFFC0002FE6BC \\vmlinux\slub\free_debug_processing\fail+0x114
>> 0xFFFFFFC0002FE7A0 \\vmlinux\slub\__slab_free+0x44
>> 0xFFFFFFC0002FEE60 \\vmlinux\slub\kfree+0x1F8
>> 0xFFFFFFC000608D70 \\vmlinux\base/core\class_dir_release+0x8
>> 0xFFFFFFC000456D70 \\vmlinux\kobject\kobject_release+0x134
>> 0xFFFFFFC000456E10 \\vmlinux\kobject\kobject_put+0x58
>> 0xFFFFFFC000456C28 \\vmlinux\kobject\kobject_del+0x64
>> 0xFFFFFFC000609D78 \\vmlinux\base/core\device_del+0x150
>> 0xFFFFFFC00061C524 \\vmlinux\firmware_class\_request_firmware\out+0x71C
>> 0xFFFFFFC00061C58C \\vmlinux\firmware_class\request_firmware+0x30
>> 0xFFFFFFC00061C5EC \\vmlinux\firmware_class\cache_firmware+0x54
>> 0xFFFFFFC00061C65C \\vmlinux\firmware_class\__async_dev_cache_fw_image+0x14
>> 0xFFFFFFC000243794 \\vmlinux\async\async_run_entry_fn+0x6C
>> 0xFFFFFFC000237E88 \\vmlinux\workqueue\process_one_work+0x264
>> 0xFFFFFFC0002392C0 \\vmlinux\workqueue\worker_thread\recheck+0x1A0
>> 0xFFFFFFC00023E500 \\vmlinux\kthread\kthread+0xAC
>>
>> For similar type of issue, I see there is an existing fix: "sysfs: driver
>> core: Fix glue dir race condition by gdp_mutex".
>> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/base/core.c?id=e4a60d139060975eb956717e4f63ae348d4d8cc5
>>
>> Still I am able to reproduce the issue. I have verified this race on
>> kernels: 3.10, 3.18 and 4.4.
>>
>> I followed below procedure to reproduce the issue:
>> 1. Enable "CONFIG_DEBUG_KOBJECT_RELEASE"
>> 2. Use test_firmware, modified the test for async calls
>> 3. Replace WARN_ON with BUG_ON in kref_get().
>> I ran below script from shell:
>> count=0
>> while [ 1 ]
>> do
>> echo 3 > /sys/devices/virtual/misc/test_firmware/trigger_ufw
>> if [ $? -ne 0 ]; then
>> print "Exiting.. "
>> exit 1
>> fi
>> count=$(($count +1))
>> echo "count $count"
>> done
>>
>> With below patch, I could reproduce the issue in the 2nd iteration.
>
> I will run your test script to see if there is new findings.
>
>>
>> -------------------------8<----------------------------------------------
>>
>> To reproduce the race condition, use firmware test and
>> CONFIG_DEBUG_KOBJECT_RELEASE=y, which delays kobject
>> releases. So that we can catch the bug easily.
>>
>> Enabled below flags in kernel config file:
>>
>> +CONFIG_DEBUG_KOBJECT_RELEASE=y
>> +CONFIG_TEST_FIRMWARE=y
>
> Looks I can't reproduce the issue with your approach.
>
>>
>>
>> diff --git a/include/linux/kref.h b/include/linux/kref.h
>> index 484604d..727fb24 100644
>> --- a/include/linux/kref.h
>> +++ b/include/linux/kref.h
>> @@ -44,7 +44,7 @@ static inline void kref_get(struct kref *kref)
>> * condition when this kref is freeing by some other thread right
>> now.
>> * In this case one should use kref_get_unless_zero()
>> */
>> - WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);
>> + BUG_ON(atomic_inc_return(&kref->refcount) < 2);
>> }
>>
>> /**
>> diff --git a/lib/test_firmware.c b/lib/test_firmware.c
>> old mode 100644
>> new mode 100755
>> index 86374c1..14c9598
>> --- a/lib/test_firmware.c
>> +++ b/lib/test_firmware.c
>> @@ -18,6 +18,7 @@
>> #include <linux/miscdevice.h>
>> #include <linux/slab.h>
>> #include <linux/uaccess.h>
>> +#include <linux/async.h>
>>
>> +
>> +static ASYNC_DOMAIN_EXCLUSIVE(ufw_domain);
>> +
>> +static void __async_ufw_work(void *name,
>> + async_cookie_t cookie)
>> +{
>> + const struct firmware *fw;
>> + const char *fw_name = name;
>> + int ret;
>> +
>> + pr_err("requesting %s\n", fw_name);
>> +
>> + ret = request_firmware(&fw, fw_name, NULL);
>> + pr_err("loaded: %zu\n", fw ? fw->size : 0);
>> + if (!ret)
>> + kfree(fw);
>> +
>> +}
>> +
>> +static char *name[] = { "fw1.bin", "fw2.bin"};
>> +
>> +static ssize_t trigger_ufw_store(struct device *dev,
>> + struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + int rc, i, c=0;
>> + u32 start;
>> + char *fw_name;
>> +
>> + rc = kstrtou32(buf, 0, &start);
>> + if (rc){
>> + pr_err("Invalid option\n");
>> + return rc;
>> + }
>> + pr_err(" no of iterations %d\n", start);
>> +
>> + for (i=0; i< start; i++)
>> + {
>> + if (c >= 2)
>> + c = 0;
>> + fw_name = name[c++];
>> + async_schedule_domain(__async_ufw_work, (void *)fw_name,
>> &ufw_domain);
>> + }
>> + async_synchronize_full_domain(&ufw_domain);
>> + return count;
>> +}
>>
>> static DEVICE_ATTR_WO(trigger_request);
>> +static DEVICE_ATTR_WO(trigger_ufw);
>>
>> static int __init test_firmware_init(void)
>> {
>> @@ -92,6 +140,12 @@ static int __init test_firmware_init(void)
>> goto dereg;
>> }
>>
>> + rc = device_create_file(test_fw_misc_device.this_device,
>> + &dev_attr_trigger_ufw);
>> + if (rc) {
>> + pr_err("could not create sysfs interface: %d\n", rc);
>> + goto dereg;
>> + }
>> pr_warn("interface ready\n");
>>
>> return 0;
>> @@ -106,7 +160,9 @@ static void __exit test_firmware_exit(void)
>> {
>> release_firmware(test_firmware);
>> device_remove_file(test_fw_misc_device.this_device,
>> - &dev_attr_trigger_request);
>> + &dev_attr_trigger_request);
>> + device_remove_file(test_fw_misc_device.this_device,
>> + &dev_attr_trigger_ufw);
>> misc_deregister(&test_fw_misc_device);
>> pr_warn("removed interface\n");
>> }
>>
>> ----------------------------8<--------------------------------------------
>>
>> Thanks and regards,
>> Chandrasekhar L.
>> --
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of
>> Code Aurora Forum,
>> a Linux Foundation Collaborative Project.