Re: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

From: Marek Szyprowski
Date: Fri Apr 29 2022 - 10:51:18 EST


On 29.04.2022 16:16, Krzysztof Kozlowski wrote:
> On 29/04/2022 14:29, Marek Szyprowski wrote:
>> On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
>>> The driver_override field from platform driver should not be initialized
>>> from static memory (string literal) because the core later kfree() it,
>>> for example when driver_override is set via sysfs.
>>>
>>> Use dedicated helper to set driver_override properly.
>>>
>>> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
>>> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
>>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
>>> Reviewed-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>
>> This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
>> Fix kfree() of static memory on setting driver_override"). In my tests I
>> found that it triggers the following issue during boot of the
>> DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):
>>
>> ------------[ cut here ]------------
>> DEBUG_LOCKS_WARN_ON(lock->magic != lock)
>> WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
>> __mutex_lock+0x1ec/0x430
>> Modules linked in:
>> CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
>> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
>> Workqueue: events_unbound deferred_probe_work_func
>> pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : __mutex_lock+0x1ec/0x430
>> lr : __mutex_lock+0x1ec/0x430
>> ..
>> Call trace:
>>  __mutex_lock+0x1ec/0x430
>>  mutex_lock_nested+0x38/0x64
>>  driver_set_override+0x124/0x150
>>  qcom_smd_register_edge+0x2a8/0x4ec
>>  qcom_smd_probe+0x54/0x80
>>  platform_probe+0x68/0xe0
>>  really_probe.part.0+0x9c/0x29c
>>  __driver_probe_device+0x98/0x144
>>  driver_probe_device+0xac/0x14c
>>  __device_attach_driver+0xb8/0x120
>>  bus_for_each_drv+0x78/0xd0
>>  __device_attach+0xd8/0x180
>>  device_initial_probe+0x14/0x20
>>  bus_probe_device+0x9c/0xa4
>>  deferred_probe_work_func+0x88/0xc4
>>  process_one_work+0x288/0x6bc
>>  worker_thread+0x248/0x450
>>  kthread+0x118/0x11c
>>  ret_from_fork+0x10/0x20
>> irq event stamp: 3599
>> hardirqs last  enabled at (3599): [<ffff80000919053c>]
>> _raw_spin_unlock_irqrestore+0x98/0x9c
>> hardirqs last disabled at (3598): [<ffff800009190ba4>]
>> _raw_spin_lock_irqsave+0xc0/0xcc
>> softirqs last  enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
>> softirqs last disabled at (3549): [<ffff8000080a4514>]
>> __irq_exit_rcu+0x180/0x1ac
>> ---[ end trace 0000000000000000 ]---
>>
>> I don't see any direct relation between the $subject and the above log,
>> but reverting the $subject on top of linux next-20220429 hides/fixes it.
>> Maybe there is a kind of memory trashing somewhere there and your change
>> only revealed it?
> Thanks for the report. I think the error path of my patch is wrong - I
> should not kfree(rpdev->driver_override) from the rpmsg code. That's the
> only thing I see now...
>
> Could you test following patch and tell if it helps?
> https://pastebin.ubuntu.com/p/rp3q9Z5fXj/

This doesn't help, the issue is still reported.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland