Re: [PATCH v2] i2c: rk3x: fix potential spinlock recursion on poll

From: Dmitry Osipenko
Date: Fri Dec 08 2023 - 07:17:38 EST


On 12/8/23 11:53, Jensen Huang wrote:
> On Fri, Dec 8, 2023 at 12:00 AM Dmitry Osipenko
> <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
>>
>> On 12/7/23 17:10, Dragan Simic wrote:
>>> On 2023-12-07 10:25, Jensen Huang wrote:
>>>> On Thu, Dec 7, 2023 at 4:37 PM Dragan Simic <dsimic@xxxxxxxxxxx> wrote:
>>>>>
>>>>> On 2023-12-07 09:21, Jensen Huang wrote:
>>>>>> Possible deadlock scenario (on reboot):
>>>>>> rk3x_i2c_xfer_common(polling)
>>>>>> -> rk3x_i2c_wait_xfer_poll()
>>>>>> -> rk3x_i2c_irq(0, i2c);
>>>>>> --> spin_lock(&i2c->lock);
>>>>>> ...
>>>>>> <rk3x i2c interrupt>
>>>>>> -> rk3x_i2c_irq(0, i2c);
>>>>>> --> spin_lock(&i2c->lock); (deadlock here)
>>>>>>
>>>>>> Store the IRQ number and disable/enable it around the polling
>>>>> transfer.
>>>>>> This patch has been tested on NanoPC-T4.
>>>>>
>>>>> In case you haven't already seen the related discussion linked below,
>>>>> please have a look. I also added more people to the list of recipients,
>>>>> in an attempt to make everyone aware of the different approaches to
>>>>> solving this issue.
>>>>>
>>>>> https://lore.kernel.org/all/655177f4.050a0220.d85c9.3ba0@xxxxxxxxxxxxx/T/#m6fc9c214452fec6681843e7f455978c35c6f6c8b
>>>>
>>>> Thank you for providing the information. I hadn't seen this link before.
>>>> After carefully looking into the related discussion, it appears that
>>>> Dmitry Osipenko is already working on a suitable patch. To avoid
>>>> duplication
>>>> or conflicts, my patch can be discarded.
>>>
>>> Thank you for responding so quickly. Perhaps it would be best to hear
>>> from Dmitry as well, before discarding anything. It's been a while
>>> since Dmitry wrote about working on the patch, so he might have
>>> abandoned it.
>>
>> This patch is okay. In general, will be better to have IRQ disabled by
>> default like I did in my variant, it should allow to remove the spinlock
>> entirely. Of course this also can be done later on in a follow up
>> patches. Jensen, feel free to use my variant of the patch, add my
>> s-o-b+co-developed tags to the commit msg if you'll do. Otherwise I'll
>> be able to send my patch next week.
>
> Thank you for the suggestion. I've updated the patch to your variant, and
> as confirmed by others, reboots are functioning correctly. I measured the
> overhead of enable_irq/disable_irq() by calculating ktime in the
> updated version,
> and on rk3399, the minimum delta I observed was 291/875 ns. This extra
> cost may impact most interrupt-based transfers. Therefore, I personally lean
> towards the current v2 patch and handle the spinlock and irqsave/restore in
> a follow up patch. I'd like to hear everyone's thoughts on this.

Please don't use ktime for perf measurements, ktime itself is a slow API
and it should be 200us that ktime takes itself. Also, the 0.2us is
practically nothing, especially compared to I2C transfers measured in ms.

I'm fine with keeping your v2 variant for the bug fix if you prefer
that. Thanks for addressing the issue :)

--
Best regards,
Dmitry