Re: [PATCH v2] i2c: rk3x: fix potential spinlock recursion on poll

From: Jensen Huang
Date: Fri Dec 08 2023 - 03:54:08 EST


On Fri, Dec 8, 2023 at 12:00 AM Dmitry Osipenko
<dmitry.osipenko@xxxxxxxxxxxxx> wrote:
>
> On 12/7/23 17:10, Dragan Simic wrote:
> > On 2023-12-07 10:25, Jensen Huang wrote:
> >> On Thu, Dec 7, 2023 at 4:37 PM Dragan Simic <dsimic@xxxxxxxxxxx> wrote:
> >>>
> >>> On 2023-12-07 09:21, Jensen Huang wrote:
> >>> > Possible deadlock scenario (on reboot):
> >>> > rk3x_i2c_xfer_common(polling)
> >>> > -> rk3x_i2c_wait_xfer_poll()
> >>> > -> rk3x_i2c_irq(0, i2c);
> >>> > --> spin_lock(&i2c->lock);
> >>> > ...
> >>> > <rk3x i2c interrupt>
> >>> > -> rk3x_i2c_irq(0, i2c);
> >>> > --> spin_lock(&i2c->lock); (deadlock here)
> >>> >
> >>> > Store the IRQ number and disable/enable it around the polling
> >>> transfer.
> >>> > This patch has been tested on NanoPC-T4.
> >>>
> >>> In case you haven't already seen the related discussion linked below,
> >>> please have a look. I also added more people to the list of recipients,
> >>> in an attempt to make everyone aware of the different approaches to
> >>> solving this issue.
> >>>
> >>> https://lore.kernel.org/all/655177f4.050a0220.d85c9.3ba0@xxxxxxxxxxxxx/T/#m6fc9c214452fec6681843e7f455978c35c6f6c8b
> >>
> >> Thank you for providing the information. I hadn't seen this link before.
> >> After carefully looking into the related discussion, it appears that
> >> Dmitry Osipenko is already working on a suitable patch. To avoid
> >> duplication
> >> or conflicts, my patch can be discarded.
> >
> > Thank you for responding so quickly. Perhaps it would be best to hear
> > from Dmitry as well, before discarding anything. It's been a while
> > since Dmitry wrote about working on the patch, so he might have
> > abandoned it.
>
> This patch is okay. In general, will be better to have IRQ disabled by
> default like I did in my variant, it should allow to remove the spinlock
> entirely. Of course this also can be done later on in a follow up
> patches. Jensen, feel free to use my variant of the patch, add my
> s-o-b+co-developed tags to the commit msg if you'll do. Otherwise I'll
> be able to send my patch next week.

Thank you for the suggestion. I've updated the patch to your variant, and
as confirmed by others, reboots are functioning correctly. I measured the
overhead of enable_irq/disable_irq() by calculating ktime in the
updated version,
and on rk3399, the minimum delta I observed was 291/875 ns. This extra
cost may impact most interrupt-based transfers. Therefore, I personally lean
towards the current v2 patch and handle the spinlock and irqsave/restore in
a follow up patch. I'd like to hear everyone's thoughts on this.

--
Best regards,
Jensen