Re: [PATCH] gpio: gpiolib-cdev: Fix potential &lr->wait.lock deadlock issue

From: Kent Gibson
Date: Mon Jun 26 2023 - 03:23:37 EST


On Sun, Jun 25, 2023 at 02:45:12PM +0000, YE Chengfeng wrote:
> linereq_put_event is called from both interrupt context (e.g.,
> edge_irq_thread) and process context (process_hw_ts_thread).
> Therefore, interrupt should be disabled before acquiring lock
> &lr->wait.lock inside linereq_put_event to avoid deadlock when
> the lock is held in process context and edge_irq_thread comes.
>
> Similarly, linereq_read_unlocked running in process context
> also acquies the same lock. It also need to disable interrupt
> otherwise deadlock could happen if the irq edge_irq_thread
> comes to execution while the lock is held.
>

So, in both cases, a process context holding the lock is interrupted, on
the same CPU, and the edge_irq_thread() deadlocks on that lock, as the
interrupted process holds the lock and cannot proceed.
That makes sense to me, but it would be good for Bart to confirm as he
knows a lot more about the kfifo locking than I do.

Note that the same problem also exists in lineevent_read_unlocked() - the
uAPI v1 equivalent of linereq_read_unlocked().

> Fix the two potential deadlock issues by spin_lock_irqsave.
>

spin_lock_bh() should be sufficient, given that edge_irq_thread() is run
in a softirq? That is faster and would allow the hard irq handlers to
still run, and timestamp the event, but inhibit the edge_irq_thread()
from being called on that CPU until the lock is released.
(hmmm, gpio_desc_to_lineinfo() also uses spin_lock_irqsave() but it is
never called from hard irq context, so there is a good chance I'm missing
something here??)
More on spin_lock choice below.

This should have a Fixes tag.
For v2, it has been there since it was added, so:

73e0341992b6 ("gpiolib: cdev: support edge detection for uAPI v2")

And it also applies to lineevent_read_unlocked() from uAPI v1, so there
should be a separate fix for that, or at least a separate tag.

I looks to me that it was first introduced in uAPI v1 here:

dea9c80ee672 ("gpiolib: rework the locking mechanism for lineevent kfifo")

> Signed-off-by: Chengfeng Ye <cyeaa@xxxxxxxxxxxxxx>
> ---
> drivers/gpio/gpiolib-cdev.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
> index 0a33971c964c..714631fde9a8 100644
> --- a/drivers/gpio/gpiolib-cdev.c
> +++ b/drivers/gpio/gpiolib-cdev.c
> @@ -614,14 +614,15 @@ static void linereq_put_event(struct linereq *lr,
> struct gpio_v2_line_event *le)
> {
> bool overflow = false;
> + unsigned long flags;
>
> - spin_lock(&lr->wait.lock);
> + spin_lock_irqsave(&lr->wait.lock, flags);

linereq_put_event() is never called from hard irq context, so
spin_lock_irq() or spin_lock_bh() should suffice?

> if (kfifo_is_full(&lr->events)) {
> overflow = true;
> kfifo_skip(&lr->events);
> }
> kfifo_in(&lr->events, le, 1);
> - spin_unlock(&lr->wait.lock);
> + spin_unlock_irqrestore(&lr->wait.lock, flags);
> if (!overflow)
> wake_up_poll(&lr->wait, EPOLLIN);
> else
> @@ -1505,6 +1506,7 @@ static ssize_t linereq_read_unlocked(struct file *file, char __user *buf,
> struct linereq *lr = file->private_data;
> struct gpio_v2_line_event le;
> ssize_t bytes_read = 0;
> + unsigned long flags;
> int ret;
>
> if (!lr->gdev->chip)
> @@ -1514,28 +1516,28 @@ static ssize_t linereq_read_unlocked(struct file *file, char __user *buf,
> return -EINVAL;
>
> do {
> - spin_lock(&lr->wait.lock);
> + spin_lock_irqsave(&lr->wait.lock, flags);

linereq_read_unlocked() is only ever called in process context, so this
could be spin_lock_irq() or even spin_lock_bh()?

> if (kfifo_is_empty(&lr->events)) {
> if (bytes_read) {
> - spin_unlock(&lr->wait.lock);
> + spin_unlock_irqrestore(&lr->wait.lock, flags);
> return bytes_read;
> }
>
> if (file->f_flags & O_NONBLOCK) {
> - spin_unlock(&lr->wait.lock);
> + spin_unlock_irqrestore(&lr->wait.lock, flags);
> return -EAGAIN;
> }
>
> ret = wait_event_interruptible_locked(lr->wait,
> !kfifo_is_empty(&lr->events));

wait_event_interruptible_locked() works with locks that are
spin_lock()/spin_unlock(), so this will leave irqs disabled while
waiting for a new event??

And while there is a wait_event_interruptible_locked_irq(), there is
no wait_event_interruptible_locked_bh() form that I can see, so using
spin_lock_bh() would require some extra work.

> if (ret) {
> - spin_unlock(&lr->wait.lock);
> + spin_unlock_irqrestore(&lr->wait.lock, flags);
> return ret;
> }
> }
>
> ret = kfifo_out(&lr->events, &le, 1);
> - spin_unlock(&lr->wait.lock);
> + spin_unlock_irqrestore(&lr->wait.lock, flags);
> if (ret != 1) {
> /*
> * This should never happen - we were holding the
> --
> 2.17.1

Anyway, good catch.

Cheers,
Kent.