Re: [PATCH v2 1/2] dmaengine: xilinx: dpdma: Fix race condition in vsync IRQ

From: Sean Anderson
Date: Tue Mar 12 2024 - 13:46:55 EST


Hi Vishal,

On 2/27/24 23:21, Vishal Sagar wrote:
> From: Neel Gandhi <neel.gandhi@xxxxxxxxxx>
>
> The vchan_next_desc() function, called from
> xilinx_dpdma_chan_queue_transfer(), must be called with
> virt_dma_chan.lock held. This isn't correctly handled in all code paths,
> resulting in a race condition between the .device_issue_pending()
> handler and the IRQ handler which causes DMA to randomly stop. Fix it by
> taking the lock around xilinx_dpdma_chan_queue_transfer() calls that are
> missing it.
>
> Signed-off-by: Neel Gandhi <neel.gandhi@xxxxxxx>
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xxxxxxx>
> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@xxxxxxxxxxxxxxxx>
> Signed-off-by: Vishal Sagar <vishal.sagar@xxxxxxx>
>
> Link: https://lore.kernel.org/all/20220122121407.11467-1-neel.gandhi@xilinx.com
> ---
> drivers/dma/xilinx/xilinx_dpdma.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma/xilinx/xilinx_dpdma.c b/drivers/dma/xilinx/xilinx_dpdma.c
> index b82815e64d24..28d9af8f00f0 100644
> --- a/drivers/dma/xilinx/xilinx_dpdma.c
> +++ b/drivers/dma/xilinx/xilinx_dpdma.c
> @@ -1097,12 +1097,14 @@ static void xilinx_dpdma_chan_vsync_irq(struct xilinx_dpdma_chan *chan)
>  	 * Complete the active descriptor, if any, promote the pending
>  	 * descriptor to active, and queue the next transfer, if any.
>  	 */
> +	spin_lock(&chan->vchan.lock);
>  	if (chan->desc.active)
>  		vchan_cookie_complete(&chan->desc.active->vdesc);
>  	chan->desc.active = pending;
>  	chan->desc.pending = NULL;
>
>  	xilinx_dpdma_chan_queue_transfer(chan);
> +	spin_unlock(&chan->vchan.lock);
>
>  out:
>  	spin_unlock_irqrestore(&chan->lock, flags);
> @@ -1264,10 +1266,12 @@ static void xilinx_dpdma_issue_pending(struct dma_chan *dchan)
>  	struct xilinx_dpdma_chan *chan = to_xilinx_chan(dchan);
>  	unsigned long flags;
>
> -	spin_lock_irqsave(&chan->vchan.lock, flags);
> +	spin_lock_irqsave(&chan->lock, flags);
> +	spin_lock(&chan->vchan.lock);
>  	if (vchan_issue_pending(&chan->vchan))
>  		xilinx_dpdma_chan_queue_transfer(chan);
> -	spin_unlock_irqrestore(&chan->vchan.lock, flags);
> +	spin_unlock(&chan->vchan.lock);
> +	spin_unlock_irqrestore(&chan->lock, flags);
>  }
>
>  static int xilinx_dpdma_config(struct dma_chan *dchan,
> @@ -1495,7 +1499,9 @@ static void xilinx_dpdma_chan_err_task(struct tasklet_struct *t)
>  			   XILINX_DPDMA_EINTR_CHAN_ERR_MASK << chan->id);
>
>  	spin_lock_irqsave(&chan->lock, flags);
> +	spin_lock(&chan->vchan.lock);
>  	xilinx_dpdma_chan_queue_transfer(chan);
> +	spin_unlock(&chan->vchan.lock);
>  	spin_unlock_irqrestore(&chan->lock, flags);
>  }

I also ran into this issue and came up with the same fix [1].
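
For anyone following along, the lock nesting the patch establishes (chan->lock
outside, chan->vchan.lock inside) can be sketched in userspace roughly as
below. This is only an illustration under stated assumptions, not driver code:
every sketch_* name is hypothetical, the spinlock is a toy built on C11
atomics, and the "held" check is much cruder than the kernel's
lockdep_assert_held(), since it cannot tell who holds the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel spinlock. */
struct sketch_lock {
	atomic_bool held;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	/* Spin until this caller flips held from false to true. */
	while (atomic_exchange(&l->held, true))
		;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_store(&l->held, false);
}

/* Hypothetical channel with the two locks discussed in the patch. */
struct sketch_chan {
	struct sketch_lock lock;       /* stands in for chan->lock */
	struct sketch_lock vchan_lock; /* stands in for chan->vchan.lock */
	int issued;     /* descriptors issued but not yet queued */
	int queued;     /* descriptors handed to the imaginary hardware */
	int violations; /* calls made while vchan_lock was not held */
};

/* Expects vchan_lock to be held; records a violation otherwise. */
static void sketch_queue_transfer(struct sketch_chan *c)
{
	if (!atomic_load(&c->vchan_lock.held)) {
		c->violations++;
		return;
	}
	if (c->issued > 0) {
		c->issued--;
		c->queued++;
	}
}

/* Pre-patch shape of the vsync path: only the outer lock is taken. */
static void sketch_vsync_unfixed(struct sketch_chan *c)
{
	sketch_lock_acquire(&c->lock);
	sketch_queue_transfer(c);
	sketch_lock_release(&c->lock);
}

/* Patched shape: outer lock first, inner lock second, released in reverse. */
static void sketch_vsync_fixed(struct sketch_chan *c)
{
	sketch_lock_acquire(&c->lock);
	sketch_lock_acquire(&c->vchan_lock);
	sketch_queue_transfer(c);
	sketch_lock_release(&c->vchan_lock);
	sketch_lock_release(&c->lock);
}
```

Every path takes the two locks in the same order, which is what keeps the
nesting from deadlocking when the issue_pending and IRQ paths run
concurrently.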

Reviewed-by: Sean Anderson <sean.anderson@xxxxxxxxx>

[1] https://lore.kernel.org/dmaengine/20240308210034.3634938-2-sean.anderson@xxxxxxxxx/