Re: [PATCH 3/8] dmaengine: hisilicon: Add multi-thread support for a DMA channel

From: haijie
Date: Mon Jun 27 2022 - 09:16:47 EST



在 2022/6/27 14:51, - 写道:
-----Original Message-----
From: Vinod Koul [mailto:vkoul@xxxxxxxxxx]
Sent: Monday, June 27, 2022 2:21 PM
To: haijie <haijie1@xxxxxxxxxx>
Cc: Wangzhou (B) <wangzhou1@xxxxxxxxxxxxx>; dmaengine@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [PATCH 3/8] dmaengine: hisilicon: Add multi-thread support for a DMA channel

On 25-06-22, 15:44, Jie Hai wrote:
When we get a DMA channel and try to use it in multiple threads it
will cause oops and hanging the system.

% echo 100 > /sys/module/dmatest/parameters/threads_per_chan
% echo 100 > /sys/module/dmatest/parameters/iterations
% echo 1 > /sys/module/dmatest/parameters/run
[383493.327077] Unable to handle kernel paging request at virtual
address dead000000000108
[383493.335103] Mem abort info:
[383493.335103] ESR = 0x96000044
[383493.335105] EC = 0x25: DABT (current EL), IL = 32 bits
[383493.335107] SET = 0, FnV = 0
[383493.335108] EA = 0, S1PTW = 0
[383493.335109] FSC = 0x04: level 0 translation fault
[383493.335110] Data abort info:
[383493.335111] ISV = 0, ISS = 0x00000044
[383493.364739] CM = 0, WnR = 1
[383493.367793] [dead000000000108] address between user and kernel
address ranges
[383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump:
loaded Tainted: GO 5.17.0-rc4+ #2
[383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT
-SSBS BTYPE=--)
[383493.465331] pc : vchan_tx_submit+0x64/0xa0 [383493.469957] lr :
vchan_tx_submit+0x34/0xa0

This happens because of data race. Each thread rewrite channels's
descriptor as soon as device_issue_pending is called. It leads to the
situation that the driver thinks that it uses the right descriptor in
interrupt handler while channels's descriptor has been changed by
other thread.

With current fixes channels's descriptor changes it's value only when
it has been used. A new descriptor is acquired from
vc->desc_issued queue that is already filled with descriptors
that are ready to be sent. Threads have no direct access to DMA
channel descriptor. Now it is just possible to queue a descriptor for
further processing.

Fixes: e9f08b65250d ("dmaengine: hisilicon: Add Kunpeng DMA engine
support")
Signed-off-by: Jie Hai <haijie1@xxxxxxxxxx>
---
drivers/dma/hisi_dma.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/hisi_dma.c b/drivers/dma/hisi_dma.c index
0a0f8a4d168a..0385419be8d5 100644
--- a/drivers/dma/hisi_dma.c
+++ b/drivers/dma/hisi_dma.c
@@ -271,7 +271,6 @@ static void hisi_dma_start_transfer(struct
hisi_dma_chan *chan)
vd = vchan_next_desc(&chan->vc);
if (!vd) {
- dev_err(&hdma_dev->pdev->dev, "no issued task!\n");
how is this a fix?

With current fixes, hisi_dma_start_transfer may be called twice for one desc.

If channel's desc is NULL, When hisi_dma_issue_pending
chan->desc = NULL;
return;
}
@@ -303,7 +302,7 @@ static void hisi_dma_issue_pending(struct dma_chan
*c)
spin_lock_irqsave(&chan->vc.lock, flags);
- if (vchan_issue_pending(&chan->vc))
+ if (vchan_issue_pending(&chan->vc) && !chan->desc)
This looks good

hisi_dma_start_transfer(chan);
spin_unlock_irqrestore(&chan->vc.lock, flags); @@ -442,11 +441,10 @@
static irqreturn_t hisi_dma_irq(int irq, void *data)
chan->qp_num, chan->cq_head);
if (FIELD_GET(STATUS_MASK, cqe->w0) == STATUS_SUCC) {
vchan_cookie_complete(&desc->vd);
+ hisi_dma_start_transfer(chan);
Why should this fix the error reported?

} else {
dev_err(&hdma_dev->pdev->dev, "task error!\n");
}
-
- chan->desc = NULL;
}
spin_unlock(&chan->vc.lock);
--
2.33.0
--
~Vinod
.