RE: [PATCH] spi: spi-geni-qcom: Add support for SE DMA mode

From: Vijaya Krishna Nivarthi (Temp) (QUIC)
Date: Mon Nov 21 2022 - 09:24:00 EST


Hi,


> -----Original Message-----
> From: Doug Anderson <dianders@xxxxxxxxxxxx>
> Sent: Friday, November 18, 2022 5:00 AM
> To: Vijaya Krishna Nivarthi (Temp) (QUIC) <quic_vnivarth@xxxxxxxxxxx>
> Cc: agross@xxxxxxxxxx; andersson@xxxxxxxxxx; konrad.dybcio@xxxxxxxxxx;
> broonie@xxxxxxxxxx; linux-arm-msm@xxxxxxxxxxxxxxx; linux-
> spi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Mukesh Savaliya (QUIC)
> <quic_msavaliy@xxxxxxxxxxx>; mka@xxxxxxxxxxxx;
> swboyd@xxxxxxxxxxxx; Visweswara Tanuku (QUIC)
> <quic_vtanuku@xxxxxxxxxxx>; vkoul@xxxxxxxxxx
> Subject: Re: [PATCH] spi: spi-geni-qcom: Add support for SE DMA mode
>
> WARNING: This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
>
> Hi,
>
> On Wed, Nov 16, 2022 at 1:15 AM Vijaya Krishna Nivarthi
> <quic_vnivarth@xxxxxxxxxxx> wrote:
> >
> > @@ -95,6 +97,7 @@ struct spi_geni_master {
> > struct dma_chan *tx;
> > struct dma_chan *rx;
> > int cur_xfer_mode;
> > + u32 cur_m_cmd;
>
> I don't think you need to store "cur_m_cmd". The only thing you do is check
> for the SPI_TX_ONLY / SPI_RX_ONLY bits and instead you can just check if
> cur_xfer->tx_buf / cur_xfer->rx_buf are NULL.
>
>

Please note that cur_xfer can be NULL.
Added further to comments.

> > @@ -129,23 +132,27 @@ static int get_spi_clk_cfg(unsigned int speed_hz,
> > return ret;
> > }
> >
> > -static void handle_fifo_timeout(struct spi_master *spi,
> > +static void handle_se_timeout(struct spi_master *spi,
> > struct spi_message *msg) {
> > struct spi_geni_master *mas = spi_master_get_devdata(spi);
> > unsigned long time_left;
> > struct geni_se *se = &mas->se;
> > + const struct spi_transfer *xfer;
> >
> > spin_lock_irq(&mas->lock);
> > reinit_completion(&mas->cancel_done);
> > - writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
> > + if (mas->cur_xfer_mode == GENI_SE_FIFO)
> > + writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
> > + if (mas->cur_xfer_mode == GENI_SE_DMA)
> > + xfer = mas->cur_xfer;
>
> You could probably just get rid of the "if" test and always store
> xfer. You'll only use it below if you're in GENI_SE_DMA but you might
> as well save the "if" test...
>
>

Done.

> > @@ -162,6 +169,44 @@ static void handle_fifo_timeout(struct spi_master
> *spi,
> > */
> > mas->abort_failed = true;
> > }
> > +
> > +unmap_if_dma:
> > + if (mas->cur_xfer_mode == GENI_SE_DMA) {
> > + if (xfer) {
> > + if (xfer->tx_buf && xfer->tx_dma)
> > + geni_se_tx_dma_unprep(se, xfer->tx_dma, xfer->len);
> > + if (xfer->rx_buf && xfer->rx_dma)
> > + geni_se_rx_dma_unprep(se, xfer->rx_dma, xfer->len);
> > + } else {
> > + /*
> > + * This can happen if a timeout happened and we had to wait
> > + * for lock in this function because isr was holding the lock
> > + * and handling transfer completion at that time.
> > + * Unnecessary error but cannot be helped.
> > + * Only do reset, dma_unprep is already done by isr.
> > + */
> > + dev_err(mas->dev, "Cancel/Abort on completed SPI
> transfer\n");
> > + }
> > +
> > + if (mas->cur_m_cmd & SPI_TX_ONLY) {
> > + spin_lock_irq(&mas->lock);
> > + reinit_completion(&mas->tx_reset_done);
> > + writel_relaxed(1, se->base + SE_DMA_TX_FSM_RST);
>
> Why "relaxed"? In general the "relaxed" variants should only be used
> in cases where we are reasonably certain we're in a critical path and
> the regular writel() should be the default.
>
>

Done. Thank you.

> > + spin_unlock_irq(&mas->lock);
> > + time_left = wait_for_completion_timeout(&mas-
> >tx_reset_done, HZ);
> > + if (!time_left)
> > + dev_err(mas->dev, "DMA TX RESET failed\n");
> > + }
> > + if (mas->cur_m_cmd & SPI_RX_ONLY) {
> > + spin_lock_irq(&mas->lock);
> > + reinit_completion(&mas->rx_reset_done);
> > + writel_relaxed(1, se->base + SE_DMA_RX_FSM_RST);
>
> Why "relaxed"?
>
>

Done.


> > + spin_unlock_irq(&mas->lock);
> > + time_left = wait_for_completion_timeout(&mas-
> >rx_reset_done, HZ);
> > + if (!time_left)
> > + dev_err(mas->dev, "DMA RX RESET failed\n");
> > + }
>
> Comparing how the i2c driver does "se DMA", I notice that it does the
> FSM reset _before_ calling geni_se_tx_dma_unprep() /
> geni_se_rx_dma_unprep(). You do it in the opposite order. Are both
> orders really OK?
>
>

Swapped the order to be consistent.
Earlier one worked fine too but agree that this is correct.

> > @@ -482,8 +528,11 @@ static bool geni_can_dma(struct spi_controller
> *ctlr,
> > {
> > struct spi_geni_master *mas = spi_master_get_devdata(slv->master);
> >
> > - /* check if dma is supported */
> > - return mas->cur_xfer_mode != GENI_SE_FIFO;
> > + /*
> > + * return true if transfer needs to be mapped prior to
> > + * calling transfer_one which is the case only for GPI_DMA
> > + */
> > + return mas->cur_xfer_mode == GENI_GPI_DMA;
>
> Comments should say _why_ we don't need the transfer to be mapped by
> the SPI framework for SE_DMA? This is because geni_se_rx_dma_prep() /
> geni_se_tx_dma_prep() does it for you?
>
>

Added comments.


> > @@ -494,6 +543,7 @@ static int spi_geni_prepare_message(struct
> spi_master *spi,
> >
> > switch (mas->cur_xfer_mode) {
> > case GENI_SE_FIFO:
> > + case GENI_SE_DMA:
> > if (spi_geni_is_abort_still_pending(mas))
> > return -EBUSY;
> > ret = setup_fifo_params(spi_msg->spi, spi);
> > @@ -604,8 +654,8 @@ static int spi_geni_init(struct spi_geni_master *mas)
> > fallthrough;
>
> Right above this line there's a warning that says: "FIFO mode
> disabled, but couldn't get DMA, fall back to FIFO mode". It was never
> clear to me if that truly worked. ...but now I wonder even more. If it
> was OK to fall back to FIFO mode when GPI failed to init, is it also
> OK to fall back to SE_DMA mode in that case?
>
>

I vaguely remember from last time while working with gpi mode that fall through wasn’t working.
But if falling through to FIFO mode works, falling through to DMA should work too.

Can we track this separately with a bug for gpi mode?
Like "Test and bring up fallthrough from GPI to SE"?


> > case 0:
> > - mas->cur_xfer_mode = GENI_SE_FIFO;
> > - geni_se_select_mode(se, GENI_SE_FIFO);
> > + mas->cur_xfer_mode = GENI_SE_DMA;
> > + geni_se_select_mode(se, GENI_SE_DMA);
>
> Do we even need the above two lines? Don't we change the mode between
> FIFO/DMA each time based on the transfer size?
>
>

Agreed that this change is not required.
Maybe its ok to keep it anyway for Initialisation to default DMA?


> > @@ -716,14 +766,14 @@ static void geni_spi_handle_rx(struct
> spi_geni_master *mas)
> > mas->rx_rem_bytes -= rx_bytes;
> > }
> >
> > -static void setup_fifo_xfer(struct spi_transfer *xfer,
> > +static int setup_se_xfer(struct spi_transfer *xfer,
> > struct spi_geni_master *mas,
> > u16 mode, struct spi_master *spi)
> > {
> > u32 m_cmd = 0;
> > - u32 len;
> > + u32 len, fifo_size = 0;
>
> You don't need to initialize "fifo_size".
>
>

Removed.

> > struct geni_se *se = &mas->se;
> > - int ret;
> > + int ret = 0;
>
> You don't need to initialize "ret".
>
>

Removed.


> > @@ -771,6 +821,13 @@ static void setup_fifo_xfer(struct spi_transfer
> *xfer,
> > writel(len, se->base + SE_SPI_RX_TRANS_LEN);
> > mas->rx_rem_bytes = xfer->len;
> > }
> > + mas->cur_m_cmd = m_cmd;
> > +
> > + /* Select transfer mode based on transfer length */
> > + fifo_size =
> > + (mas->tx_fifo_depth * mas->fifo_width_bits / mas-
> >cur_bits_per_word);
>
> Parentheses here are unnecessary.
>
>

Removed.

> > @@ -778,11 +835,36 @@ static void setup_fifo_xfer(struct spi_transfer
> *xfer,
> > */
> > spin_lock_irq(&mas->lock);
> > geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
> > - if (m_cmd & SPI_TX_ONLY) {
> > +
> > + if (mas->cur_xfer_mode == GENI_SE_DMA) {
> > + if (m_cmd & SPI_RX_ONLY) {
> > + ret = geni_se_rx_dma_prep(se, xfer->rx_buf,
> > + xfer->len, &xfer->rx_dma);
>
> Is it truly legitimate to use "xfer->rx_dma" for your own purposes?
> That's supposed to be something populated by the SPI framework if
> can_dma() returns true and I'd be a bit nervous about using it. Are we
> sure we shouldn't use the SPI framework to handle our mapping?
>
>

I had this inkling too but couldn’t think of a scenario when it can go wrong.
Also, geni_se_functions were already doing the mapping, so it seemed simpler to make use of same.
Options:-
a) use SPI framework to handle mapping instead of geni_se for SE_DMA, similar to GPI_DMA(could be more involved change)
b) continue using geni_se for mapping, add tx/rx_dma buffers pointers to "struct spi_geni_master" and use them instead ones in xfer

Once again, Can we track this with a bug if not critical?


> > + if (ret || !xfer->rx_buf) {
>
> Remove the useless test for "xfer->rx_buf". The only way the
> SPI_RX_ONLY bit could be set is if rx_buf was non-NULL and that test
> was only a few lines earlier.
>
>

Removed.

> > + dev_err(mas->dev, "Failed to setup Rx dma %d\n", ret);
> > + xfer->rx_dma = 0;
> > + goto unlock_and_return;
> > + }
> > + }
> > + if (m_cmd & SPI_TX_ONLY) {
> > + ret = geni_se_tx_dma_prep(se, (void *)xfer->tx_buf,
>
> Why the extra cast to "void *". You don't need it and don't have it on
> the rx side.
>
>

Cast is required because tx_buf is const void*

const void *tx_buf;
void *rx_buf;

> > + xfer->len, &xfer->tx_dma);
> > + if (ret || !xfer->tx_buf) {
>
> Remove the useless test for "xfer->tx_buf"
>
>

Done.

> > + dev_err(mas->dev, "Failed to setup Tx dma %d\n", ret);
> > + xfer->tx_dma = 0;
> > + goto unlock_and_return;
>
> Don't you need to undo geni_se_rx_dma_prep() if this is a full duplex
> transfer? ...and I guess clear xfer->rx_dma? Is there anything we need
> to do to undo the geni_se_setup_m_cmd() ?
>
>

Added unmap of rx, thank you.
No further undo required as rest of flow is same as FIFO.

> > + }
>
> Just out of curiosity, what exactly kicks off the transfer in the DMA
> case. I guess it knows if it's a TX, RX, or duplex transfer and
> magically kicks off when one or both of them are prepped?
>
>

The geni_se_*x_dma_prep functions prep the dma buffers and write them to DMA registers.
Excerpt from rx
*iova = dma_map_single(wrapper->dev, buf, len, DMA_FROM_DEVICE);
...
writel_relaxed(lower_32_bits(*iova), se->base + SE_DMA_RX_PTR_L);
writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_RX_PTR_H);
...
writel(len, se->base + SE_DMA_RX_LEN);
Did I get your question correct?

> > + }
> > + } else if (m_cmd & SPI_TX_ONLY) {
> > if (geni_spi_handle_tx(mas))
> > writel(mas->tx_wm, se->base +
> SE_GENI_TX_WATERMARK_REG);
> > }
> > +
> > +unlock_and_return:
> > spin_unlock_irq(&mas->lock);
> > + if (!ret)
> > + ret = 1;
>
> Move this extra "if (!ret) ret = 1;" to spi_geni_transfer_one() and
> add a comment explaining it (it's part of the API that the SPI
> framework expects if you're letting it wait for the transfer to
> complete or call the timeout function).
>
>

Done.


> > @@ -816,46 +899,73 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
> > if (!m_irq)
> > return IRQ_NONE;
> >
> > - if (m_irq & (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN |
> M_CMD_FAILURE_EN |
> > - M_RX_FIFO_RD_ERR_EN | M_RX_FIFO_WR_ERR_EN |
> > - M_TX_FIFO_RD_ERR_EN | M_TX_FIFO_WR_ERR_EN))
> > - dev_warn(mas->dev, "Unexpected IRQ err status %#010x\n",
> m_irq);
> > -
>
> Could still leave the above check/warning even for SE_DMA mode, can't
> you? Then you can keep it outside the spinlock. Is there some reason
> that those errors are more impossible for SE_DMA than they are for
> FIFO?
>
>

Agree. Done.

Thank you,
Vijay/


> > spin_lock(&mas->lock);
> >
> > - if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq &
> M_RX_FIFO_LAST_EN))
> > - geni_spi_handle_rx(mas);
> > -
> > - if (m_irq & M_TX_FIFO_WATERMARK_EN)
> > - geni_spi_handle_tx(mas);
> > -
> > - if (m_irq & M_CMD_DONE_EN) {
> > - if (mas->cur_xfer) {
> > + if (mas->cur_xfer_mode == GENI_SE_FIFO) {
> > + if (m_irq & (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN |
> M_CMD_FAILURE_EN |
> > + M_RX_FIFO_RD_ERR_EN | M_RX_FIFO_WR_ERR_EN |
> > + M_TX_FIFO_RD_ERR_EN | M_TX_FIFO_WR_ERR_EN))
> > + dev_warn(mas->dev, "Unexpected IRQ err status %#010x\n",
> m_irq);
> > +
> > + if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq &
> M_RX_FIFO_LAST_EN))
> > + geni_spi_handle_rx(mas);
> > +
> > + if (m_irq & M_TX_FIFO_WATERMARK_EN)
> > + geni_spi_handle_tx(mas);
> > +
> > + if (m_irq & M_CMD_DONE_EN) {
> > + if (mas->cur_xfer) {
> > + spi_finalize_current_transfer(spi);
> > + mas->cur_xfer = NULL;
> > + /*
> > + * If this happens, then a CMD_DONE came before all the
> > + * Tx buffer bytes were sent out. This is unusual, log
> > + * this condition and disable the WM interrupt to
> > + * prevent the system from stalling due an interrupt
> > + * storm.
> > + *
> > + * If this happens when all Rx bytes haven't been
> > + * received, log the condition. The only known time
> > + * this can happen is if bits_per_word != 8 and some
> > + * registers that expect xfer lengths in num spi_words
> > + * weren't written correctly.
> > + */
> > + if (mas->tx_rem_bytes) {
> > + writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
> > + dev_err(mas->dev, "Premature done. tx_rem = %d
> bpw%d\n",
> > + mas->tx_rem_bytes, mas->cur_bits_per_word);
> > + }
> > + if (mas->rx_rem_bytes)
> > + dev_err(mas->dev, "Premature done. rx_rem = %d
> bpw%d\n",
> > + mas->rx_rem_bytes, mas->cur_bits_per_word);
> > + } else {
> > + complete(&mas->cs_done);
> > + }
> > + }
> > + } else if (mas->cur_xfer_mode == GENI_SE_DMA) {
> > + const struct spi_transfer *xfer = mas->cur_xfer;
> > + u32 dma_tx_status = readl_relaxed(se->base +
> SE_DMA_TX_IRQ_STAT);
> > + u32 dma_rx_status = readl_relaxed(se->base +
> SE_DMA_RX_IRQ_STAT);
>
> The above makes me nervous that a full duplex transfer could finish
> right between the two register reads and something would be handled
> incorrectly. ...but it actually looks OK. In that case we'll just get
> another interrupt right after this one and finish everything. OK, no
> action needed just documenting my thoughts.
>
> -Doug