Re: [PATCH V8 3/5] i2c: tegra: Add DMA Support

From: Dmitry Osipenko
Date: Thu Jan 31 2019 - 19:52:29 EST


On Thu, 31 Jan 2019 13:44:23 +0100
Thierry Reding <thierry.reding@xxxxxxxxx> wrote:

> On Wed, Jan 30, 2019 at 10:16:25PM -0800, Sowjanya Komatineni wrote:
> > This patch adds DMA support for Tegra I2C.
> >
> > Tegra I2C TX and RX FIFO depth is 8 words. PIO mode is used for
> > transfer size of the max FIFO depth and DMA mode is used for
> > transfer size higher than max FIFO depth to save CPU overhead.
> >
> > PIO mode needs full intervention of CPU to fill or empty FIFO's
> > and also need to service multiple data requests interrupt for the
> > same transaction. This adds delay between data bytes of the same
> > transfer when CPU is fully loaded and some slave devices has
> > internal timeout for no bus activity and stops transaction to
> > avoid bus hang. DMA mode is helpful in such cases.
> >
> > DMA mode is also helpful for Large transfers during downloading or
> > uploading FW over I2C to some external devices.
> >
> > Signed-off-by: Sowjanya Komatineni <skomatineni@xxxxxxxxxx>
> > ---
> > [V8] : Moved back dma init to i2c probe, removed
> > ALL_PACKETS_XFER_COMPLETE interrupt and using PACKETS_XFER_COMPLETE
> > interrupt only and some other fixes
> > Updated Kconfig for APB_DMA dependency
> > [V7] : Same as V6
> > [V6] : Updated for proper buffer allocation/freeing, channel
> > release. Updated to use exact xfer size for syncing dma buffer.
> > [V5] : Same as V4
> > [V4] : Updated to allocate DMA buffer only when DMA mode.
> > Updated to fall back to PIO mode when DMA channel request or
> > buffer allocation fails.
> > [V3] : Updated without additional buffer allocation.
> > [V2] : Updated based on V1 review feedback along with code cleanup
> > for proper implementation of DMA.
> >
> > drivers/i2c/busses/Kconfig | 2 +-
> > drivers/i2c/busses/i2c-tegra.c | 362
> > ++++++++++++++++++++++++++++++++++++++--- 2 files changed, 339
> > insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
> > index f2c681971201..046aeb92a467 100644
> > --- a/drivers/i2c/busses/Kconfig
> > +++ b/drivers/i2c/busses/Kconfig
> > @@ -1016,7 +1016,7 @@ config I2C_SYNQUACER
> >
> > config I2C_TEGRA
> > tristate "NVIDIA Tegra internal I2C controller"
> > - depends on ARCH_TEGRA
> > + depends on (ARCH_TEGRA && TEGRA20_APB_DMA)
>
> Like I said in my reply in the v7 subthread, I don't think we want
> this. The dependency that we have is on the DMA engine API, not the
> APB DMA driver.
>
> Technically there could be a runtime problem if the APB DMA driver is
> disabled and we list a "dmas" property. If I understand correctly, the
> DMA engine API would always return -EPROBE_DEFER in that case. That's
> somewhat annoying, but I think that's fine because it points at an
> integration issue. It lets you know that the driver is relying on a
> resource that is not showing up, which usually means that either the
> provider's driver is not enabled or the provider is failing to probe.
>
> > help
> > If you say yes to this option, support will be included
> > for the I2C controller embedded in NVIDIA Tegra SOCs
> > diff --git a/drivers/i2c/busses/i2c-tegra.c
> > b/drivers/i2c/busses/i2c-tegra.c index c4892a47a483..025d63972e50
> > 100644 --- a/drivers/i2c/busses/i2c-tegra.c
> > +++ b/drivers/i2c/busses/i2c-tegra.c
> > @@ -8,6 +8,9 @@
> >
> > #include <linux/clk.h>
> > #include <linux/delay.h>
> > +#include <linux/dmaengine.h>
> > +#include <linux/dmapool.h>
> > +#include <linux/dma-mapping.h>
> > #include <linux/err.h>
> > #include <linux/i2c.h>
> > #include <linux/init.h>
> > @@ -44,6 +47,8 @@
> > #define I2C_FIFO_CONTROL_RX_FLUSH BIT(0)
> > #define I2C_FIFO_CONTROL_TX_TRIG_SHIFT 5
> > #define I2C_FIFO_CONTROL_RX_TRIG_SHIFT 2
> > +#define I2C_FIFO_CONTROL_TX_TRIG(x) (((x) - 1) << 5)
> > +#define I2C_FIFO_CONTROL_RX_TRIG(x) (((x) - 1) << 2)
> > #define I2C_FIFO_STATUS 0x060
> > #define I2C_FIFO_STATUS_TX_MASK 0xF0
> > #define I2C_FIFO_STATUS_TX_SHIFT 4
> > @@ -125,6 +130,19 @@
> > #define I2C_MST_FIFO_STATUS_TX_MASK 0xff0000
> > #define I2C_MST_FIFO_STATUS_TX_SHIFT 16
> >
> > +/* Packet header size in bytes */
> > +#define I2C_PACKET_HEADER_SIZE 12
> > +
> > +#define DATA_DMA_DIR_TX (1 << 0)
> > +#define DATA_DMA_DIR_RX (1 << 1)
> > +
> > +/*
> > + * Upto I2C_PIO_MODE_MAX_LEN bytes, controller will use PIO mode,
> > + * above this, controller will use DMA to fill FIFO.
> > + * MAX PIO len is 20 bytes excluding packet header.
> > + */
> > +#define I2C_PIO_MODE_MAX_LEN 32
> > +
> > /*
> > * msg_end_type: The bus control which need to be send at end of
> > transfer.
> > * @MSG_END_STOP: Send stop pulse at end of transfer.
> > @@ -188,6 +206,7 @@ struct tegra_i2c_hw_feature {
> > * @fast_clk: clock reference for fast clock of I2C controller
> > * @rst: reset control for the I2C controller
> > * @base: ioremapped registers cookie
> > + * @base_phys: Physical base address of the I2C controller
> > * @cont_id: I2C controller ID, used for packet header
> > * @irq: IRQ number of transfer complete interrupt
> > * @irq_disabled: used to track whether or not the interrupt is
> > enabled @@ -201,6 +220,14 @@ struct tegra_i2c_hw_feature {
> > * @clk_divisor_non_hs_mode: clock divider for non-high-speed modes
> > * @is_multimaster_mode: track if I2C controller is in
> > multi-master mode
> > * @xfer_lock: lock to serialize transfer submission and processing
> > + * @has_dma: indicates if DMA can be utilized based on dma DT
> > bindings
>
> I don't think we need this. We can just rely on the DMA engine API to
> tell us if the "dmas" property isn't there.
>
> > + * @tx_dma_chan: DMA transmit channel
> > + * @rx_dma_chan: DMA receive channel
> > + * @dma_phys: handle to DMA resources
> > + * @dma_buf: pointer to allocated DMA buffer
> > + * @dma_buf_size: DMA buffer size
> > + * @is_curr_dma_xfer: indicates active DMA transfer
> > + * @dma_complete: DMA completion notifier
> > */
> > struct tegra_i2c_dev {
> > struct device *dev;
> > @@ -210,6 +237,7 @@ struct tegra_i2c_dev {
> > struct clk *fast_clk;
> > struct reset_control *rst;
> > void __iomem *base;
> > + phys_addr_t base_phys;
> > int cont_id;
> > int irq;
> > bool irq_disabled;
> > @@ -223,6 +251,14 @@ struct tegra_i2c_dev {
> > u16 clk_divisor_non_hs_mode;
> > bool is_multimaster_mode;
> > spinlock_t xfer_lock;
> > + bool has_dma;
> > + struct dma_chan *tx_dma_chan;
> > + struct dma_chan *rx_dma_chan;
> > + dma_addr_t dma_phys;
> > + u32 *dma_buf;
> > + unsigned int dma_buf_size;
> > + bool is_curr_dma_xfer;
> > + struct completion dma_complete;
> > };
> >
> > static void dvc_writel(struct tegra_i2c_dev *i2c_dev, u32 val,
> > @@ -291,6 +327,85 @@ static void tegra_i2c_unmask_irq(struct
> > tegra_i2c_dev *i2c_dev, u32 mask) i2c_writel(i2c_dev, int_mask,
> > I2C_INT_MASK); }
> >
> > +static void tegra_i2c_dma_complete(void *args)
> > +{
> > + struct tegra_i2c_dev *i2c_dev = args;
> > +
> > + complete(&i2c_dev->dma_complete);
> > +}
> > +
> > +static int tegra_i2c_dma_submit(struct tegra_i2c_dev *i2c_dev,
> > size_t len) +{
> > + struct dma_async_tx_descriptor *dma_desc;
> > + enum dma_transfer_direction dir;
> > + struct dma_chan *chan;
> > +
> > + dev_dbg(i2c_dev->dev, "starting DMA for length: %zu\n",
> > len);
> > + reinit_completion(&i2c_dev->dma_complete);
> > + dir = i2c_dev->msg_read ? DMA_DEV_TO_MEM : DMA_MEM_TO_DEV;
> > + chan = i2c_dev->msg_read ? i2c_dev->rx_dma_chan :
> > i2c_dev->tx_dma_chan;
> > + dma_desc = dmaengine_prep_slave_single(chan,
> > i2c_dev->dma_phys,
> > + len, dir,
> > DMA_PREP_INTERRUPT |
> > + DMA_CTRL_ACK);
> > + if (!dma_desc) {
> > + dev_err(i2c_dev->dev, "failed to get DMA
> > descriptor\n");
> > + return -EIO;
> > + }
> > +
> > + dma_desc->callback = tegra_i2c_dma_complete;
> > + dma_desc->callback_param = i2c_dev;
> > + dmaengine_submit(dma_desc);
> > + dma_async_issue_pending(chan);
> > + return 0;
> > +}
> > +
> > +static int tegra_i2c_init_dma_param(struct tegra_i2c_dev *i2c_dev)
> > +{
> > + struct dma_chan *dma_chan;
> > + u32 *dma_buf;
> > + dma_addr_t dma_phys;
> > +
> > + if (!i2c_dev->has_dma)
> > + return -EINVAL;
> > +
> > + if (!i2c_dev->rx_dma_chan) {
> > + dma_chan =
> > dma_request_slave_channel_reason(i2c_dev->dev, "rx");
> > + if (IS_ERR(dma_chan))
> > + return PTR_ERR(dma_chan);
>
> I think we want to fallback to PIO here if dma_chan is -ENODEV.
>
> > +
> > + i2c_dev->rx_dma_chan = dma_chan;
> > + }
> > +
> > + if (!i2c_dev->tx_dma_chan) {
> > + dma_chan =
> > dma_request_slave_channel_reason(i2c_dev->dev, "tx");
> > + if (IS_ERR(dma_chan))
> > + return PTR_ERR(dma_chan);
>
> Same here. We could use rx_dma_chan == NULL as a condition to detect
> that instead of the extra has_dma.
>
> > + i2c_dev->tx_dma_chan = dma_chan;
> > + }
>
> Although, I'm not exactly sure I understand what you're trying to
> achieve here. Shouldn't we move the channel request parts into probe
> and remove them from here? Otherwise it seems like we could get into
> a state where we keep trying to get the slave channels everytime we
> set up a DMA transfer, even if we already failed to do so during
> probe.
>
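
A minimal userspace sketch of the policy suggested above, as I read it:
request the channels once at probe, fall back to PIO on -ENODEV, and let
anything else (including -EPROBE_DEFER) propagate. pick_xfer_mode() and
enum xfer_mode are hypothetical names, not part of the patch, and
EPROBE_DEFER is defined locally because userspace headers don't export it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Kernel-internal errno value, copied here only for this model. */
#define EPROBE_DEFER 517

enum xfer_mode { XFER_PIO, XFER_DMA };

/*
 * Hypothetical helper: chan_err is 0 if the channel request succeeded,
 * or the negative errno it failed with. Returns 0 and sets *mode when
 * the driver can carry on, or the errno that probe should return.
 */
static int pick_xfer_mode(int chan_err, enum xfer_mode *mode)
{
	assert(mode != NULL);

	if (chan_err == 0) {
		*mode = XFER_DMA;
		return 0;
	}

	/* No DMA channel described for this controller: PIO only. */
	if (chan_err == -ENODEV) {
		*mode = XFER_PIO;
		return 0;
	}

	/* -EPROBE_DEFER and everything else is passed up to the caller. */
	return chan_err;
}
```

With this shape the per-transfer path only has to look at the mode chosen
at probe time and never re-requests the channels.
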
> > +
> > + if (!i2c_dev->dma_buf && i2c_dev->msg_buf_remaining) {
> > + dma_buf = dma_alloc_coherent(i2c_dev->dev,
> > + i2c_dev->dma_buf_size,
> > + &dma_phys,
> > GFP_KERNEL); +
> > + if (!dma_buf) {
> > + dev_err(i2c_dev->dev,
> > + "failed to allocate the DMA
> > buffer\n");
> > + dma_release_channel(i2c_dev->tx_dma_chan);
> > + dma_release_channel(i2c_dev->rx_dma_chan);
> > + i2c_dev->tx_dma_chan = NULL;
> > + i2c_dev->rx_dma_chan = NULL;
> > + return -ENOMEM;
> > + }
> > +
> > + i2c_dev->dma_buf = dma_buf;
> > + i2c_dev->dma_phys = dma_phys;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > static int tegra_i2c_flush_fifos(struct tegra_i2c_dev *i2c_dev)
> > {
> > unsigned long timeout = jiffies + HZ;
> > @@ -656,25 +771,38 @@ static irqreturn_t tegra_i2c_isr(int irq,
> > void *dev_id) if (i2c_dev->hw->supports_bus_clear && (status &
> > I2C_INT_BUS_CLR_DONE)) goto err;
> >
> > - if (i2c_dev->msg_read && (status &
> > I2C_INT_RX_FIFO_DATA_REQ)) {
> > - if (i2c_dev->msg_buf_remaining)
> > - tegra_i2c_empty_rx_fifo(i2c_dev);
> > - else
> > - BUG();
> > - }
> > + if (!i2c_dev->is_curr_dma_xfer) {
> > + if (i2c_dev->msg_read && (status &
> > I2C_INT_RX_FIFO_DATA_REQ)) {
> > + if (i2c_dev->msg_buf_remaining)
> > + tegra_i2c_empty_rx_fifo(i2c_dev);
> > + else
> > + BUG();
> > + }
> >
> > - if (!i2c_dev->msg_read && (status &
> > I2C_INT_TX_FIFO_DATA_REQ)) {
> > - if (i2c_dev->msg_buf_remaining)
> > - tegra_i2c_fill_tx_fifo(i2c_dev);
> > - else
> > - tegra_i2c_mask_irq(i2c_dev,
> > I2C_INT_TX_FIFO_DATA_REQ);
> > + if (!i2c_dev->msg_read &&
> > + (status & I2C_INT_TX_FIFO_DATA_REQ)) {
> > + if (i2c_dev->msg_buf_remaining)
> > + tegra_i2c_fill_tx_fifo(i2c_dev);
> > + else
> > + tegra_i2c_mask_irq(i2c_dev,
> > +
> > I2C_INT_TX_FIFO_DATA_REQ);
> > + }
> > }
> >
> > i2c_writel(i2c_dev, status, I2C_INT_STATUS);
> > if (i2c_dev->is_dvc)
> > dvc_writel(i2c_dev, DVC_STATUS_I2C_DONE_INTR,
> > DVC_STATUS);
> > + /*
> > + * During message read XFER_COMPLETE interrupt is
> > triggered prior to
> > + * DMA completion and during message write XFER_COMPLETE
> > interrupt is
> > + * triggered after DMA completion.
> > + * PACKETS_XFER_COMPLETE indicates completion of all bytes
> > of transfer.
> > + * so forcing msg_buf_remaining to 0 in DMA mode.
> > + */
> > if (status & I2C_INT_PACKET_XFER_COMPLETE) {
> > + if (i2c_dev->is_curr_dma_xfer)
> > + i2c_dev->msg_buf_remaining = 0;
> > BUG_ON(i2c_dev->msg_buf_remaining);
> > complete(&i2c_dev->msg_complete);
> > }
> > @@ -690,12 +818,69 @@ static irqreturn_t tegra_i2c_isr(int irq,
> > void *dev_id) if (i2c_dev->is_dvc)
> > dvc_writel(i2c_dev, DVC_STATUS_I2C_DONE_INTR,
> > DVC_STATUS);
> > + if (i2c_dev->is_curr_dma_xfer) {
> > + if (i2c_dev->msg_read)
> > +
> > dmaengine_terminate_all(i2c_dev->rx_dma_chan);
> > + else
> > +
> > dmaengine_terminate_all(i2c_dev->tx_dma_chan); +
> > + complete(&i2c_dev->dma_complete);
> > + }
> > +
> > complete(&i2c_dev->msg_complete);
> > done:
> > spin_unlock(&i2c_dev->xfer_lock);
> > return IRQ_HANDLED;
> > }
> >
> > +static void tegra_i2c_config_fifo_trig(struct tegra_i2c_dev
> > *i2c_dev,
> > + size_t len, int direction)
> > +{
> > + u32 val, reg;
> > + u8 dma_burst = 0;
> > + struct dma_slave_config dma_sconfig;
> > + struct dma_chan *chan;
> > +
> > + if (i2c_dev->hw->has_mst_fifo)
> > + reg = I2C_MST_FIFO_CONTROL;
> > + else
> > + reg = I2C_FIFO_CONTROL;
> > + val = i2c_readl(i2c_dev, reg);
> > +
> > + if (len & 0xF)
> > + dma_burst = 1;
> > + else if (len & 0x10)
> > + dma_burst = 4;
> > + else
> > + dma_burst = 8;
> > +
> > + if (direction == DATA_DMA_DIR_TX) {
> > + if (i2c_dev->hw->has_mst_fifo)
> > + val |=
> > I2C_MST_FIFO_CONTROL_TX_TRIG(dma_burst);
> > + else
> > + val |= I2C_FIFO_CONTROL_TX_TRIG(dma_burst);
> > + } else {
> > + if (i2c_dev->hw->has_mst_fifo)
> > + val |=
> > I2C_MST_FIFO_CONTROL_RX_TRIG(dma_burst);
> > + else
> > + val |= I2C_FIFO_CONTROL_RX_TRIG(dma_burst);
> > + }
> > + i2c_writel(i2c_dev, val, reg);
> > +
> > + if (direction == DATA_DMA_DIR_TX) {
> > + dma_sconfig.dst_addr = i2c_dev->base_phys +
> > I2C_TX_FIFO;
> > + dma_sconfig.dst_addr_width =
> > DMA_SLAVE_BUSWIDTH_4_BYTES;
> > + dma_sconfig.dst_maxburst = dma_burst;
> > + } else {
> > + dma_sconfig.src_addr = i2c_dev->base_phys +
> > I2C_RX_FIFO;
> > + dma_sconfig.src_addr_width =
> > DMA_SLAVE_BUSWIDTH_4_BYTES;
> > + dma_sconfig.src_maxburst = dma_burst;
> > + }
> > +
> > + chan = i2c_dev->msg_read ? i2c_dev->rx_dma_chan :
> > i2c_dev->tx_dma_chan;
> > + dmaengine_slave_config(chan, &dma_sconfig);
> > +}
> > +
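
For reference, a standalone sketch of what the burst selection in
tegra_i2c_config_fifo_trig() works out to. The helper names are made up
for illustration, and the word-alignment assert assumes the caller has
already rounded the length up to a 4-byte FIFO word, as the xfer_size
computation in this patch does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Burst size in FIFO words for a transfer of len bytes, mirroring the
 * if/else chain in the patch. len is assumed to be word-aligned.
 */
static unsigned int dma_burst_words(size_t len)
{
	assert(len % 4 == 0);

	if (len & 0xF)		/* not a multiple of 16 bytes (4 words) */
		return 1;
	if (len & 0x10)		/* multiple of 16 but not of 32 bytes */
		return 4;
	return 8;		/* multiple of 32 bytes (8 words) */
}

/* Same arithmetic as I2C_FIFO_CONTROL_TX_TRIG() in the patch. */
static unsigned int tx_trig_field(unsigned int words)
{
	return (words - 1) << 5;
}
```

So a 36-byte transfer gets a burst of 1 word, 48 bytes gets 4 and 64 bytes
gets 8, and the trigger field for an 8-word burst is 0xE0.
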
> > static int tegra_i2c_issue_bus_clear(struct tegra_i2c_dev *i2c_dev)
> > {
> > int err;
> > @@ -740,6 +925,11 @@ static int tegra_i2c_xfer_msg(struct
> > tegra_i2c_dev *i2c_dev, u32 int_mask;
> > unsigned long time_left;
> > unsigned long flags;
> > + size_t xfer_size;
> > + u32 *buffer = 0;
>
> Usually this should be = NULL for pointers.
>
> > + int err = 0;
> > + bool dma = false;
> > + struct dma_chan *chan;
> >
> > tegra_i2c_flush_fifos(i2c_dev);
> >
> > @@ -749,19 +939,69 @@ static int tegra_i2c_xfer_msg(struct
> > tegra_i2c_dev *i2c_dev, i2c_dev->msg_read = (msg->flags & I2C_M_RD);
> > reinit_completion(&i2c_dev->msg_complete);
> >
> > + if (i2c_dev->msg_read)
> > + xfer_size = msg->len;
> > + else
> > + xfer_size = msg->len + I2C_PACKET_HEADER_SIZE;
> > +
> > + xfer_size = ALIGN(xfer_size, BYTES_PER_FIFO_WORD);
> > + dma = (xfer_size > I2C_PIO_MODE_MAX_LEN);
> > + if (dma) {
> > + err = tegra_i2c_init_dma_param(i2c_dev);
> > + if (err < 0) {
> > + dev_dbg(i2c_dev->dev, "switching to PIO
> > transfer\n");
> > + dma = false;
> > + }
>
> If we successfully got DMA channels at probe time, doesn't this turn
> into an error condition that is worth reporting? It seems to me like
> the only reason it could fail is if we fail the allocation, but then
> again, why don't we move the DMA buffer allocation into probe()? We
> already use a fixed size for that allocation, so there's no reason it
> couldn't be allocated at probe time.
>
> Seems like maybe you just overlooked that as you were moving around
> the code pieces.
>
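
For reference, the PIO/DMA decision in the quoted hunk can be modelled in
userspace like this. It is only a sketch: xfer_size_for() and uses_dma()
are hypothetical names, ALIGN_UP() stands in for the kernel's ALIGN(), and
BYTES_PER_FIFO_WORD is assumed to be 4 as in the existing driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define I2C_PACKET_HEADER_SIZE	12	/* from the patch */
#define I2C_PIO_MODE_MAX_LEN	32	/* from the patch */
#define BYTES_PER_FIFO_WORD	4	/* assumed: existing driver define */

/* Round x up to a power-of-two boundary a, like the kernel's ALIGN(). */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Bytes moved through the FIFO: writes also carry the packet header. */
static size_t xfer_size_for(size_t msg_len, bool is_read)
{
	size_t size;

	/* The header encodes msg->len - 1, so zero length is not modelled. */
	assert(msg_len > 0);

	size = is_read ? msg_len : msg_len + I2C_PACKET_HEADER_SIZE;
	return ALIGN_UP(size, BYTES_PER_FIFO_WORD);
}

/* DMA is used only when the aligned size exceeds the PIO limit. */
static bool uses_dma(size_t msg_len, bool is_read)
{
	return xfer_size_for(msg_len, is_read) > I2C_PIO_MODE_MAX_LEN;
}
```

This matches the comment in the patch: a 20-byte write is 32 bytes with the
header and stays in PIO mode, while a 21-byte write rounds up to 36 bytes
and goes through DMA. Reads have no header in the buffer, so they switch at
33 bytes.
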
> > + }
> > +
> > + i2c_dev->is_curr_dma_xfer = dma;
> > spin_lock_irqsave(&i2c_dev->xfer_lock, flags);
> >
> > int_mask = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
> > tegra_i2c_unmask_irq(i2c_dev, int_mask);
> >
> > + if (dma) {
> > + if (i2c_dev->msg_read) {
> > + chan = i2c_dev->rx_dma_chan;
> > + tegra_i2c_config_fifo_trig(i2c_dev,
> > xfer_size,
> > +
> > DATA_DMA_DIR_RX);
> > + dma_sync_single_for_device(i2c_dev->dev,
> > +
> > i2c_dev->dma_phys,
> > + xfer_size,
> > +
> > DMA_FROM_DEVICE);
>
> Do we really need this? We're not actually passing the device any
> data, so no caches to flush here. If we're cautious about flushing
> caches when we do write to the buffer (and I think we do that
> properly already), then there should be no need to do it here again.
>

IIUC, the DMA API has a concept of buffer ownership handover: call
dma_sync_single_for_device() before issuing a hardware job that touches
the buffer, and dma_sync_single_for_cpu() after the hardware has finished
executing it. As a result, the CPU caches get flushed or invalidated as
appropriate.

dma_sync_single_for_device(DMA_FROM_DEVICE) invalidates the buffer in the
CPU cache, probably so that the CPU doesn't evict cached data to DRAM
while the hardware is writing to the buffer. Hence this hunk is correct.
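
To illustrate the ownership idea, here is a toy userspace model (not kernel
code). The model_* functions are made-up stand-ins for the two sync calls,
the hardware job and a CPU access; all they do is track who owns the buffer
and count accesses made by the wrong owner.

```c
#include <assert.h>
#include <stddef.h>

enum buf_owner { OWNER_CPU, OWNER_DEVICE };

struct model_buf {
	enum buf_owner owner;
	unsigned int violations;	/* accesses made by the non-owner */
};

/* Stand-in for dma_sync_single_for_device(): hand the buffer to HW. */
static void model_sync_for_device(struct model_buf *b)
{
	assert(b != NULL);
	b->owner = OWNER_DEVICE;
}

/* Stand-in for dma_sync_single_for_cpu(): hand the buffer back. */
static void model_sync_for_cpu(struct model_buf *b)
{
	assert(b != NULL);
	b->owner = OWNER_CPU;
}

/* The hardware job may only touch a buffer the device owns. */
static void model_hw_job(struct model_buf *b)
{
	if (b->owner != OWNER_DEVICE)
		b->violations++;
}

/* The CPU may only touch a buffer it owns. */
static void model_cpu_access(struct model_buf *b)
{
	if (b->owner != OWNER_CPU)
		b->violations++;
}

/* RX path in the order the patch uses: sync, DMA, sync, copy out. */
static unsigned int model_rx_flow(void)
{
	struct model_buf b = { OWNER_CPU, 0 };

	model_sync_for_device(&b);
	model_hw_job(&b);
	model_sync_for_cpu(&b);
	model_cpu_access(&b);
	return b.violations;
}

/* Same path with the first sync dropped. */
static unsigned int model_rx_flow_without_first_sync(void)
{
	struct model_buf b = { OWNER_CPU, 0 };

	model_hw_job(&b);
	model_sync_for_cpu(&b);
	model_cpu_access(&b);
	return b.violations;
}
```

In this model the patch's RX ordering produces no violations, while
dropping the dma_sync_single_for_device() step lets the hardware job run
on a buffer the CPU still owns.
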

> > + err = tegra_i2c_dma_submit(i2c_dev,
> > xfer_size);
> > + if (err < 0) {
> > + dev_err(i2c_dev->dev,
> > + "starting RX DMA failed,
> > err %d\n",
> > + err);
> > + goto unlock;
> > + }
> > + } else {
> > + chan = i2c_dev->tx_dma_chan;
> > + tegra_i2c_config_fifo_trig(i2c_dev,
> > xfer_size,
> > +
> > DATA_DMA_DIR_TX);
> > + dma_sync_single_for_cpu(i2c_dev->dev,
> > + i2c_dev->dma_phys,
> > + xfer_size,
> > + DMA_TO_DEVICE);
>
> This, on the other hand seems correct because we need to invalidate
> the caches for this buffer to make sure the data that we put there
> doesn't get overwritten.

As I stated before in a comment on v6, this particular use of
dma_sync_single_for_cpu() is incorrect, because the CPU should take
ownership of the buffer after the hardware job completes, not before it
starts. In practice, though, dma_sync_single_for_cpu(DMA_TO_DEVICE) is a
NO-OP, because the CPU doesn't need to flush or invalidate anything to
take ownership of the buffer if the hardware only read from it.
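
Putting the four cases from this thread side by side, as a small lookup
sketch. This only encodes what is claimed in this discussion, it is not
taken from any architecture's implementation, and the enum and function
names are made up for illustration.

```c
#include <assert.h>

enum model_dir { MODEL_TO_DEVICE, MODEL_FROM_DEVICE };
enum model_sync { MODEL_SYNC_FOR_DEVICE, MODEL_SYNC_FOR_CPU };
enum model_cache_op { CACHE_NONE, CACHE_CLEAN, CACHE_INVALIDATE };

/*
 * Which CPU cache maintenance each sync call implies, per the thread:
 *   for_device + TO_DEVICE   -> clean (write back CPU data for HW to read)
 *   for_device + FROM_DEVICE -> invalidate (no evictions while HW writes)
 *   for_cpu    + FROM_DEVICE -> invalidate (CPU must not see stale lines)
 *   for_cpu    + TO_DEVICE   -> nothing (HW only read the buffer)
 */
static enum model_cache_op
model_cache_maintenance(enum model_sync sync, enum model_dir dir)
{
	assert(dir == MODEL_TO_DEVICE || dir == MODEL_FROM_DEVICE);

	if (sync == MODEL_SYNC_FOR_DEVICE)
		return dir == MODEL_TO_DEVICE ? CACHE_CLEAN : CACHE_INVALIDATE;

	return dir == MODEL_FROM_DEVICE ? CACHE_INVALIDATE : CACHE_NONE;
}
```

The only entry that does nothing is the for_cpu/TO_DEVICE one, which is why
the misplaced call in the TX path is harmless even though it sits on the
wrong side of the hardware job.
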

>
> > + buffer = i2c_dev->dma_buf;
> > + }
> > + }
> > +
> > packet_header = (0 << PACKET_HEADER0_HEADER_SIZE_SHIFT) |
> > PACKET_HEADER0_PROTOCOL_I2C |
> > (i2c_dev->cont_id <<
> > PACKET_HEADER0_CONT_ID_SHIFT) | (1 <<
> > PACKET_HEADER0_PACKET_ID_SHIFT);
> > - i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> > + if (dma && !i2c_dev->msg_read)
> > + *buffer++ = packet_header;
> > + else
> > + i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> >
> > packet_header = msg->len - 1;
> > - i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> > + if (dma && !i2c_dev->msg_read)
> > + *buffer++ = packet_header;
> > + else
> > + i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> >
> > packet_header = I2C_HEADER_IE_ENABLE;
> > if (end_state == MSG_END_CONTINUE)
> > @@ -778,30 +1018,79 @@ static int tegra_i2c_xfer_msg(struct
> > tegra_i2c_dev *i2c_dev, packet_header |= I2C_HEADER_CONT_ON_NAK;
> > if (msg->flags & I2C_M_RD)
> > packet_header |= I2C_HEADER_READ;
> > - i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> > -
> > - if (!(msg->flags & I2C_M_RD))
> > - tegra_i2c_fill_tx_fifo(i2c_dev);
> > -
> > + if (dma && !i2c_dev->msg_read)
> > + *buffer++ = packet_header;
> > + else
> > + i2c_writel(i2c_dev, packet_header, I2C_TX_FIFO);
> > +
> > + if (!i2c_dev->msg_read) {
> > + if (dma) {
> > + memcpy(buffer, msg->buf, msg->len);
> > + dma_sync_single_for_device(i2c_dev->dev,
> > +
> > i2c_dev->dma_phys,
> > + xfer_size,
> > +
> > DMA_TO_DEVICE);
>
> Again, here we properly flush the caches to make sure the data that
> we've written to the DMA buffer is visible to the DMA engine.
>

+1, this is correct.

> > + err = tegra_i2c_dma_submit(i2c_dev,
> > xfer_size);
> > + if (err < 0) {
> > + dev_err(i2c_dev->dev,
> > + "starting TX DMA failed,
> > err %d\n",
> > + err);
> > + goto unlock;
> > + }
> > + } else {
> > + tegra_i2c_fill_tx_fifo(i2c_dev);
> > + }
> > + }
> > if (i2c_dev->hw->has_per_pkt_xfer_complete_irq)
> > int_mask |= I2C_INT_PACKET_XFER_COMPLETE;
> > - if (msg->flags & I2C_M_RD)
> > - int_mask |= I2C_INT_RX_FIFO_DATA_REQ;
> > - else if (i2c_dev->msg_buf_remaining)
> > - int_mask |= I2C_INT_TX_FIFO_DATA_REQ;
> > + if (!dma) {
> > + if (msg->flags & I2C_M_RD)
> > + int_mask |= I2C_INT_RX_FIFO_DATA_REQ;
> > + else if (i2c_dev->msg_buf_remaining)
> > + int_mask |= I2C_INT_TX_FIFO_DATA_REQ;
> > + }
> >
> > tegra_i2c_unmask_irq(i2c_dev, int_mask);
> > - spin_unlock_irqrestore(&i2c_dev->xfer_lock, flags);
> > dev_dbg(i2c_dev->dev, "unmasked irq: %02x\n",
> > i2c_readl(i2c_dev, I2C_INT_MASK));
> >
> > +unlock:
> > + spin_unlock_irqrestore(&i2c_dev->xfer_lock, flags);
> > +
> > + if (dma) {
> > + if (err)
> > + return err;
> > +
> > + time_left = wait_for_completion_timeout(
> > +
> > &i2c_dev->dma_complete,
> > + TEGRA_I2C_TIMEOUT);
> > +
> > + if (time_left == 0) {
> > + dev_err(i2c_dev->dev, "DMA transfer
> > timeout\n");
> > + dmaengine_terminate_all(chan);
> > + tegra_i2c_init(i2c_dev);
> > + return -ETIMEDOUT;
> > + }
> > +
> > + if (i2c_dev->msg_read) {
> > + if (likely(i2c_dev->msg_err ==
> > I2C_ERR_NONE)) {
> > +
> > dma_sync_single_for_cpu(i2c_dev->dev,
> > +
> > i2c_dev->dma_phys,
> > + xfer_size,
> > +
> > DMA_FROM_DEVICE);
>
> Here we invalidate the caches to make sure we don't get stale data
> that may be in the caches for data that we're copying out of the DMA
> buffer. I think that's about all the cache maintenance that we
> really need.

Correct.

And technically there should also be a
dma_sync_single_for_cpu(DMA_TO_DEVICE) here for the TX case. But again,
it's a NO-OP.