Re: [PATCH v9 13/13] media: ti: Add CSI2RX support for J721E

From: Laurent Pinchart
Date: Tue Aug 29 2023 - 11:55:49 EST


Hi Jai,

(CC'ing Vinod, the maintainer of the DMA engine subsystem, for a
question below)

On Fri, Aug 18, 2023 at 03:55:06PM +0530, Jai Luthra wrote:
> On Aug 15, 2023 at 16:00:51 +0300, Tomi Valkeinen wrote:
> > On 11/08/2023 13:47, Jai Luthra wrote:
> > > From: Pratyush Yadav <p.yadav@xxxxxx>

[snip]

> > > +static int ti_csi2rx_start_streaming(struct vb2_queue *vq, unsigned int count)
> > > +{
> > > + struct ti_csi2rx_dev *csi = vb2_get_drv_priv(vq);
> > > + struct ti_csi2rx_dma *dma = &csi->dma;
> > > + struct ti_csi2rx_buffer *buf;
> > > + unsigned long flags;
> > > + int ret = 0;
> > > +
> > > + spin_lock_irqsave(&dma->lock, flags);
> > > + if (list_empty(&dma->queue))
> > > + ret = -EIO;
> > > + spin_unlock_irqrestore(&dma->lock, flags);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + dma->drain.len = csi->v_fmt.fmt.pix.sizeimage;
> > > + dma->drain.vaddr = dma_alloc_coherent(csi->dev, dma->drain.len,
> > > + &dma->drain.paddr, GFP_KERNEL);
> > > + if (!dma->drain.vaddr)
> > > + return -ENOMEM;
> >
> > This is still allocating a large buffer every time streaming is started (and
> > with streams support, a separate buffer for each stream?).
> >
> > Did you check if the TI DMA can do writes to a constant address? That would
> > be the best option, as then the whole buffer allocation problem goes away.
>
> I checked with Vignesh, the hardware can support a scenario where we
> flush out all the data without allocating a buffer, but I couldn't find
> a way to signal that via the current dmaengine framework APIs. Will look
> into it further as it will be important for multi-stream support.

That would be the best option. It's not immediately apparent to me if
the DMA engine API supports such a use case.
dmaengine_prep_interleaved_dma() gives you finer-grained control over
the source and destination increments, but I haven't seen a way to
instruct the DMA engine to direct writes to /dev/null (so to speak).
Vinod, is this something that is supported, or could be supported?

> > Alternatively, can you flush the buffers with multiple one line transfers?
> > The flushing shouldn't be performance critical, so even if that's slower
> > than a normal full-frame DMA, it shouldn't matter much. And if that can be
> > done, a single probe time line-buffer allocation should do the trick.
>
> There will be considerable overhead if we queue many DMA transactions
> (in the order of 1000s or even 100s), which might not be okay for the
> scenarios where we have to drain mid-stream. Will have to run some
> experiments to see if that is worth it.
>
> But one optimization we can for sure do is re-use a single drain buffer
> for all the streams. We will need to ensure to re-allocate the buffer
> for the "largest" framesize supported across the different streams at
> stream-on time.

If you implement .device_prep_interleaved_dma() in the DMA engine driver
you could write to a single line buffer, assuming that the hardware
supports this in a generic way.
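
To illustrate what I have in mind, here is a rough user-space model (not
kernel code, and not tested against the UDMA hardware). struct
drain_xfer and the drain_*() helpers are hypothetical names of mine; the
dst_inc field is only loosely modelled on the flag of the same name in
struct dma_interleaved_template. The point is that with a non-incrementing
destination, every line lands in the same line-sized buffer, so the drain
buffer shrinks from sizeimage to one line:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a drain transfer. Only dst_inc is borrowed (in
 * spirit) from struct dma_interleaved_template; everything else is an
 * assumption made for illustration.
 */
struct drain_xfer {
	size_t line_size;	/* bytes per line */
	size_t num_lines;	/* lines per frame */
	int dst_inc;		/* 0: all lines go to the same address */
};

/* Amount of destination memory the transfer needs. */
static size_t drain_dst_bytes(const struct drain_xfer *x)
{
	return x->dst_inc ? x->line_size * x->num_lines : x->line_size;
}

/* Offset in the destination buffer at which a given line is written. */
static size_t drain_dst_offset(const struct drain_xfer *x, size_t line)
{
	return x->dst_inc ? line * x->line_size : 0;
}

/* Software stand-in for the DMA engine: copy the frame line by line. */
static void drain_model(const struct drain_xfer *x, const uint8_t *src,
			uint8_t *dst)
{
	size_t i;

	assert(x->line_size > 0);

	for (i = 0; i < x->num_lines; i++)
		memcpy(dst + drain_dst_offset(x, i),
		       src + i * x->line_size, x->line_size);
}
```

With dst_inc cleared, a line buffer allocated once at probe time would be
enough, whatever the frame height. Whether the hardware and the DMA
engine driver can actually be programmed this way is the open question.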

> My guess is the endpoint is not buffering a full-frame's worth of data,
> I will also check if we can upper bound that size to something feasible.
>
> > Other than this drain buffer topic, I think this looks fine. So, I'm going
> > to give Rb, but I do encourage you to look more into optimizing this drain
> > buffer.
>
> Thank you!
>
> > Reviewed-by: Tomi Valkeinen <tomi.valkeinen@xxxxxxxxxxxxxxxx>

--
Regards,

Laurent Pinchart