Re: [PATCH v2 1/1] dmaengine: switchtec-dma: Introduce Switchtec DMA engine PCI driver

From: Kelvin.Cao
Date: Thu Apr 13 2023 - 15:00:18 EST


On Tue, 2023-04-11 at 09:47 -0600, Logan Gunthorpe wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you >
> > know the content is safe
> >
> > On 2023-04-10 10:42, Christoph Hellwig wrote:
> > > > On Mon, Apr 03, 2023 at 11:06:28AM -0700, Kelvin Cao wrote:
> > > > > > +#define HALT_RETRY 100
> > > > > > +static int halt_channel(struct switchtec_dma_chan
> > > > > > *swdma_chan)
> > > > > > +{
> > > > > > +    u32 status;
> > > > > > +    struct chan_hw_regs __iomem *chan_hw = > > >
> > > > > > swdma_chan->mmio_chan_hw;
> > > > > > +    int retry = HALT_RETRY;
> > > > > > +    struct pci_dev *pdev;
> > > > > > +    int ret;
> > > > > > +
> > > > > > +    rcu_read_lock();
> > > > > > +    pdev = rcu_dereference(swdma_chan->swdma_dev->pdev);
> > > > > > +    if (!pdev) {
> > > > > > +            ret = -ENODEV;
> > > > > > +            goto unlock_and_exit;
> > > > > > +    }
> > > >
> > > > This whole RCU critical section around every access to ->pdev >
> > > > > scheme
> > > > looks a bit bothersome to me.  This means that all the low-
> > > > level
> > > > PCI ops are done in RCU critical section.  Is this something
> > > > you
> > > > came up with or is it copied from other drivers?
> >
> > I suspect they copied it from plx_dma driver that I wrote ;(,
> > though
> > that driver uses rcu_dereference a bit more sparingly (only on
> > stop,
> > issue_pending and when allocating and freeing a channel).
> >

Thanks Logan for jumping in. The RCU stuff was copied from plx_dma
driver.

> > > > Normally we'd do an unregistration from the dmaengine subsystem
> > > > first, which might do a RCU synchronization at a high level,
> > > > and then we're sure that none of the methods gets called again
> > > > on the unregistered device.
> > > >
> > > > Can't this driver (and the dmaengine core) support an operation
> > > > mode where you set a shutdown flag at the beginning
> >
> > The dmaengine code didn't support hot unplug at all. I believe most
> > drivers are likely to crash if this happens. When I wrote the plx-
> > dma
> > engine, I had to make a bunch of changes to dmaengine just so the
> > framework didn't crash when I tested this. The framework is pretty
> > > thin,
> > so there's not much to synchronize on to indicate other threads are
> > > not
> > in the middle of issuing new IO when a flag is set.

A bit more history, this is the unbind issue Logan spotted and fixed:
https://lore.kernel.org/all/20191216190120.21374-1-logang@xxxxxxxxxxxx/
> >
> > > > > > +    tasklet_schedule(&swdma_dev->chan_status_task);
> > > >
> > > > What speaks against simply using threaded irqs here instead of
> > > > the
> > > > tasklets?
> >
> > Almost all the dmaengine drivers use tasklets. I don't know if this
> > > is
> > the best approach, but my understanding was that it was due to >
> > needing
> > low latency in processing the completed descriptors, otherwise it
> > can > be
> > hard to reach the full bandwidth of the dma engine.
> >
> > Logan