Re: [PATCH] iommu/s390: Fix race with release_device ops

From: Niklas Schnelle
Date: Thu Aug 25 2022 - 07:12:19 EST


On Thu, 2022-08-25 at 09:22 +0200, Alexander Gordeev wrote:
> On Wed, Aug 24, 2022 at 04:25:19PM -0400, Matthew Rosato wrote:
> > > > @@ -90,15 +90,39 @@ static int s390_iommu_attach_device(struct iommu_domain *domain,
> > > >  	struct zpci_dev *zdev = to_zpci_dev(dev);
> > > >  	struct s390_domain_device *domain_device;
> > > >  	unsigned long flags;
> > > > -	int cc, rc;
> > > > +	int cc, rc = 0;
> > > >  	if (!zdev)
> > > >  		return -ENODEV;
> > > > +	/* First check compatibility */
> > > > +	spin_lock_irqsave(&s390_domain->list_lock, flags);
> > > > +	/* First device defines the DMA range limits */
> > > > +	if (list_empty(&s390_domain->devices)) {
> > > > +		domain->geometry.aperture_start = zdev->start_dma;
> > > > +		domain->geometry.aperture_end = zdev->end_dma;
> > > > +		domain->geometry.force_aperture = true;
> > > > +		/* Allow only devices with identical DMA range limits */
> > > > +	} else if (domain->geometry.aperture_start != zdev->start_dma ||
> > > > +		   domain->geometry.aperture_end != zdev->end_dma) {
> > > > +		rc = -EINVAL;
> > > > +	}
> > > > +	spin_unlock_irqrestore(&s390_domain->list_lock, flags);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > >  	domain_device = kzalloc(sizeof(*domain_device), GFP_KERNEL);
> > > >  	if (!domain_device)
> > > >  		return -ENOMEM;
> > > > +	/* Leave now if the device has already been released */
> > > > +	spin_lock_irqsave(&zdev->dma_domain_lock, flags);
> > > > +	if (!dev_iommu_priv_get(dev)) {
> > > > +		spin_unlock_irqrestore(&zdev->dma_domain_lock, flags);
> > > > +		kfree(domain_device);
> > > > +		return 0;
> > > > +	}
> > > > +
> > > >  	if (zdev->dma_table && !zdev->s390_domain) {
> > > >  		cc = zpci_dma_exit_device(zdev);
> > > >  		if (cc) {
> > >
> > > Am I wrong? It seems to me that zpci_dma_exit_device() is called here with the spinlock held, but it calls vfree(), which may sleep.
> > >
> >
> > Oh, good point, I just enabled lockdep to verify that.
> >
> > I think we could just replace this with a mutex instead, it's not a performance path. I've been running tests successfully today with this patch modified to instead use a mutex for dma_domain_lock.
>
> But your original version uses irq-savvy spinlocks.
> Are there data that need to be protected against interrupts?
>
> Thanks!

I think that was a carry-over from my original attempt, which used
zdev->dma_domain_lock in some more places, including in interrupt
context. I think those uses are gone now, so Matt is right that in his
version this can be a mutex.
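
As a rough illustration of the pattern under discussion (not the actual
patch), here is a minimal userspace sketch. It assumes a pthread mutex as a
stand-in for a kernel struct mutex and free() as a stand-in for the vfree()
call reached from zpci_dma_exit_device(). All model_* names and the struct
layout are hypothetical and only cover what the locking question needs.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct zpci_dev; only the fields that
 * matter for the locking question are modelled. */
struct model_zdev {
	pthread_mutex_t dma_domain_lock; /* models a kernel struct mutex */
	void *iommu_priv;                /* NULL once the device was released */
	void *dma_table;                 /* torn down by the exit path */
	void *domain;                    /* models zdev->s390_domain */
};

/* Models zpci_dma_exit_device(): it frees memory, which in the kernel
 * (vfree) may sleep and so must not happen under a spinlock. */
static int model_dma_exit_device(struct model_zdev *zdev)
{
	free(zdev->dma_table);
	zdev->dma_table = NULL;
	return 0;
}

/* Models the attach path with the lock turned into a mutex.
 * Returns 0 on success or if the device is already released. */
static int model_attach(struct model_zdev *zdev, void *domain)
{
	int rc = 0;

	pthread_mutex_lock(&zdev->dma_domain_lock);

	/* Leave now if the device has already been released */
	if (!zdev->iommu_priv)
		goto out_unlock;

	if (zdev->dma_table && !zdev->domain) {
		/* A possibly sleeping call is acceptable while holding a mutex */
		if (model_dma_exit_device(zdev)) {
			rc = -1;
			goto out_unlock;
		}
	}
	zdev->domain = domain;

out_unlock:
	pthread_mutex_unlock(&zdev->dma_domain_lock);
	return rc;
}

/* Models release_device: clears the private data under the same lock,
 * so a racing attach sees either the old or the new state, never a mix. */
static void model_release(struct model_zdev *zdev)
{
	pthread_mutex_lock(&zdev->dma_domain_lock);
	zdev->iommu_priv = NULL;
	zdev->domain = NULL;
	pthread_mutex_unlock(&zdev->dma_domain_lock);
}
```

The sketch only holds as long as no caller takes the lock from interrupt
context, which is the condition discussed above. It leaves out the
domain_device allocation and the aperture check from the quoted hunk.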