Re: [PATCH v12 0/6] iommu/dma: s390 DMA API conversion and optimized IOTLB flushing

From: Niklas Schnelle
Date: Wed Sep 27 2023 - 10:42:24 EST


On Wed, 2023-09-27 at 15:20 +0200, Niklas Schnelle wrote:
> On Wed, 2023-09-27 at 13:24 +0200, Niklas Schnelle wrote:
> > On Wed, 2023-09-27 at 11:55 +0200, Joerg Roedel wrote:
> > > Hi Niklas,
> > >
> > > On Wed, Sep 27, 2023 at 10:55:23AM +0200, Niklas Schnelle wrote:
> > > > The problem is that something seems to be broken in the iommu/core
> > > > branch. Regardless of whether I have my DMA API conversion on top or
> > > > with the base iommu/core branch I can not use ConnectX-4 VFs.
> > >
> > > Have you already tried to bisect the issue in the iommu/core branch?
> > > The result might shed some light on the issue.
> > >
> > > Regards,
> > >
> > > Joerg
> >
> > Hi Joerg,
> >
> > Working on it, somehow I must have messed up earlier. It now looks like
> > it might in fact be caused by my DMA API conversion rebase and the
> > "s390/pci: Use dma-iommu layer" commit. Maybe there is some interaction
> > with Jason's patches that I haven't thought about. So sorry for any
> > wrong blame.
> >
> > Thanks,
> > Niklas
>
> Hi,
>
> I tracked the problem down from mlx5_core's alloc_cmd_page() via
> dma_alloc_coherent(), ops->alloc, iommu_dma_alloc_remap(), and
> __iommu_dma_alloc_noncontiguous() to a failed iommu_dma_alloc_iova().
> The allocation here is for 4K so nothing crazy.
>
> On second look I also noticed:
>
> nvme 2007:00:00.0: Using 42-bit DMA addresses
>
> for the NVMe that is working. The problem here seems to be that we set
> iommu_dma_forcedac = true in s390_iommu_probe_finalize() because we
> currently have a reserved region over the first 4 GiB anyway so
> will always use IOVAs larger than that. That however is too late since
> iommu_dma_set_pci_32bit_workaround() is already checked in
> __iommu_probe_device() which is called just before
> ops->probe_finalize(). So I moved setting iommu_dma_forcedac = true to
> zpci_init_iommu() and that gets rid of the notice for the NVMe but I
> still get a failure of iommu_dma_alloc_iova() in
> __iommu_dma_alloc_noncontiguous(). So I'll keep digging.
>
> Thanks,
> Niklas


Ok, I think I got it. This doesn't seem strictly s390x specific; I'd
expect it to happen with iommu.forcedac=1 everywhere.

The reason iommu_dma_alloc_iova() fails seems to be an ordering issue in
mlx5_core. It calls
dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) in
mlx5_pci_init()->set_dma_caps(), but that happens after it has already
called mlx5_mdev_init()->mlx5_cmd_init()->alloc_cmd_page(). So for the
dma_alloc_coherent() in there, dev->coherent_dma_mask is still
DMA_BIT_MASK(32), and we can't find an IOVA for that mask because we
don't have any IOVAs below 4 GiB. I'm not entirely sure why this wasn't
enforced before.
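
To make the failure mode concrete, here is a rough user-space model of
the constraint, not the kernel's iommu_dma_alloc_iova(). The names
model_alloc_iova(), RESERVED_END and APERTURE_LAST are made up for
illustration, and the 42-bit aperture is only an assumption taken from
the "Using 42-bit DMA addresses" notice above. It just shows that a
top-down search bounded by a 32-bit mask has nowhere to go once
everything below 4 GiB is reserved:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same definition as the kernel macro of this name. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Assumption: first 4 GiB of IOVA space reserved, as described above. */
#define RESERVED_END  (1ULL << 32)
/* Assumption: 42-bit aperture, taken from the NVMe notice. */
#define APERTURE_LAST DMA_BIT_MASK(42)

/*
 * Hypothetical stand-in for an IOVA allocator: pick the highest
 * size-aligned address whose last byte is still covered by both the
 * device's DMA mask and the aperture. size must be a power of two.
 * Returns false if that address would fall into the reserved region.
 */
static bool model_alloc_iova(uint64_t dma_mask, uint64_t size, uint64_t *iova)
{
	uint64_t limit = dma_mask < APERTURE_LAST ? dma_mask : APERTURE_LAST;
	uint64_t start;

	if (size == 0 || size - 1 > limit)
		return false;

	/* Highest aligned start such that start + size - 1 <= limit. */
	start = (limit - (size - 1)) & ~(size - 1);
	if (start < RESERVED_END)
		return false;	/* only reserved IOVAs satisfy the mask */

	*iova = start;
	return true;
}
```

In this model a 4K request with DMA_BIT_MASK(32) fails, while the same
request with DMA_BIT_MASK(64) succeeds with an IOVA above 4 GiB, which
matches what alloc_cmd_page() sees before and after set_dma_caps().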

Thanks,
Niklas