Re: PCIe host controller behind IOMMU on ARM

From: Liviu.Dudau@xxxxxxx
Date: Wed Nov 11 2015 - 13:25:06 EST


On Mon, Nov 09, 2015 at 12:32:13PM +0000, Phil Edworthy wrote:
> Hi Liviu, Will,
>
> On 04 November 2015 15:19, Phil wrote:
> > On 04 November 2015 15:02, Liviu wrote:
> > > On Wed, Nov 04, 2015 at 02:48:38PM +0000, Phil Edworthy wrote:
> > > > Hi Liviu,
> > > >
> > > > On 04 November 2015 14:24, Liviu wrote:
> > > > > On Wed, Nov 04, 2015 at 01:57:48PM +0000, Phil Edworthy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am trying to hook up a PCIe host controller that sits behind an IOMMU,
> > > > > > but having some problems.
> > > > > >
> > > > > > I'm using the pcie-rcar PCIe host controller and it works fine without
> > > > > > the IOMMU, and I can attach the IOMMU to the controller such that any
> > > calls
> > > > > > to dma_alloc_coherent made by the controller driver uses the
> > iommu_ops
> > > > > > version of dma_ops.
> > > > > >
> > > > > > However, I can't see how to make the endpoints to utilise the dma_ops
> > that
> > > > > > the controller uses. Shouldn't the endpoints inherit the dma_ops from the
> > > > > > controller?
> > > > >
> > > > > No, not directly.
> > > > >
> > > > > > Any pointers for this?
> > > > >
> > > > > You need to understand the process through which a driver for endpoint
> > get
> > > > > an address to be passed down to the device. Have a look at
> > > > > Documentation/DMA-API-HOWTO.txt, there is a nice explanation there.
> > > > > (Hint: EP driver needs to call dma_map_single).
> > > > >
> > > > > Also, you need to make sure that the bus address that ends up being set
> > into
> > > > > the endpoint gets translated correctly by the host controller into an address
> > > > > that the IOMMU can then translate into physical address.
> > > > Sure, though since this is bog standard Intel PCIe ethernet card which works
> > > > fine when the IOMMU is effectively unused, I donât think there is a problem
> > > > with that.
> > > >
> > > > The driver for the PCIe controller sets up the IOMMU mapping ok when I
> > > > do a test call to dma_alloc_coherent() in the controller's driver. i.e. when I
> > > > do this, it ends up in arm_iommu_alloc_attrs(), which calls
> > > > __iommu_alloc_buffer() and __alloc_iova().
> > > >
> > > > When an endpoint driver allocates and maps a dma coherent buffer it
> > > > also needs to end up in arm_iommu_alloc_attrs(), but it doesn't.
> > >
> > > Why do you think that? Remember that the only thing attached to the IOMMU
> > is
> > > the
> > > host controller. The endpoint is on the PCIe bus, which gets a different
> > > translation
> > > that the IOMMU knows nothing about. If it helps you to visualise it better, think
> > > of the host controller as another IOMMU device. It's the ops of the host
> > > controller
> > > that should be invoked, not the IOMMU's.
> > Ok, that makes sense. I'll have a think and poke it a bit more...

Hi Phil,

Not trying to ignore your email, but I thought this is more in Will's backyard.

> Somewhat related to this, since our PCIe controller HW is limited to
> 32-bit AXI address range, before trying to hook up the IOMMU I have
> tried to limit the dma_mask for PCI cards to DMA_BIT_MASK(32). The
> reason being that Linux uses a 1 to 1 mapping between PCI addresses
> and cpu (phys) addresses when there isn't an IOMMU involved, so I
> think that we need to limit the PCI address space used.

I think you're mixing things a bit or not explaining them very well. Having the
PCIe controller limited to 32-bit AXI does not mean that the PCIe bus cannot
carry 64-bit addresses. It depends on how they get translated by the host bridge
or its associated ATS block. I can't see why you can't have a setup where
the CPU addresses are 32-bit but the PCIe bus addresses are all 64-bit.
You just have to be careful on how you setup your mem64 ranges so that they don't
overlap with the 32-bit ranges when translated.

And no, you should not limit at the card driver the DMA_BIT_MASK() unless the
card is not capable of supporting more than 32-bit addresses.

>
> Since pci_setup_device() sets up dma_mask, I added a bus notifier in the
> PCIe controller driver so I can change the mask, if needed, on the
> BUS_NOTIFY_BOUND_DRIVER action.
> However, I think there is the potential for card drivers to allocate and
> map buffers before the bus notifier get called. Additionally, I've seen
> drivers change their behaviour based on the success or failure of
> dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)), so the
> driver could, theoretically at least, operate in a way that is not
> compatible with a more restricted dma_mask (though I can't think
> of any way this would not work with hardware I've seen).
>
> So, I think that using a bus notifier is the wrong way to go, but I donât
> know what other options I have. Any suggestions?

I would first have a look at how the PCIe bus addresses are translated by the
host controller.

Best regards,
Liviu

>
> Thanks for your help
> Phil

--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
Â\_(ã)_/Â
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/