Re: [PATCH v5 1/2] PCI: dwc: Drop dependency on ZONE_DMA32

From: William McVicker
Date: Mon Oct 03 2022 - 12:42:02 EST


On 10/03/2022, Lorenzo Pieralisi wrote:
> On Fri, Sep 30, 2022 at 10:02:22AM -0700, William McVicker wrote:
>
> [...]
>
> > > > So now we have what we have. The DMA-mask is pointlessly changed for
> > > > something that doesn't really involve any DMA. That's why I insisted on
> > > > dropping it at the very least. Another reason I thought it was
> > > > appropriate was that the default DMA-mask is set to 32 bits anyway.
> > > > But you said we shouldn't rely on the default DMA-mask setting. So
> > > > ok, it doesn't count then. But it doesn't change the uselessness of the
> > > > DMA-mask change in the current driver.
> > >
> > > As I keep saying, it *is* relevant to DMA. The MSI doorbell may not be
> > > accessing memory, but it is still a thing that occupies DMA address space
> > > like a mapping of memory does, and DMA masks are how we control how DMA
> > > address space is allocated. Unless and until we have an API for arbitrarily
> > > reserving DMA address space within a given range, the approach used here and
> > > in other drivers is the next best thing, however much you don't like it.
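> > >
> > > For reference, the pattern in question boils down to roughly the sketch
> > > below (illustrative only, not the exact driver code; "dev" is whichever
> > > struct device the mapping ends up being made against, which is precisely
> > > the point of contention here):
> > >
> > >         /* Constrain the device's DMA masks so any allocation lands
> > >          * below 4GB, then burn one page of coherent memory purely to
> > >          * reserve a DMA address to use as the MSI doorbell target. */
> > >         ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> > >         if (ret)
> > >                 return ret;
> > >
> > >         msi_vaddr = dma_alloc_coherent(dev, PAGE_SIZE, &pp->msi_data,
> > >                                        GFP_KERNEL);
> > >         if (!msi_vaddr)
> > >                 return -ENOMEM;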
> > >
> > > > > AFAIU the correct PCI device
> > > > > won't actually exist until we've got far enough through pci_host_probe(), so
> > > > > I'm not sure how to easily solve this :/
> > > >
> > > > Walk over all PCIe devices detected on the PCIe bus. Check their
> > > > MSI-capability flags. If any of them doesn't have the 64-bit MSI flag
> > > > set, then make sure the MSI-base address is allocated within the 4GB
> > > > memory region. It isn't that easy to implement though...
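> > > >
> > > > Just to illustrate the idea (untested sketch, names made up; when
> > > > exactly this could run is part of the difficulty):
> > > >
> > > >         static int dw_pcie_msi_cap_check(struct pci_dev *pdev, void *data)
> > > >         {
> > > >                 bool *all_64bit = data;
> > > >                 u16 flags;
> > > >
> > > >                 if (!pdev->msi_cap)
> > > >                         return 0;
> > > >
> > > >                 pci_read_config_word(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
> > > >                                      &flags);
> > > >                 if (!(flags & PCI_MSI_FLAGS_64BIT))
> > > >                         *all_64bit = false;
> > > >                 return 0;
> > > >         }
> > > >
> > > >         /* ... then pci_walk_bus(bus, dw_pcie_msi_cap_check, &all_64bit)
> > > >          * after enumeration, and keep the doorbell below 4GB unless
> > > >          * all_64bit stays true. */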
> > >
> > > It has nothing to do with capabilities (but also: consider hotplug). We
> > > simply need the host bridge PCI device to pass to the DMA API to ensure that
> > > the mapping/allocation is relative to PCI Mem space rather than system
> > > physical address space, because the two don't have to be identical. The
> > > challenge is how to reliably pick up that device and set up the doorbell
> > > *before* any other PCI devices also discovered by pci_host_probe() have a
> > > chance to start binding drivers and trying to request MSIs.
> >
> > Maybe I can provide some more insights on my patches that may help you guys
> > understand the idea behind the MSI capabilities flag. Basically, on a given
> > Android phone there are going to be multiple PCIe endpoints -- wifi and modem
> > are good examples. Some of these PCIe endpoints may only support a 32-bit MSI
> > capability structure, but others could support a 64-bit MSI capability
> > structure. So my intent was to have the PCIe RC driver detect the endpoint
> > device's MSI capabilities during the EP device probe and then set the
> > PCI_MSI_FLAGS_64BIT accordingly before calling dw_pcie_host_init(). Since the
> > PCIe RC drivers are the ones to call dw_pcie_host_init(), we can dynamically
> > change the DMA mask and allocate the doorbell target address based on the
> > PCI_MSI_FLAGS_64BIT.
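> >
> > In other words, a vendor RC driver would do something roughly like this
> > before handing over to the core (hypothetical sketch; ep_supports_64bit_msi
> > stands in for whatever the RC driver learned about its endpoints):
> >
> >         /* Pick the doorbell DMA mask based on what the endpoint(s) can
> >          * address in their MSI capability, then let the core allocate
> >          * the doorbell within that mask. */
> >         if (ep_supports_64bit_msi)
> >                 ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
> >         else
> >                 ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> >         if (ret)
> >                 return ret;
> >
> >         ret = dw_pcie_host_init(pp);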
>
> It seems to me we are all talking past each other to solve different
> problems, so I am going to reset this discussion given that we are
> in the merge window and I must finalise the PCI patch queue.
>
> Patch (2/2) should be dropped IMO - I don't think the host bridge
> platform device DMA mask should depend on the root port MSI cap
> 64/32 bit addressing - I don't think that's the right thing to do.
>
> We should keep this discussion going for the next cycle, I will
> drop patch (2/2) for the time being, sorry.

Thanks Lorenzo for taking patch 1/2. That solves my immediate issues. Let's
solve Serge's eDMA problem first. Then we can revisit the 64-bit MSI target
address issue.

Regards,
Will

>
> Thanks,
> Lorenzo
>
> > This all really gets a lot more complicated with the introduction of Serge's
> > eDMA device. So I fully expect dw_pcie_msi_host_init() to be refactored
> > for that.
> >
> > >
> > > > > Of course *this* patch doesn't change any of that either, so it's no worse
> > > > > than the existing code and I don't see that dropping it helps you at all;
> > > > > the current driver is already trampling your 64-bit mask back to 32 bits
> > > >
> > > > Yes, and by this patchset @William intends to fix the DMA-mask-override
> > > > behaviour by using the dma_alloc_coherent() method.
> > >
> > > No, that is effectively unchanged. Whether it's a streaming mapping with
> > > dma_mask or a coherent allocation with coherent_dma_mask, masks are getting
> > > set either way; it's the fact that it's on the wrong device that's the real
> > > problem.
> > >
> > > If you expose the eDMA as a generic dmaengine device then there's every
> > > chance some consumer would make a streaming mapping with it, so even though
> > > the current code happens to not override the coherent mask it's still biting
> > > you in the streaming mask.
> > >
> > > > So any
> > > > platform-specific DMA-mask setting will be overwritten, and the
> > > > DMA-mask setting can no longer be moved or dropped due to the
> > > > dma_alloc_coherent() usage.
> > > >
> > > > I have added DW eDMA-engine support to the DW PCIe driver (you've
> > > > already seen my patches) and the engine initialization is supposed to
> > > > be performed after the basic initializations like CSR mapping, data
> > > > allocations, MSI, etc. Since DMA is performed by the controller itself,
> > > > a correct DMA-mask must be set on the DW PCIe platform device,
> > > > otherwise any subsequent mapping will be bounce-buffered to the
> > > > lowest 4GB even though the corresponding platform can support more
> > > > than 4GB of memory (even our MIPS32 can), with DW eDMA easily reaching
> > > > that memory. What would help me in this case is if the MSI-buffer
> > > > allocation procedure didn't change the device DMA-mask. As an
> > > > alternative to completely dropping the DMA-mask setting, the DMA-mask
> > > > setup could be moved to the low-level platform device drivers.
> > > > That would be even more justified since the platform-specific device
> > > > capabilities (like the DW PCIe AXI-interface address-bus width) are
> > > > unknown in the generic code.
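> > > >
> > > > Something along these lines in the low-level driver's probe is what I
> > > > have in mind (the 40-bit width is only an example; each platform would
> > > > use whatever its AXI address bus actually supports):
> > > >
> > > >         /* The platform driver knows the real address-bus width of its
> > > >          * DW PCIe instance, so set the DMA masks there. */
> > > >         ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(40));
> > > >         if (ret)
> > > >                 return ret;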
> > > >
> > > > As another alternative I could override the DMA-mask within the DW
> > > > eDMA probe procedure. But that would make things more complicated than
> > > > relying on the low-level platform drivers doing that.
> > > >
> > > > > and
> > > > > reserving the doorbell address in the wrong DMA address space (modulo the
> > > > > other dma-ranges bug which also took far too long to figure out).
> > > >
> > > > Actually the current driver (without William's patch) reserves the doorbell
> > > > address in the correct DMA address space (if we don't take the
> > > > dma-ranges settings into account).
> > >
> > > No it does not. With or without this patch it is still passing the *platform
> > > device* to the DMA API, which means the mapping is relative to the platform
> > > address space, not PCI Mem space on the other side of the iATU. The fact
> > > that the iATU's dma-ranges translation is erroneously applied to the
> > > platform device at the moment is, as I have said, a bug.
> >
> > Thanks for pointing this out. I agree this is a bug and I guess it hasn't
> > really been a problem because there isn't really any DMA'ing going on. With the
> > new eDMA device being introduced, this bug will likely need to be fixed.
> >
> > >
> > > > It works as expected in the case where the
> > > > PCIe<->CPU space has a one-to-one mapping (which is true in most
> > > > cases). The only thing that is wrong is the pointless DMA-mask
> > > > update. I could have easily dropped it in my patchset. But after the
> > > > @William patchset is applied I won't be able to do that due to the
> > > > dma_alloc_coherent() usage here.
> > >
> > > Once again, it is not pointless. There is no guarantee that __GFP_DMA32 does
> > > anything, since ZONE_DMA32 may not exist (per this patch), and the zones may
> > > not be as expected anyway (look at what arm64 currently does if all RAM is
> > > above 32 bits, but save those complaints for another thread).
> > >
> > > > > At this
> > > > > point I'd rather keep it since getting rid of the __GFP_DMA32 abuse is
> > > > > objectively good. If losing one page of coherent memory is a measurably
> > > > > significant problem for T1 once the other issues are worked out and that
> > > > > series lands, then you're welcome to propose a change on top (but I would
> > > > > prefer that all the drivers using this trick are changed consistently).
> > > >
> > > > Regarding the DMA-coherent allocation: I am not happy with losing a
> > > > whole page of dma-coherent memory, but we can live with that. What
> > > > gives additional difficulty for our eDMA patches is the DMA-mask
> > > > override. If you still insist on preserving the @William patchset as
> > > > it is, where do you suggest I update the DMA-mask if the low-level
> > > > driver won't be suitable for that anymore?
> > >
> > > I'm not insisting anything, it's just that this patch is already reviewed
> > > and queued, is a small step towards being less wrong overall, and dropping
> > > it wouldn't actually solve any of your problems anyway, so I just feel that
> > > being obstructive because it falls short of perfection isn't helpful.
> >
> > Thanks for the responses Robin! I agree that we don't have a complete solution
> > and need to fix this DMA address space bug, but I don't think that's enough of
> > a reason to drop this patch series. At the very least I think patch 1/2 which
> > removes the ZONE_DMA32 dependency is a worthy patch to take for 6.1.
> >
> > Thanks,
> > Will
> >
> > >
> > > Robin.