RE: [PATCH V5 3/3] PCI: xilinx-xdma: Add Xilinx XDMA Root Port driver

From: Havalige, Thippeswamy
Date: Thu Jul 20 2023 - 02:38:58 EST


Hi Bjorn,

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> Sent: Saturday, July 1, 2023 4:49 AM
> To: Havalige, Thippeswamy <thippeswamy.havalige@xxxxxxx>
> Cc: krzysztof.kozlowski@xxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
> bhelgaas@xxxxxxxxxx; lorenzo.pieralisi@xxxxxxx; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx; Gogada, Bharat Kumar
> <bharat.kumar.gogada@xxxxxxx>; Simek, Michal
> <michal.simek@xxxxxxx>
> Subject: Re: [PATCH V5 3/3] PCI: xilinx-xdma: Add Xilinx XDMA Root Port driver
>
> On Wed, Jun 28, 2023 at 02:58:12PM +0530, Thippeswamy Havalige wrote:
> > Add support for Xilinx XDMA Soft IP core as Root Port.
> > ...
>
> > |Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > |Reported-by: Dan Carpenter <error27@xxxxxxxxx>
> > |Closes:
> > |https://lore.kernel.org/r/202305261250.2cs1phTS-lkp@xxxxxxxxx/
>
> Remove these. I mentioned this before:
> https://lore.kernel.org/r/ZHd/7AaLaGyr1jNA@bhelgaas
- Agreed, I'll remove this in next patch
> > + * struct pl_dma_pcie - PCIe port information
> > + * @dev: Device pointer
> > + * @reg_base: IO Mapped Register Base
> > + * @irq: Interrupt number
> > + * @cfg: Holds mappings of config space window
> > + * @phys_reg_base: Physical address of reg base
> > + * @intx_domain: Legacy IRQ domain pointer
> > + * @pldma_domain: PL DMA IRQ domain pointer
> > + * @resources: Bus Resources
> > + * @msi: MSI information
> > + * @irq_misc: Legacy and error interrupt number
> > + * @intx_irq: legacy interrupt number
> > + * @lock: lock protecting shared register access
>
> Capitalize the intx_irq and lock descriptions so they match the others.
- Agreed, I'll fix it in the next patch
> "Legacy and error interrupt number" and "legacy interrupt number"
> sound like they overlap -- "legacy interrupt number" is part of both.
> Is that an error?
- Agreed, I'll modify this comment to legacy interrupt number. (This irq line is for both legacy interrupts and error interrupt bits)
> > +static bool xilinx_pl_dma_pcie_valid_device(struct pci_bus *bus,
> > +unsigned int devfn) {
> > + struct pl_dma_pcie *port = bus->sysdata;
> > +
> > + /* Check if link is up when trying to access downstream ports */
> > + if (!pci_is_root_bus(bus)) {
> > + /*
> > + * If the link goes down after we check for link-up, we have a
> problem:
> > + * if a PIO request is initiated while link-down, the whole
> controller
> > + * hangs, and even after link comes up again, previous PIO
> requests
> > + * won't work, and a reset of the whole PCIe controller is
> needed.
> > + * Henceforth we need link-up check here to avoid sending
> PIO request
> > + * when link is down.
>
> Wrap this comment so it fits in 80 columns like the rest of the file.
>
> I think the comment was added because I pointed out that this is racy.
> Obviously the comment doesn't *fix* the race, and it actually doesn't even
> describe the race.
- Agreed, I'll add comments regarding race condition.
> Even with the xilinx_pl_dma_pcie_link_up() check, this is racy because
> xilinx_pl_dma_pcie_link_up() may tell you the link is up, but the link may go
> down before the driver attempts the config transaction. THAT is the race.
>
> If the controller hangs in that situation, that's a hardware defect, and from
> your comment, it sounds like it's unrecoverable.
>
> > + */
> > + if (!xilinx_pl_dma_pcie_link_up(port))
> > + return false;
>
> > +static int xilinx_pl_dma_pcie_intx_map(struct irq_domain *domain,
> unsigned int irq,
> > + irq_hw_number_t hwirq)
>
> Wrap to fit in 80 columns like the rest of the file.
>
> > +static struct irq_chip xilinx_msi_irq_chip = {
> > + .name = "pl_dma_pciepcie:msi",
>
> Why does this name have two copies of "pcie" in it? This driver has four
> irq_chip structs; maybe the names could be more similar?
- Agreed, I'll modify all irq_chip names this in next patch
Example:
static struct irq_chip xilinx_msi_irq_chip = {
.name = "pl_dma:PCIe MSI",

> xilinx_leg_irq_chip INTx
> xilinx_msi_irq_chip pl_dma_pciepcie:msi
> xilinx_irq_chip Xilinx MSI
> xilinx_pl_dma_pcie_event_irq_chip RC-Event
>
> > + /* Plug the INTx chained handler */
> > + irq_set_chained_handler_and_data(port->intx_irq,
> > + xilinx_pl_dma_pcie_intx_flow, port);
> > +
> > + /* Plug the main event chained handler */
> > + irq_set_chained_handler_and_data(port->irq,
> > + xilinx_pl_dma_pcie_event_flow,
> port);
>
> What's the reason for using chained IRQs? Can this be done without them? I
> don't claim to understand all the issues here, but it seems better to avoid
> chained IRQ handlers when possible:
- As per the comments in this https://lkml.kernel.org/lkml/alpine.DEB.2.20.1705232307330.2409@nanos/T/
"It is fine to have chained interrupts when bootloader, device tree and kernel under control. Only if BIOS/UEFI comes into
play the user is helpless against interrupt storm which will cause system to hangs."

We are using ARM embedded platform with Bootloader, Devicetree flow.

> https://lore.kernel.org/all/877csohcll.ffs@tglx/
>
> > + /*set the Bridge enable bit */
- Agreed, I ll modify it in next patch.
> Space before "Set". I mentioned this before at
> https://lore.kernel.org/r/ZHd/7AaLaGyr1jNA@bhelgaas
>
> > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > + if (!res) {
> > + dev_err(dev, "missing \"reg\" property\n");
>
> All your other error messages are capitalized. Make this one match.
>
> > + bridge->ops = (struct pci_ops *)&xilinx_pl_dma_pcie_ops.pci_ops;
>
> I don't think this cast is needed.
-Agreed, will modify it in next patch.
>
> Bjorn