Re: [PATCH V5 3/3] PCI: xilinx-xdma: Add Xilinx XDMA Root Port driver

From: Bjorn Helgaas
Date: Fri Jun 30 2023 - 19:19:37 EST


On Wed, Jun 28, 2023 at 02:58:12PM +0530, Thippeswamy Havalige wrote:
> Add support for Xilinx XDMA Soft IP core as Root Port.
> ...

> |Reported-by: kernel test robot <lkp@xxxxxxxxx>
> |Reported-by: Dan Carpenter <error27@xxxxxxxxx>
> |Closes: https://lore.kernel.org/r/202305261250.2cs1phTS-lkp@xxxxxxxxx/

Remove these. I mentioned this before:
https://lore.kernel.org/r/ZHd/7AaLaGyr1jNA@bhelgaas

> + * struct pl_dma_pcie - PCIe port information
> + * @dev: Device pointer
> + * @reg_base: IO Mapped Register Base
> + * @irq: Interrupt number
> + * @cfg: Holds mappings of config space window
> + * @phys_reg_base: Physical address of reg base
> + * @intx_domain: Legacy IRQ domain pointer
> + * @pldma_domain: PL DMA IRQ domain pointer
> + * @resources: Bus Resources
> + * @msi: MSI information
> + * @irq_misc: Legacy and error interrupt number
> + * @intx_irq: legacy interrupt number
> + * @lock: lock protecting shared register access

Capitalize the intx_irq and lock descriptions so they match the others.

"Legacy and error interrupt number" and "legacy interrupt number"
sound like they overlap -- "legacy interrupt number" is part of both.
Is that an error?

> +static bool xilinx_pl_dma_pcie_valid_device(struct pci_bus *bus, unsigned int devfn)
> +{
> + struct pl_dma_pcie *port = bus->sysdata;
> +
> + /* Check if link is up when trying to access downstream ports */
> + if (!pci_is_root_bus(bus)) {
> + /*
> + * If the link goes down after we check for link-up, we have a problem:
> + * if a PIO request is initiated while link-down, the whole controller
> + * hangs, and even after link comes up again, previous PIO requests
> + * won't work, and a reset of the whole PCIe controller is needed.
> + * Henceforth we need link-up check here to avoid sending PIO request
> + * when link is down.

Wrap this comment so it fits in 80 columns like the rest of the file.

I think the comment was added because I pointed out that this is racy.
Obviously the comment doesn't *fix* the race, and it actually doesn't
even describe the race.

Even with the xilinx_pl_dma_pcie_link_up() check, this is racy because
xilinx_pl_dma_pcie_link_up() may tell you the link is up, but the link
may go down before the driver attempts the config transaction. THAT
is the race.

If the controller hangs in that situation, that's a hardware defect,
and from your comment, it sounds like it's unrecoverable.

> + */
> + if (!xilinx_pl_dma_pcie_link_up(port))
> + return false;

> +static int xilinx_pl_dma_pcie_intx_map(struct irq_domain *domain, unsigned int irq,
> + irq_hw_number_t hwirq)

Wrap to fit in 80 columns like the rest of the file.

> +static struct irq_chip xilinx_msi_irq_chip = {
> + .name = "pl_dma_pciepcie:msi",

Why does this name have two copies of "pcie" in it? This driver has
four irq_chip structs; maybe the names could be more similar?

xilinx_leg_irq_chip INTx
xilinx_msi_irq_chip pl_dma_pciepcie:msi
xilinx_irq_chip Xilinx MSI
xilinx_pl_dma_pcie_event_irq_chip RC-Event

> + /* Plug the INTx chained handler */
> + irq_set_chained_handler_and_data(port->intx_irq,
> + xilinx_pl_dma_pcie_intx_flow, port);
> +
> + /* Plug the main event chained handler */
> + irq_set_chained_handler_and_data(port->irq,
> + xilinx_pl_dma_pcie_event_flow, port);

What's the reason for using chained IRQs? Can this be done without
them? I don't claim to understand all the issues here, but it seems
better to avoid chained IRQ handlers when possible:
https://lore.kernel.org/all/877csohcll.ffs@tglx/

> + /*set the Bridge enable bit */

Space before "Set". I mentioned this before at
https://lore.kernel.org/r/ZHd/7AaLaGyr1jNA@bhelgaas

> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + if (!res) {
> + dev_err(dev, "missing \"reg\" property\n");

All your other error messages are capitalized. Make this one match.

> + bridge->ops = (struct pci_ops *)&xilinx_pl_dma_pcie_ops.pci_ops;

I don't think this cast is needed.

Bjorn