Re: [PATCH v1 8/9] PCI: PLDA: starfive: Add JH7110 PCIe controller

From: Bjorn Helgaas
Date: Thu Jul 20 2023 - 12:16:03 EST


On Thu, Jul 20, 2023 at 06:11:59PM +0800, Kevin Xie wrote:
> On 2023/7/20 0:48, Bjorn Helgaas wrote:
> > On Wed, Jul 19, 2023 at 06:20:56PM +0800, Minda Chen wrote:
> >> Add StarFive JH7110 SoC PCIe controller platform
> >> driver codes.

> >> + * The BAR0/1 of bridge should be hidden during enumeration to
> >> + * avoid the sizing and resource allocation by PCIe core.
> >> + */
> >> +static bool starfive_pcie_hide_rc_bar(struct pci_bus *bus, unsigned int devfn,
> >> +				       int offset)
> >> +{
> >> +	if (pci_is_root_bus(bus) && !devfn &&
> >> +	    (offset == PCI_BASE_ADDRESS_0 || offset == PCI_BASE_ADDRESS_1))
> >> +		return true;
> >> +
> >> +	return false;
> >> +}
> >> +
> >> +int starfive_pcie_config_write(struct pci_bus *bus, unsigned int devfn,
> >> +			       int where, int size, u32 value)
> >> +{
> >> +	if (starfive_pcie_hide_rc_bar(bus, devfn, where))
> >> +		return PCIBIOS_BAD_REGISTER_NUMBER;
> >
> > I think you are trying to present BARs 0 & 1 as unimplemented. Such BARs
> > are hardwired to zero, so you should make them behave that way (both
> > read and write). Many callers of config accessors don't check the
> > return value, so I don't think it's reliable to just return
> > PCIBIOS_BAD_REGISTER_NUMBER.
>
> This is a hardware defect: we did not hardwire those BARs to
> zero, and they are still configurable by software. We have to add
> this filter function as a workaround.

Yes. My point is that this only affects the write path; reads of those
BARs probably do not return 0 as they should. This means lspci will
show the wrong thing, and the PCI core will try to size BARs it
doesn't need to. I haven't looked at the BAR sizing code; it might
even come up with a bogus size and address when it *should* just
conclude the BAR doesn't exist at all.
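
Something along these lines is what I have in mind (an untested sketch;
pci_generic_config_read/write are just stand-ins for whatever accessors
your driver actually wraps, and hide_rc_bar() is the helper from your
patch):

  static int starfive_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
                                       int where, int size, u32 *value)
  {
          int ret = pci_generic_config_read(bus, devfn, where, size, value);

          /* Make the hidden BARs read back as zero (unimplemented) */
          if (ret == PCIBIOS_SUCCESSFUL &&
              starfive_pcie_hide_rc_bar(bus, devfn, where))
                  *value = 0;

          return ret;
  }

  static int starfive_pcie_config_write(struct pci_bus *bus, unsigned int devfn,
                                        int where, int size, u32 value)
  {
          /* Drop the write so the BARs stay "hardwired" to zero */
          if (starfive_pcie_hide_rc_bar(bus, devfn, where))
                  return PCIBIOS_SUCCESSFUL;

          return pci_generic_config_write(bus, devfn, where, size, value);
  }

That way lspci and the BAR sizing code see a BAR that reads as zero,
which is indistinguishable from one that doesn't exist.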

> >> +	/* Ensure that PERST has been asserted for at least 100 ms */
> >> +	msleep(300);
> >> +	gpiod_set_value_cansleep(pcie->reset_gpio, 0);
> >
> > At least 100 ms, but you sleep *300* ms? This is probably related to
> > https://lore.kernel.org/r/20230718155515.GA483233@bhelgaas
> >
> > Please include a comment with the source of the delay value. I assume
> > it's T_PVPERL and T_PERST-CLK from the PCIe CEM spec. This way we can
> > someday share those #defines across drivers.
>
> Yes, the delay value here is T_PVPERL from PCIe CEM spec r2.0 (Table
> 2-4). Initially we set a 100ms delay according to section 2.2
> of the spec: "After there has been time (TPVPERL) for the power and
> clock to become stable, PERST# is deasserted high and the PCI
> Express functions can start up."
>
> However, in compatibility testing with several NVMe SSDs, we
> found that the Lenovo Thinklife ST8000 NVMe cannot get ready within
> 100ms and actually needs almost 200ms. Thus, we increased the
> T_PVPERL value to 300ms for better device compatibility.
>
> We will use a macro to define T_PVPERL and add a comment noting its
> source. If the 300ms compatibility delay is not reasonable, we can
> revert it to 100ms.

Thanks for this valuable information! This NVMe issue potentially
affects many similar drivers, and we may need a more generic fix so
this device works well with all of them.

T_PVPERL is defined to start when power is stable. Do you have a way
to accurately determine that point? I'm guessing this:

gpiod_set_value_cansleep(pcie->power_gpio, 1)

turns the power on? But of course that doesn't mean it is instantly
stable. Maybe your testing is telling you that your driver should
have a hardware-specific 200ms delay to wait for power to become
stable, followed by the standard 100ms for T_PVPERL?
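
In code, the split I'm imagining is roughly this (only a sketch; the
macro names are invented here, and whether 200ms is really the power
ramp time rather than device readiness is exactly the question above):

  /* PCIe CEM r2.0 sec 2.2, Table 2-4: power/clock stable to PERST# high */
  #define PCIE_T_PVPERL_MS		100

  /* Assumed board-specific allowance for slot power to become stable */
  #define STARFIVE_PCIE_PWR_STABLE_MS	200

  /* Assert PERST#, then turn on slot power and let it stabilize */
  gpiod_set_value_cansleep(pcie->reset_gpio, 1);
  gpiod_set_value_cansleep(pcie->power_gpio, 1);
  msleep(STARFIVE_PCIE_PWR_STABLE_MS);

  /* Power is stable: keep PERST# asserted for T_PVPERL, then release */
  msleep(PCIE_T_PVPERL_MS);
  gpiod_set_value_cansleep(pcie->reset_gpio, 0);

Then PCIE_T_PVPERL_MS is the piece that could eventually become a
shared #define used by all the drivers that need this sequence.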

Bjorn