RE: [PATCH] PCI : check if type 0 devices have all BARs of size zero

From: Wasim Khan
Date: Tue Feb 16 2021 - 02:53:35 EST


Hi Bjorn,


> -----Original Message-----
> From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> Sent: Tuesday, February 16, 2021 2:43 AM
> To: Wasim Khan (OSS) <wasim.khan@xxxxxxxxxxx>
> Cc: bhelgaas@xxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Wasim Khan <wasim.khan@xxxxxxx>
> Subject: Re: [PATCH] PCI : check if type 0 devices have all BARs of size zero
>
> On Fri, Feb 12, 2021 at 11:08:56AM +0100, Wasim Khan wrote:
> > From: Wasim Khan <wasim.khan@xxxxxxx>
> >
> > Log a message if all BARs of type 0 devices are of size zero. This can
> > help detecting type 0 devices not reporting BAR size correctly.
>
> I could be missing something, but I don't think we can do this. I would think the
> simplest possible presilicon testing would find errors like this, and the first
> attempt to have a driver claim the device would fail if required BARs were
> missing, so I'm not sure what this would add.
>

Thank you for the review.
I observed this issue with an EP that is still under development. Due to a logic problem in the EP's firmware, the BAR sizes were reported as zero, and a crash was observed some time later in the PCIe code.
I agree with you that such issues should have been caught in pre-silicon testing, but I am not sure about the details of the pre-si testing or whether the issue was specifically observed with a real OS. Also, because the EP is at an early stage of development, a device driver for it is not available yet.
So I thought it would be a good idea to print an informational message, only for *type 0* devices, as a quick hint to check whether a zero BAR size is expected for the given EP. That way SW can help identify a HW problem.

> While the subject line says "type 0 devices," this code path is also used for type
> 1 devices (bridges), and it's quite common for bridges to have no BARs, which
> means they would all be hardwired to zero.
>

Yes, for type 1 devices it is common to have zero BAR size, so I added the log message for type 0 devices only, which are in general expected to have valid BARs.
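
For illustration only, the idea can be sketched in standalone C outside the kernel. This is a simplified model, not the code in drivers/pci/probe.c: it assumes 32-bit memory BARs only (no I/O or 64-bit BARs), and the names mem_bar32_size() and type0_all_bars_zero() are hypothetical. The input is the value read back from each BAR after writing all ones to it, which is the standard PCI sizing probe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Header type values, with the multi-function bit already masked off. */
#define HDR_TYPE_NORMAL 0x00u /* type 0: endpoint */
#define HDR_TYPE_BRIDGE 0x01u /* type 1: PCI-to-PCI bridge */

/*
 * Size of a 32-bit memory BAR, given the value read back after writing
 * 0xffffffff to it. The low 4 bits are type/prefetch flags, not address
 * bits. No writable address bits means the BAR is not implemented.
 */
static uint32_t mem_bar32_size(uint32_t readback)
{
    uint32_t mask = readback & 0xfffffff0u;

    if (mask == 0)
        return 0;
    return ~mask + 1u;
}

/* Hypothetical helper: true when a type 0 function reports size 0 for every BAR. */
static bool type0_all_bars_zero(uint8_t hdr_type, const uint32_t *readbacks,
                                size_t n)
{
    if (hdr_type != HDR_TYPE_NORMAL)
        return false; /* bridges commonly implement no BARs */

    for (size_t i = 0; i < n; i++) {
        if (mem_bar32_size(readbacks[i]) != 0)
            return false;
    }
    return true;
}
```

A true result from type0_all_bars_zero() is only a hint, since a type 0 device may legitimately implement no BARs.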


> It is also legal for even type 0 devices to implement no BARs. They may be
> operated entirely via config space or via device-specific BARs that are unknown
> to the PCI core.

OK, I did not know this. Thank you for sharing.

>
> > Signed-off-by: Wasim Khan <wasim.khan@xxxxxxx>
> > ---
> > drivers/pci/probe.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 953f15abc850..6438d6d56777 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -321,6 +321,7 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
> > static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
> > {
> > unsigned int pos, reg;
> > + bool found = false;
> >
> > if (dev->non_compliant_bars)
> > return;
> > @@ -333,8 +334,12 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
> > struct resource *res = &dev->resource[pos];
> > reg = PCI_BASE_ADDRESS_0 + (pos << 2);
> > pos += __pci_read_base(dev, pci_bar_unknown, res, reg);
> > + found |= res->flags ? 1 : 0;
> > }
> >
> > + if (!dev->hdr_type && !found)
> > + pci_info(dev, "BAR size is 0 for BAR[0..%d]\n", howmany - 1);
> > +
> > if (rom) {
> > struct resource *res = &dev->resource[PCI_ROM_RESOURCE];
> > dev->rom_base_reg = rom;
> > --
> > 2.25.1
> >