Re: [PATCH v4 1/5] dt-bindings: PCI: brcmstb: brcm,{enable-l1ss,completion-timeout-us} props

From: Jim Quinlan
Date: Wed May 03 2023 - 10:39:26 EST


On Sun, Apr 30, 2023 at 3:10 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Fri, Apr 28, 2023 at 06:34:55PM -0400, Jim Quinlan wrote:
> > This commit introduces two new properties:
>
> Doing two things makes this a candidate for splitting into two
> patches, as you've already done for the driver support. They seem
> incidentally related but not indivisible.
>
> > brcm,enable-l1ss (bool):
> >
> > The Broadcom STB/CM PCIe HW -- a core that is also used by RPi SOCs --
> > requires the driver probe() to deliberately place the HW one of three
> > CLKREQ# modes:
> >
> > (a) CLKREQ# driven by the RC unconditionally
> > (b) CLKREQ# driven by the EP for ASPM L0s, L1
> > (c) Bidirectional CLKREQ#, as used for L1 Substates (L1SS).
> >
> > The HW+driver can tell the difference between downstream devices that
> > need (a) and (b), but does not know when to configure (c). All devices
> > should work fine when the driver chooses (a) or (b), but (c) may be
> > desired to realize the extra power savings that L1SS offers. So we
> > introduce the boolean "brcm,enable-l1ss" property to inform the driver
> > that (c) is desired. Setting this property only makes sense when the
> > downstream device is L1SS-capable and the OS is configured to activate
> > this mode (e.g. policy==powersupersave).
>
> Is this related to the existing generic "supports-clkreq" property? I
> guess not, because supports-clkreq looks like a description of CLKREQ
> signal routing, while brcm,enable-l1ss looks like a description of
> what kind of downstream device is present?

It is related; I thought about using it, but it is not helpful for our
needs. Both cases (b) and (c) assume "supports-clkreq", and our HW
needs to know the difference between them. Further, we have a register
that tells us whether the endpoint device has requested CLKREQ#, so we
already have that information.

As an aside, I would think that the "supports-clkreq" property should be
in the port-driver or endpoint node; a sketch of what I mean follows.
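
Something like the following sketch is what I have in mind, with the
root port described as a child node and "supports-clkreq" placed there
(the "&pcie0" label and the reg values are only illustrative):

    &pcie0 {
            /* root port as a child node of the host bridge */
            pci@0,0 {
                    reg = <0x0 0x0 0x0 0x0 0x0>;
                    supports-clkreq;
            };
    };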

>
> What bad things would happen if the driver always configured (c)?
Well, our driver has traditionally only supported (b), and our existing
boards have been designed with this in mind. I would not want to switch
modes without the user/customer/engineer opting in to do so. Further,
the PCIe HW engineer told me that defaulting to (c) was a bad idea and
was "asking for trouble". Note that the property description in this
commit carries the warning about L1SS mode not meeting the 400ns TCRLon
maximum, and I suspect that many of our existing designs have bumped
into that.

But to answer your question, I haven't found a scenario that did not
work when set to mode (c). That doesn't mean such scenarios aren't out
there.

>
> Other platforms don't require this, and having to edit the DT based on
> what PCIe device is plugged in seems wrong. If brcmstb does need it,
> that suggests a hardware defect. If we need this to work around a
> defect, that's OK, but we should acknowledge the defect so we can stop
> using this for future hardware that doesn't need it.

All devices should work w/o the user having to change the DT. Only if they
desire L1SS must they add the "brcm,enable-l1ss" property.
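
For example, a board that wants mode (c) would add the property to its
PCIe controller node, along these lines (the "&pcie0" label is only
illustrative; use whatever label the board's dtsi defines):

    &pcie0 {
            /* opt in to bidirectional CLKREQ#, i.e. mode (c) for L1SS */
            brcm,enable-l1ss;
    };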

Now there is the case where Cyril found a regression, but recent
investigation indicates that this particular failure was due to the RPi
CM4 using a "beta" EEPROM version; after updating the EEPROM, it works
fine.

>
> Maybe the name should be more specific to CLKREQ#, since this doesn't
> actually *enable* L1SS; apparently it's just one of the pieces needed
> to enable L1SS?

The other pieces are the ASPM policy being set to POWERSUPERSAVE and an
L1SS-capable downstream device, which seem unrelated and are out of
scope for the driver.

The RPi Raspbian folks have been using "brcm,enable-l1ss" for a while
now, and I would prefer to keep that name for compatibility.

>
> > This property is already present in the Raspbian version of Linux, but the
> > upstream driver implementaion that follows adds more details and discerns
> > between (a) and (b).
>
> s/implementaion/implementation/
>
> > brcm,completion-timeout-us (u32):
> >
> > Our HW will cause a CPU abort on any PCI transaction completion abort
> > error. It makes sense then to increase the timeout value for this type
> > of error in hopes that the response is merely delayed. Further,
> > L1SS-capable devices may have a long L1SS exit time and may require a
> > custom timeout value: we've been asked by our customers to make this
> > configurable for just this reason.
>
> I asked before whether this should be made generic and not
> brcm-specific, since completion timeouts are generic PCIe things. I
> didn't see any discussion, but Rob reviewed this so I guess it's OK
> as-is.

I am going to drop it. Thanks for questioning its purpose, and I
apologize for the noise.

Regards,
Jim Quinlan
Broadcom STB
>
> Is there something unique about brcm that requires this? I think it's
> common for PCIe Completion Timeouts to cause CPU aborts.
>
> Surely other drivers need to configure the completion timeout, but
> pcie-rcar-host.c and pcie-rcar-ep.c are the only ones I could find.
> Maybe the brcmstb power-up values are just too small? Does the
> correct value need to be in DT, or could it just be built into the
> driver?
>
> This sounds like something dependent on the downstream device
> connected, which again sounds hard for users to deal with. How would
> they know what to use here?
>
> > Signed-off-by: Jim Quinlan <jim2101024@xxxxxxxxx>
> > Reviewed-by: Rob Herring <robh@xxxxxxxxxx>
> > ---
> > .../devicetree/bindings/pci/brcm,stb-pcie.yaml | 16 ++++++++++++++++
> > 1 file changed, 16 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
> > index 7e15aae7d69e..239cc95545bd 100644
> > --- a/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
> > +++ b/Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
> > @@ -64,6 +64,22 @@ properties:
> >
> > aspm-no-l0s: true
> >
> > + brcm,enable-l1ss:
> > + description: Indicates that PCIe L1SS power savings
> > + are desired, the downstream device is L1SS-capable, and the
> > + OS has been configured to enable this mode. For boards
> > + using a mini-card connector, this mode may not meet the
> > + TCRLon maximum time of 400ns, as specified in 3.2.5.2.5
> > + of the PCI Express Mini CEM 2.0 specification.
> > + type: boolean
> > +
> > + brcm,completion-timeout-us:
> > + description: Number of microseconds before PCI transaction
> > + completion timeout abort is signalled.
> > + minimum: 16
> > + default: 1000000
> > + maximum: 19884107
> > +
> > brcm,scb-sizes:
> > description: u64 giving the 64bit PCIe memory
> > viewport size of a memory controller. There may be up to
> > --
> > 2.17.1
> >
> >
