Re: [PATCH v7 3/3] PCI: brcmstb: Set higher value for internal bus timeout

From: Bjorn Helgaas
Date: Thu Nov 09 2023 - 16:27:28 EST


On Thu, Nov 09, 2023 at 02:13:54PM -0500, Jim Quinlan wrote:
> During long periods of the PCIe RC HW being in an L1SS sleep state, there
> may be a timeout on an internal bus access, even though there may not be
> any PCIe access involved. Such a timeout will cause a subsequent CPU
> abort.
>
> Signed-off-by: Jim Quinlan <james.quinlan@xxxxxxxxxxxx>
> ---
> drivers/pci/controller/pcie-brcmstb.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index f45c5d0168d3..f82a3e1a843a 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -1031,6 +1031,21 @@ static int brcm_pcie_setup(struct brcm_pcie *pcie)
> 	return 0;
> }
>
> +/*
> + * This extends the timeout period for an access to an internal bus. This
> + * access timeout may occur during L1SS sleep periods, even without the
> + * presence of a PCIe access.
> + */
> +static void brcm_extend_rbus_timeout(struct brcm_pcie *pcie)
> +{
> +	/* TIMEOUT register is two registers before RGR1_SW_INIT_1 */
> +	const unsigned int REG_OFFSET = PCIE_RGR1_SW_INIT_1(pcie) - 8;
> +	u32 timeout_us = 4000000; /* 4 seconds, our setting for L1SS */
> +
> +	/* Each unit in timeout register is 1/216,000,000 seconds */
> +	writel(216 * timeout_us, pcie->base + REG_OFFSET);
> +}
> +
> static void brcm_config_clkreq(struct brcm_pcie *pcie)
> {
> 	static const char err_msg[] = "invalid 'brcm,clkreq-mode' DT string\n";
> @@ -1067,6 +1082,7 @@ static void brcm_config_clkreq(struct brcm_pcie *pcie)
> * atypical and should happen only with older devices.
> */
> clkreq_cntl |= PCIE_MISC_HARD_PCIE_HARD_DEBUG_L1SS_ENABLE_MASK;
> + brcm_extend_rbus_timeout(pcie);

It looks like this should be squashed into the previous patch, which
added brcm_config_clkreq(). Otherwise there's a bisection hole where
somebody testing at the previous patch could see the CPU abort.
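
As an aside on the timeout arithmetic in brcm_extend_rbus_timeout(): the
quoted comment says each register unit is 1/216,000,000 seconds, so one
microsecond is 216 units and the 4-second setting comes to 864,000,000
units, which fits in a u32. The standalone sketch below only checks that
conversion. rbus_timeout_ticks() and RBUS_TICKS_PER_US are hypothetical
names, not part of the driver, and the sketch assumes the whole 32-bit
register value is usable as the timeout count, which the patch does not
state.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * From the patch comment: one register unit is 1/216,000,000 s,
 * i.e. 216 units per microsecond.
 */
#define RBUS_TICKS_PER_US 216u

/*
 * Hypothetical helper (not in the driver): convert a timeout in
 * microseconds to register units. Returns false when the result would
 * not fit in 32 bits, assuming the full register width holds the count.
 */
static bool rbus_timeout_ticks(uint32_t timeout_us, uint32_t *ticks)
{
	/* Widen before multiplying so the overflow check is reliable. */
	uint64_t wide = (uint64_t)RBUS_TICKS_PER_US * timeout_us;

	if (wide > UINT32_MAX)
		return false;

	*ticks = (uint32_t)wide;
	return true;
}
```

Under that assumption, the largest timeout that can be expressed is a
little under 20 seconds, so the 4-second value has ample headroom.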

> } else {
> /*
> * "safe" -- No power savings; refclk is driven by RC