Re: [PATCH 2/8] bus: fsl-mc: handle DMA config deferral in ACPI case

From: Laurentiu Tudor
Date: Wed Nov 17 2021 - 08:03:18 EST


Hi Daniel,

Sorry for the late reply, please see some comments inline.

On 11/11/2021 7:23 PM, Daniel Thompson wrote:
> Hi Laurentiu
>
> On Thu, Jul 15, 2021 at 05:07:12PM +0300, laurentiu.tudor@xxxxxxx wrote:
>> From: Laurentiu Tudor <laurentiu.tudor@xxxxxxx>
>>
>> ACPI DMA configure API may return a defer status code, so handle it.
>> On top of this, move the MC firmware resume after the DMA setup
>> is completed to avoid crashing due to DMA setup not being done yet or
>> being deferred.
>>
>> Signed-off-by: Laurentiu Tudor <laurentiu.tudor@xxxxxxx>
>
> I saw regressions on my Honeycomb LX2 (NXP LX2060A) when I switched to
> v5.15. It seems like it results in so many sMMU errors that the system
> cannot function correctly (it's only about a 75% chance the system will
> boot to GUI and even if it does boot successfully the system will hang
> up soon after).
>
> Bisect took me up a couple of blind alleys (mostly due to unrelated boot
> problems in v5.14-rc2) but eventually led me to this patch as the cause.
> Applying/unapplying this patch to a v5.14-rc3 tree will provoke/fix the
> problem and reverting it against v5.15 also resolves the problem.

That's pretty strange. Was the DPAA2-based networking working with this
patch reverted?

> Is there some specific firmware version required for this patch to work
> correctly?

It's a bit of a long story. As Jon already mentioned, we're waiting for
the maintainers to agree on the IORT RMR support, which we depend on to
declare reserved memory regions for the MC firmware in UEFI.
For now, the recommended workaround is to boot with the
"arm-smmu.disable_bypass=0" kernel argument.

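For anyone following the thread, here is a rough userspace sketch of the
ordering the patch is aiming for: configure DMA first, bail out on a
deferral, and only then clear the MC stop bits. It is an illustration,
not the driver code. struct mock_mc, mock_dma_configure() and
mock_probe() are made-up names, the register is a plain variable, and the
bit positions and the EPROBE_DEFER value are assumptions made so that the
snippet builds outside the kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* EPROBE_DEFER is kernel-internal; defined here only so the mock builds. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Bit positions assumed for illustration only. */
#define GCR1_P1_STOP (1u << 31)
#define GCR1_P2_STOP (1u << 30)

/* Hypothetical stand-in for the MC: a fake GCR1 plus a canned DMA result. */
struct mock_mc {
	uint32_t gcr1;          /* stands in for the FSL_MC_GCR1 register */
	int dma_config_result;  /* value the mocked DMA configure call returns */
};

/* Hypothetical stand-in for the ACPI DMA configure call. */
static int mock_dma_configure(const struct mock_mc *mc)
{
	return mc->dma_config_result;
}

/* Mirrors the ordering only: DMA setup first, MC resume afterwards. */
static int mock_probe(struct mock_mc *mc)
{
	int error = mock_dma_configure(mc);

	/* Deferred: leave the MC paused and let the probe be retried. */
	if (error == -EPROBE_DEFER)
		return error;

	/* Any other failure is only warned about, as in the quoted diff. */
	if (error)
		fprintf(stderr, "failed to configure dma: %d.\n", error);

	/* DMA setup is done (or not deferred), so resume the MC. */
	mc->gcr1 &= ~(GCR1_P1_STOP | GCR1_P2_STOP);
	return 0;
}
```

With the mock set to return -EPROBE_DEFER, mock_probe() returns early and
the stop bits stay set; with any other result the stop bits are cleared.
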
---
Best Regards, Laurentiu

>
>
> PS: Below is the revert I applied to the v5.15 kernel (after
> a fairly simple merge conflict fix)
>
> From 4162b64e4f361a6a773e065b592dbc5493202524 Mon Sep 17 00:00:00 2001
> From: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
> Date: Thu, 11 Nov 2021 16:50:25 +0000
> Subject: [PATCH] Revert "bus: fsl-mc: handle DMA config deferral in ACPI case"
>
> This reverts commit d31e7fe20a2251f87adc6ecefbdaf25e6961ce74 because
> it was causing regressions on my Honeycomb LX2 (NXP LX2060A).
>
> All kernels where the problem manifests (as either a boot hang or a desktop
> hang) issue the following messages in vast number:
>
> ~~~
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm-smmu arm-smmu.0.auto: Unhandled context fault: fsr=0x402, iova=0x23e0000100, fsynr=0x20040, cbfrsynra=0x4000, cb=0
> arm_smmu_context_fault: 1697259 callbacks suppressed
> ~~~
>
> Signed-off-by: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
> ---
> drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++--------------
> 1 file changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index 8fd4a356a86e..429bacc7de20 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -1130,6 +1130,18 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
> }
>
> if (mc->fsl_mc_regs) {
> + /*
> + * Some bootloaders pause the MC firmware before booting the
> + * kernel so that MC will not cause faults as soon as the
> + * SMMU probes due to the fact that there's no configuration
> + * in place for MC.
> + * At this point MC should have all its SMMU setup done so make
> + * sure it is resumed.
> + */
> + writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
> + (~(GCR1_P1_STOP | GCR1_P2_STOP)),
> + mc->fsl_mc_regs + FSL_MC_GCR1);
> +
> if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(&pdev->dev)) {
> mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
> /*
> @@ -1143,25 +1155,11 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
> error = acpi_dma_configure_id(&pdev->dev,
> DEV_DMA_COHERENT,
> &mc_stream_id);
> - if (error == -EPROBE_DEFER)
> - return error;
> if (error)
> dev_warn(&pdev->dev,
> "failed to configure dma: %d.\n",
> error);
> }
> -
> - /*
> - * Some bootloaders pause the MC firmware before booting the
> - * kernel so that MC will not cause faults as soon as the
> - * SMMU probes due to the fact that there's no configuration
> - * in place for MC.
> - * At this point MC should have all its SMMU setup done so make
> - * sure it is resumed.
> - */
> - writel(readl(mc->fsl_mc_regs + FSL_MC_GCR1) &
> - (~(GCR1_P1_STOP | GCR1_P2_STOP)),
> - mc->fsl_mc_regs + FSL_MC_GCR1);
> }
>
> /*
>