RE: [PATCH v4] arm64: Add workaround for Fujitsu A64FX erratum 010001

From: Zhang, Lei
Date: Wed Feb 27 2019 - 01:20:15 EST


Hi James,

> -----Original Message-----
> From: linux-arm-kernel <linux-arm-kernel-bounces@xxxxxxxxxxxxxxxxxxx> On
> Behalf Of James Morse
> Sent: Tuesday, February 26, 2019 2:29 AM
> To: Zhang, Lei/張 雷 <zhang.lei@xxxxxxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>; 'Catalin Marinas'
> <catalin.marinas@xxxxxxx>; 'Will Deacon' <will.deacon@xxxxxxx>;
> 'linux-kernel@xxxxxxxxxxxxxxx' <linux-kernel@xxxxxxxxxxxxxxx>;
> 'linux-arm-kernel@xxxxxxxxxxxxxxxxxxx' <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>
> Subject: Re: [PATCH v4] arm64: Add workaround for Fujitsu A64FX erratum
> 010001
>
> Hi Zhang,
>
> On 23/02/2019 13:06, Zhang, Lei wrote:
> > Zhang, Lei wrote:
> >> I think you mean it may be a problem to modify the KPTI trampoline
> >> because some patches about KPTI will be merged to mainline in the
> >> near future.
> >> I understood that.
> >> I should discuss with my colleagues whether we can set NFDx=0 all of
> >> time on A64FX.
> >
> > The result of our investigation also supports your suggestion.
> > We surely agree with you that your proposed method (never set NFDx=1
> > on A64FX) is the best to resolve this erratum.
> >
> > For this erratum, James's patch should be merged to mainline instead
> > of my previous patches (v1 to v4).
> > Since KPTI fully covers the effect of NFD1 for A64FX, KPTI is
> > recommended to be used in conjunction with James’s patch.
>
> >> And thanks for your patch.
> >> If we can set NFDx=0 all of time, I will review, test and report the result.
> >
> > I have already tested James's patch on A64FX, and the result is no
> > problem at all.
> >
> > Tested-by: zhang.lei <zhang.lei@xxxxxxxxxxxxxx>
>
> Thanks, I'll post it properly with this tag.

I saw the v5 patch you posted. Thanks a lot.

>
>
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >> index a4168d366127..b0b7f1c4e816 100644
> >> --- a/arch/arm64/Kconfig
> >> +++ b/arch/arm64/Kconfig
> >> @@ -643,6 +643,25 @@ config QCOM_FALKOR_ERRATUM_E1041
> >>
> >> If unsure, say Y.
> >>
> >> +config FUJITSU_ERRATUM_010001
> >> + bool "Fujitsu-A64FX erratum E#010001: Undefined fault may occur wrongly"
> >> + default y
> >> + help
> >> + This option adds workaround for Fujitsu-A64FX erratum E#010001.
> >> + On some variants of the Fujitsu-A64FX cores version (1.0, 1.1), memory
> >> + accesses may cause undefined fault (Data abort, DFSC=0b111111).
> >> + This fault occurs under a specific hardware condition when a
> >> + load/store instruction performs an address translation using:
> >> + case-1 TTBR0_EL1 with TCR_EL1.NFD0 == 1.
> >> + case-2 TTBR0_EL2 with TCR_EL2.NFD0 == 1.
> >> + case-3 TTBR1_EL1 with TCR_EL1.NFD1 == 1.
> >> + case-4 TTBR1_EL2 with TCR_EL2.NFD1 == 1.
> >> +
> >> + The workaround is to ensure these bits are clear in TCR_ELx.
> >> + The workaround only affect the Fujitsu-A64FX.
> >
> > I think it is better to add a notice here as follows:
> >
> > Recommend to enable KPTI (UNMAP_KERNEL_AT_EL0 = y).
>
> That unmap option is on by default, you can't turn it off without
> CONFIG_EXPERT. While I agree, I don't think we need to spell this out.

I agree with you that there is no need to mention it here.
Thank you for your suggestion.
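
For readers of the archive, the workaround described in the help text
above ("ensure these bits are clear in TCR_ELx") can be sketched roughly
as follows. This is only an illustration, not the code from James's
patch: the helper name tcr_clear_nfd() is hypothetical, and the bit
positions (NFD0 = bit 53, NFD1 = bit 54) are taken from the architected
TCR_EL1 layout, which I assume also applies to TCR_EL2 when
HCR_EL2.E2H == 1.

```c
#include <stdint.h>

/*
 * Assumed bit positions of the NFD0/NFD1 fields in TCR_EL1
 * (architected layout; not copied from the patch under discussion).
 */
#define TCR_NFD0 (UINT64_C(1) << 53)
#define TCR_NFD1 (UINT64_C(1) << 54)

/*
 * Hypothetical helper: return the given TCR value with both NFD bits
 * cleared, leaving every other field untouched. On an affected A64FX
 * the result is what would be written back to TCR_ELx.
 */
static inline uint64_t tcr_clear_nfd(uint64_t tcr)
{
    return tcr & ~(TCR_NFD0 | TCR_NFD1);
}
```

A real implementation would apply this only when the CPU is identified
as an affected A64FX, and would read and write the register with
mrs/msr rather than operate on a plain integer.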

Best Regards,
Zhang Lei