Re: [RFC PATCH] KVM: arm64: Workaround for Ampere AC03_CPU_36 (exception taken to an incorrect EL)

From: Oliver Upton
Date: Fri Jan 05 2024 - 18:53:35 EST


Hi Ilkka,

On Fri, Jan 05, 2024 at 01:32:51PM -0800, Ilkka Koskinen wrote:
> Due to erratum AC03_CPU_36 on AmpereOne, if an Asynchronous Exception
> (interrupts or SErrors) occurs to EL2, while EL2 software is modifying
> system register bits that control EL2 exception behavior, the processor
> may take an exception to an incorrect Exception Level.
>
> The affected system registers are HCR_EL2 and SCTLR_EL2, which contain
> control bits for routing and enabling of EL2 exceptions.
>
> The issue is triggered when HGE.TGE bit is cleared while having
> AMO/IMO/FMO bits cleared too. To avoid the exception getting taken
> at a wrong Exception Level, we set AMO/IMO/FMO.

We toggle HCR_EL2 for other things besides TLB invalidations, and the
changelog does not describe why they're apparently unaffected.

> Suggested-by: D Scott Phillips <scott@xxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Ilkka Koskinen <ilkka@xxxxxxxxxxxxxxxxxxxxxx>

This isn't an acceptable way to go about errata mitigations. Besides
extremely unusual circumstances, the pattern is to use a cpucap &&
alternatives to only enable the workaround on affected designs. We then
document the errata in the expected places (Kconfig and kernel
documentation) such that the folks saddled with maintaining this stuff
know how to handle it years down the line.

> ---
> arch/arm64/kvm/hyp/vhe/tlb.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
> index b32e2940df7d..c72fdd2e4549 100644
> --- a/arch/arm64/kvm/hyp/vhe/tlb.c
> +++ b/arch/arm64/kvm/hyp/vhe/tlb.c
> @@ -61,9 +61,15 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> * has an ISB in order to deal with this.
> */
> __load_stage2(mmu, mmu->arch);
> - val = read_sysreg(hcr_el2);
> - val &= ~HCR_TGE;
> - write_sysreg(val, hcr_el2);
> +
> + /*
> + * With {E2H,TGE} == {1,0}, IMO == 1 is required so that IRQs are not
> + * all masked.

Huh? HCR_EL2.IMO affects the *routing* of IRQs at exception levels
*lower than* EL2.

> This also works around AmpereOne erratum AC03_CPU_36
> + * which can incorrectly route an IRQ to EL1 when HCR_EL2.{E2H,TGE} is
> + * written from {1,1} to {1,0} with interrupts unmasked.
> + */
> + sysreg_clear_set(hcr_el2, HCR_TGE, HCR_AMO | HCR_IMO | HCR_FMO);
> +

Rather than further modifying the exception controls, why not mask DAIF
like we do on guest entry/exit? AFAICT that is the only reason KVM_RUN
is unaffected by this erratum.

This is what I have been using on a few AmpereOne systems: