Re: [PATCH 2/2] arm64: errata: Add Cortex-A520 speculative unprivileged load workaround

From: Marc Zyngier
Date: Wed Sep 20 2023 - 13:17:32 EST


On Wed, 20 Sep 2023 17:47:35 +0100,
Rob Herring <robh@xxxxxxxxxx> wrote:
>
> On Tue, Sep 19, 2023 at 7:50 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Tue, 19 Sep 2023 13:29:07 +0100,
> > Rob Herring <robh@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Sep 18, 2023 at 5:18 AM Marc Zyngier <maz@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On 2023-09-18 11:01, Will Deacon wrote:
> > > > > On Tue, Sep 12, 2023 at 07:11:15AM -0500, Rob Herring wrote:
> > > > >> Implement the workaround for ARM Cortex-A520 erratum 2966298. On an
> > > > >> affected Cortex-A520 core, a speculatively executed unprivileged load
> > > > >> might leak data from a privileged level via a cache side channel.
> > > > >>
> > > > >> The workaround is to execute a TLBI before returning to EL0. A
> > > > >> non-shareable TLBI to any address is sufficient.
> > > > >
> > > > > Can you elaborate at all on how this works, please? A TLBI addressing a
> > > > > cache side channel feels weird (or is "cache" referring to some TLB
> > > > > structures rather than e.g. the data cache here?).
> > > > >
> > > > > Assuming there's some vulnerable window between the speculative
> > > > > unprivileged load and the completion of the TLBI, what prevents another
> > > > > CPU from observing the side-channel during that time? Also, does the
> > > > > TLBI need to be using the same ASID as the unprivileged load? If so,
> > > > > > then a context-switch could widen the vulnerable window quite significantly.
> > > >
> > > > Another 'interesting' case is the KVM world switch. If EL0 is
> > > > affected, what about EL1? Can such a data leak exist cross-EL1,
> > > > or from EL2 to EL1? Asking for a friend...
> > >
> > > I'm checking for a definitive answer, but page table isolation also
> > > avoids the issue. Wouldn't these scenarios all be similar to page
> > > table isolation in that the EL2 or prior EL1 context is unmapped?
> >
> > No, EL2 is always mapped, and we don't have anything like KPTI there.
> >
> > Maybe the saving grace is that EL2 and EL2&0 are different translation
> > regimes from EL1&0, but there's nothing in the commit message that
> > indicates it. As for EL1-to-EL1 leaks, it again completely depends on
> > how the TLBs are tagged.
>
> Different translation regimes are not affected. It must be the same
> regime and same translation.

It would be good to capture this, then.

>
> > You'd hope that having different VMIDs would save the bacon, but if
> > you can leak EL1 translations into EL0, it means that the associated
> > permission and/or tags do not contain all the required information...
>
> The VMID is part of the equation. See here[1].

I have a pretty good idea of how TLBs are *supposed* to behave. The
fact that you need some sort of invalidation on ERET to EL0 is the
proof that this CPU doesn't follow these rules to the letter...

M.

--
Without deviation from the norm, progress is not possible.