Re: [PATCH v2 0/2] KVM: arm64: Support for Arm v8.8 memcpy instructions in KVM guests

From: Marc Zyngier
Date: Fri Sep 29 2023 - 05:29:27 EST


On Thu, 28 Sep 2023 17:55:39 +0100,
Kristina Martsenko <kristina.martsenko@xxxxxxx> wrote:
>
> On 27/09/2023 07:00, Oliver Upton wrote:
> > Hi Kristina,
>
> Hi Oliver,
>
> >
> > On Fri, Sep 22, 2023 at 12:25:06PM +0100, Kristina Martsenko wrote:
> >> Hi,
> >>
> >> This is v2 of the series to allow using the new Arm memory copy instructions
> >> in KVM guests. See v1 for more information [1].
> >
> >
> > Thanks for sending out the series. I've been thinking about what the
> > architecture says for MOPS, and I wonder if what's currently in the
> > Arm ARM is clear enough for EL1 software to be written robustly.
> >
> > While HCRX_EL2.MCE2 allows the hypervisor to intervene on MOPS
> > exceptions from EL1, there's no such control for EL0. So when vCPU
> > migration occurs EL1 could get an unexpected MOPS exception, even for a
> > process that was pinned to a single (virtual) CPU implementation.
> >
> > Additionally, the wording of I_NXHPS seems to suggest that EL2 handling
> > of MOPS exceptions is only expected in certain circumstances where EL1 is
> > incapable of handling an exception. Is the unwritten expectation then
> > that EL1 software should tolerate 'unexpected' MOPS exceptions from EL1
> > and EL0, even if EL1 did not migrate the PE context?
> >
> > Perhaps I'm being pedantic, but I'd really like for there to be some
> > documentation that suggests MOPS exceptions can happen due to context
> > migration done by a higher EL as that is the only option in the context
> > of virtualization.
>
> That's a good point. This shouldn't affect Linux guests as Linux is
> always able to handle a MOPS exception coming from EL0. But it would
> affect any non-Linux guest that pins all its EL0 tasks and doesn't
> implement a handler. It's not clear to me what the expectation for
> guests is, I'll ask the architects to clarify and get back to you.

My understanding is that MCE2 should always be set if the hypervisor
can migrate vcpus across implementations behind EL1's back, and that
in this context, EL1 never sees such an exception.

I guess the only case where we could let EL1 handle such an exception
is by setting MCE2 only on the first entry into the guest after a vcpu
migration (and clearing it after that). Is it worth the effort?
Absolutely not.
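
For illustration only, the "first entry after a migration" scheme could
be sketched as below. This is a toy model, not KVM code: struct
toy_vcpu, the toy_* helpers and the "migrated" flag are all
hypothetical, and the MCE2 bit position is an assumption that should be
checked against the Arm ARM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of HCRX_EL2.MCE2; verify against the Arm ARM. */
#define HCRX_EL2_MCE2	(UINT64_C(1) << 10)

/* Hypothetical per-vcpu state; this is not the real struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t hcrx_el2;	/* value to program before entering the guest */
	bool migrated;		/* set when the vcpu moved to another implementation */
};

/* Hypothetical hook: called when the vcpu is moved across implementations. */
void toy_vcpu_migrate(struct toy_vcpu *vcpu)
{
	assert(vcpu);
	vcpu->migrated = true;
}

/*
 * Compute the HCRX_EL2 value for the next guest entry: trap MOPS
 * exceptions to EL2 only on the first entry after a migration, and
 * let EL1 handle them on every later entry.
 */
uint64_t toy_vcpu_enter_guest(struct toy_vcpu *vcpu)
{
	assert(vcpu);
	if (vcpu->migrated) {
		vcpu->hcrx_el2 |= HCRX_EL2_MCE2;
		vcpu->migrated = false;
	} else {
		vcpu->hcrx_el2 &= ~HCRX_EL2_MCE2;
	}
	return vcpu->hcrx_el2;
}
```

The extra state and the per-entry toggling are the "effort" in
question; leaving MCE2 set unconditionally needs neither.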

M.

--
Without deviation from the norm, progress is not possible.