Re: [PATCH Part1 RFC v3 11/22] x86/sev: Add helper for validating pages in early enc attribute changes

From: Dr. David Alan Gilbert
Date: Wed Jun 16 2021 - 09:07:08 EST


* Brijesh Singh (brijesh.singh@xxxxxxx) wrote:
>
> On 6/16/21 7:03 AM, Borislav Petkov wrote:
> > On Wed, Jun 16, 2021 at 06:00:09AM -0500, Brijesh Singh wrote:
> >> I am trying to be consistent with previous VMGEXIT implementations. If
> >> the command itself failed then use the command specific error code to
> >> tell hypervisor why we terminated but if the hypervisor violated the
> >> GHCB specification then use the "general request termination".
> > I feel like we're running in circles here: I ask about debuggability
> > and telling the user what exactly failed and you're giving me some
> > explanation about what the error codes mean. I can see what they mean.
> >
> > So let me try again:
> >
> > Imagine you're a guest owner and you haven't written the SNP code and
> > you don't know how it works.
> >
> > You start a guest in the public cloud and it fails because the
> > hypervisor violates the GHCB protocol and all that guest prints before
> > it dies is
> >
> > "general request termination"
>
>
> The GHCB specification does not define a unique error code for every
> possible condition. Now that we have reserved reason set 1 for the
> Linux-specific error code, we could add a new error code to cover the
> cases for the protocol violation. I was highlighting that we should not
> overload the meaning of GHCB_TERM_PSC. In my mind, the GHCB_TERM_PSC
> error code is used when the guest sees that the hypervisor failed to
> change the state . The failure maybe because the guest provided a bogus
> GPA or invalid operation code, or RMPUPDATE failure or HV does not
> support SNP feature etc etc. But in this case, the failure was due to
> the protocol error, and IMO we should not use the GHCB_TERM_PSC.
> Additionally, we should also update CPUID and other VMGEXITs to use the
> new error code instead of "general request termination" so that its
> consistent.
>
>
> If you still think that GHCB_TERM_PSC is valid here, then I am okay with it.

I'd kind of agree with Borislav, the more hints we can have as to the
actual failure reason the better - so if you've got multiple cases
where the guest thinks the hypervisor has screwed up, find a way to give
an error code to tell us which one.

Dave

> -Brijesh
>
>
>
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK