Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

From: Brijesh Singh
Date: Mon Nov 22 2021 - 14:06:31 EST



On 11/22/21 12:30 PM, Dave Hansen wrote:
> On 11/22/21 7:23 AM, Brijesh Singh wrote:
>> Thank you for starting the thread; based on the discussion, I am keeping
>> the current implementation as-is and *not* going with the auto
>> conversion from private to shared. To summarize what we are doing in the
>> current SNP series:
>>
>> - If userspace accesses guest private memory, it gets SIGBUS.
>> - If kernel accesses[*] guest private memory, it does panic.
> There's a subtlety here, though. There are really three *different*
> kinds of kernel accesses that matter:
> 1. Kernel bugs. Kernel goes off and touches some guest private memory
> when it didn't mean to. Say, it runs off the end of a slab page and
> runs into a guest page. panic() is expected here.

In current implementation, a write to guest private will trigger a
kernel panic().


> 2. Kernel accesses guest private memory via a userspace mapping, in a
> place where it is known to be accessing userspace and is prepared to
> fault. copy_to_user() is the most straightforward example. Kernel
> must *not* panic(). Returning an error to the syscall is a good
> way to handle these (if in a syscall).

In the current implementation, the copy_to_user() on the guest private
will fails with -EFAULT.


> 3. Kernel accesses guest private memory via a kernel mapping. This one
> is tricky. These probably *do* result in a panic() today, but
> ideally shouldn't.

KVM has defined some helper functions to maps and unmap the guest pages.
Those helper functions do the GPA to PFN lookup before calling the
kmap(). Those helpers are enhanced such that it check the RMP table
before the kmap() and acquire a lock to prevent a page state change
until the kunmap() is called. So, in the current implementation, we
should *not* see a panic() unless there is a KVM driver bug that didn't
use the helper functions or a bug in the helper function itself.


> Could you explicitly clarify what the current behavior is?