Re: [PATCH v5 02/22] KVM: arm64: Add SDEI virtualization infrastructure

From: Gavin Shan
Date: Wed Mar 23 2022 - 08:40:52 EST


Hi Oliver,

On 3/23/22 6:43 AM, Oliver Upton wrote:
On Tue, Mar 22, 2022 at 04:06:50PM +0800, Gavin Shan wrote:
Software Delegated Exception Interface (SDEI) provides a mechanism for
registering and servicing system events. Those system events are high
priority events, which must be serviced immediately. It will be used by
Asynchronous Page Fault (APF) to deliver notifications from KVM to the
guest. SDEI is defined by the ARM DEN0054C specification.

I'm guessing that you're using linked lists for stitching all of this
together because the specification provides for 24 bits of event
encoding. However, it seems that there will be a finite number of events
in KVM. So the APF stuff and a software signaled event.

Given that the number of events in KVM is rather small, would it make
more sense to do away with the overhead of linked lists and having the
state just represented in a static or allocated array? I think you can
cram all of the VM scoped event state into a single structure and hang
the implementation off of that.


Yes, the number of events in KVM is small. Including the events for Async
PF and the software signaled event, 8 events would be enough for now.
However, there are several types of objects for the various events. Some
of them can be put into a static array, while the rest might need a static
array of pointers to avoid the linked list:

    struct kvm_sdei_exposed_event/state on struct kvm_arch
        size: 24 bytes
        static array, 8 entries
    struct kvm_sdei_registered_event/state on struct kvm_arch
        size: 9KB
        static array of pointers, 8 entries; the objects still need to be
        allocated dynamically
    struct kvm_sdei_vcpu_event/state on struct kvm_vcpu_arch
        size: 16 bytes
        static array, 8 entries
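
As a rough illustration of that layout (not code from this series), the
sketch below uses fixed 8-slot arrays for the small objects and an array of
pointers for the large registered-event objects. All structure, field, and
function names here are hypothetical and heavily trimmed; the real
structures carry more state, and the kernel would use its own allocators
rather than calloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 8 slots, per the estimate given in this thread. */
#define SDEI_MAX_EVENTS 8

/* Hypothetical, trimmed-down stand-ins for the real structures. */
struct exposed_event {
	uint32_t num;     /* SDEI event number */
	uint8_t  valid;   /* slot is in use (event number 0 is valid) */
};

struct registered_event {
	uint32_t num;     /* SDEI event number */
	uint64_t entry;   /* guest handler entry point */
	uint64_t arg;     /* argument passed to the handler */
};

struct vcpu_event {
	uint32_t num;     /* SDEI event number */
	uint32_t pending; /* deliveries not yet handled */
};

/* VM scope: small objects inline, large objects behind pointers. */
struct vm_sdei {
	struct exposed_event     exposed[SDEI_MAX_EVENTS];
	struct registered_event *registered[SDEI_MAX_EVENTS];
};

/* vCPU scope: small objects inline. */
struct vcpu_sdei {
	struct vcpu_event events[SDEI_MAX_EVENTS];
};

/* Return the slot of an exposed event, or -1 if it is not exposed. */
static int sdei_find_exposed(const struct vm_sdei *s, uint32_t num)
{
	assert(s);
	for (int i = 0; i < SDEI_MAX_EVENTS; i++)
		if (s->exposed[i].valid && s->exposed[i].num == num)
			return i;
	return -1;
}

/* Expose an event in the first free slot; -1 if duplicate or full. */
static int sdei_expose(struct vm_sdei *s, uint32_t num)
{
	if (sdei_find_exposed(s, num) >= 0)
		return -1;
	for (int i = 0; i < SDEI_MAX_EVENTS; i++) {
		if (!s->exposed[i].valid) {
			s->exposed[i].num = num;
			s->exposed[i].valid = 1;
			return i;
		}
	}
	return -1;
}

/* Register an exposed event; the large object is allocated on demand. */
static int sdei_register(struct vm_sdei *s, uint32_t num,
			 uint64_t entry, uint64_t arg)
{
	int i = sdei_find_exposed(s, num);

	if (i < 0 || s->registered[i])
		return -1;
	s->registered[i] = calloc(1, sizeof(*s->registered[i]));
	if (!s->registered[i])
		return -1;
	s->registered[i]->num = num;
	s->registered[i]->entry = entry;
	s->registered[i]->arg = arg;
	return i;
}

/* Unregister an event and release its dynamically allocated object. */
static void sdei_unregister(struct vm_sdei *s, uint32_t num)
{
	int i = sdei_find_exposed(s, num);

	if (i >= 0) {
		free(s->registered[i]);
		s->registered[i] = NULL;
	}
}
```

Sharing one slot index between the exposed and registered arrays keeps
lookups to a single bounded scan, which is the main thing the linked lists
would otherwise cost.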


This introduces SDEI virtualization infrastructure where the SDEI events
are registered and manipulated by the guest through hypercall. The SDEI
event is delivered to one specific vCPU by KVM once it's raised. This
introduces data structures to represent the objects needed to support
the feature, which are outlined below. As those objects could be
migrated between VMs, these data structures are partially exposed to
user space.

* kvm_sdei_exposed_event
The exposed events are determined and added by VMM through ioctl
interface. Only the exposed events can be registered from the
guest.

* kvm_sdei_registered_event
The events that have been registered from the guest through the
SDEI_1_0_FN_SDEI_EVENT_REGISTER hypercall.

* kvm_sdei_vcpu_event
The events that have been delivered to the target vCPU.

* kvm_sdei_vcpu
Used to save the preempted context when the SDEI event is serviced
and delivered. After the SDEI event handling is completed, the
execution is resumed from the preempted context.

* kvm_sdei_kvm
Place holder for the exposed and registered events.

It might be a good idea to expand these a bit and move them into
comments on each of the structures.


Sure, I will do that in the next respin.

For now, SDEI_NOT_SUPPORTED is returned for all SDEI hypercalls. They
will be supported in the subsequent patches.

Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
---
arch/arm64/include/asm/kvm_host.h | 3 +
arch/arm64/include/asm/kvm_sdei.h | 171 +++++++++++++
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei_state.h | 72 ++++++
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 8 +
arch/arm64/kvm/hypercalls.c | 21 ++
arch/arm64/kvm/sdei.c | 244 +++++++++++++++++++
include/uapi/linux/arm_sdei.h | 2 +
9 files changed, 523 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/kvm_sdei.h
create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei_state.h
create mode 100644 arch/arm64/kvm/sdei.c

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 031e3a2537fc..5d37e046a458 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -113,6 +113,8 @@ struct kvm_arch {
/* Interrupt controller */
struct vgic_dist vgic;
+ struct kvm_sdei_kvm *sdei;
+

nit: avoid repeating 'kvm'. struct kvm_sdei should be descriptive enough
:)


Indeed, "struct kvm_sdei" is better :)

/* Mandated version of PSCI */
u32 psci_version;
@@ -338,6 +340,7 @@ struct kvm_vcpu_arch {
* Anything that is not used directly from assembly code goes
* here.
*/
+ struct kvm_sdei_vcpu *sdei;


nit: put your scoping tokens at the beginning of a symbol name, so
'struct kvm_vcpu_sdei'.

[...]


Yep, "struct kvm_vcpu_sdei" is the one I will use in the next respin :)

diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 202b8c455724..3c20fee72bb4 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -5,6 +5,7 @@
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
+#include <asm/kvm_sdei.h>
#include <kvm/arm_hypercalls.h>
#include <kvm/arm_psci.h>
@@ -151,6 +152,26 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
case ARM_SMCCC_TRNG_RND32:
case ARM_SMCCC_TRNG_RND64:
return kvm_trng_call(vcpu);
+ case SDEI_1_0_FN_SDEI_VERSION:
+ case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
+ case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+ case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
+ case SDEI_1_0_FN_SDEI_PE_MASK:
+ case SDEI_1_0_FN_SDEI_PE_UNMASK:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
+ case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+ case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ case SDEI_1_1_FN_SDEI_FEATURES:

Consider only adding case labels for hypercalls when they're
actually implemented.

Additionally, while this isn't directly related to your patch, I do have
a gripe about kvm_hvc_call_handler(). It is really ugly that we
enumerate the specific hypercalls we support, and otherwise fall through
to PSCI.

IMO, a cleaner approach would be to have kvm_hvc_call_handler() simply
route a particular service range/service owner to the appropriate
handler. We can then terminate individual hypercalls in those handlers,
avoiding a catch-all switch like the current one.


Yes, I agree. I can add a separate preparatory patch that routes each
range of hypercalls to its owner for further handling. That way, the
range of SDEI hypercalls is routed to its own handler. I will sort it
out in the next respin.
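
To make the idea concrete, here is a minimal user-space sketch of
range-based routing. It is not the kernel's implementation. The enum, the
table, and hvc_route() are hypothetical names. The function-ID layout
(owner in bits [29:24], function number in bits [15:0]) follows SMCCC, and
the per-service function-number ranges are assumptions based on the PSCI,
SDEI, and TRNG function IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SMCCC function ID fields. */
#define SMCCC_OWNER_SHIFT     24
#define SMCCC_OWNER_MASK      0x3F
#define SMCCC_FUNC_MASK       0xFFFF
#define SMCCC_OWNER_STANDARD  4

/* Hypothetical routing targets, one per service handler. */
enum hvc_target {
	HVC_UNKNOWN,
	HVC_PSCI,
	HVC_SDEI,
	HVC_TRNG,
};

/* One routing entry: an owner plus an inclusive function-number range. */
struct hvc_range {
	uint8_t         owner;
	uint16_t        first;
	uint16_t        last;
	enum hvc_target target;
};

/* Assumed ranges within the standard service owner. */
static const struct hvc_range hvc_ranges[] = {
	{ SMCCC_OWNER_STANDARD, 0x00, 0x1F, HVC_PSCI },
	{ SMCCC_OWNER_STANDARD, 0x20, 0x3F, HVC_SDEI },
	{ SMCCC_OWNER_STANDARD, 0x50, 0x5F, HVC_TRNG },
};

/*
 * Pick the handler for a function ID by owner and range only. Each
 * handler is then responsible for rejecting the calls it does not
 * implement, so no per-call enumeration is needed here.
 */
static enum hvc_target hvc_route(uint32_t func_id)
{
	uint8_t owner = (func_id >> SMCCC_OWNER_SHIFT) & SMCCC_OWNER_MASK;
	uint16_t fn = func_id & SMCCC_FUNC_MASK;

	for (size_t i = 0; i < sizeof(hvc_ranges) / sizeof(hvc_ranges[0]); i++) {
		const struct hvc_range *r = &hvc_ranges[i];

		if (r->owner == owner && fn >= r->first && fn <= r->last)
			return r->target;
	}
	return HVC_UNKNOWN;
}
```

With a table like this, hvc_route(0xC4000020) would select the SDEI
handler, which could return SDEI_NOT_SUPPORTED for anything it has not
implemented yet.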

Thanks,
Gavin