Re: [PATCH v6 03/18] KVM: arm64: Add SDEI virtualization infrastructure

From: Oliver Upton
Date: Thu Apr 28 2022 - 16:28:30 EST


Hi Gavin,

On Sun, Apr 24, 2022 at 11:00:56AM +0800, Gavin Shan wrote:

[...]

> Yes, the assumption that all events are always signaled by software should
> be true, so this field (@signaled) can be dropped as well. I plan to
> change the data structures as below, according to the suggestions given
> by you. Please double check whether anything has been missed.
>
> (1) Those fields of struct kvm_sdei_exposed_event are dropped or merged
> to struct kvm_sdei_event.
>
> struct kvm_sdei_event {
> 	unsigned int num;
> 	unsigned long ep_addr;
> 	unsigned long ep_arg;
> #define KVM_SDEI_EVENT_STATE_REGISTERED		0
> #define KVM_SDEI_EVENT_STATE_ENABLED		1
> #define KVM_SDEI_EVENT_STATE_UNREGISTER_PENDING	2
> 	unsigned long state; /* accessed by {test,set,clear}_bit() */
> 	unsigned long event_count;
> };
>
> (2) In arch/arm64/kvm/sdei.c
>
> static struct kvm_sdei_event exposed_events[] = {
> 	{ .num = SDEI_SW_SIGNALED_EVENT },
> };
>
> (3) In arch/arm64/kvm/sdei.c::kvm_sdei_create_vcpu(), the SDEI events
> are instantiated based on @exposed_events[]. It's just what we're
> doing and nothing is changed.

The part I find troubling is the fact that we are treating SDEI events
as a list-like thing. If we want to behave more like hardware, why can't
we track the state of an event in bitmaps? There are three bits of
relevant state for any given event in the context of a vCPU: registered,
enabled, and pending.

I'm having some second thoughts about the suggestion to use MP state for
this, given that we need to represent a few bits of state for the vCPU
as well. Seems we need to track the mask state of a vCPU and a bit to
indicate whether an SDEI handler is active. You could put these bits in
kvm_vcpu_arch::flags, actually.

So maybe it could be organized like so:

/* bits for the bitmaps below */
enum kvm_sdei_event {
	KVM_SDEI_EVENT_SW_SIGNALED = 0,
	KVM_SDEI_EVENT_ASYNC_PF,
	...
	NR_KVM_SDEI_EVENTS,
};

struct kvm_sdei_event_handler {
	unsigned long ep_addr;
	unsigned long ep_arg;
};

struct kvm_sdei_event_context {
	unsigned long pc;
	unsigned long pstate;
	unsigned long regs[18];
};

struct kvm_sdei_vcpu {
	unsigned long registered;
	unsigned long enabled;
	unsigned long pending;

	struct kvm_sdei_event_handler handlers[NR_KVM_SDEI_EVENTS];
	struct kvm_sdei_event_context ctxt;
};

But it is hard to really talk about these data structures without a feel
for the mechanics of how the rest of the series would work around them.
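
As a rough illustration of the bitmap idea, here is a userspace model
of how registered/enabled/pending could be driven. It is only a sketch:
the helper names (sdei_register(), sdei_enable(), sdei_next_event()) are
made up for this mail, the saved context is left out, and in the kernel
the open-coded bit operations would presumably be set_bit()/test_bit()
and friends.

```c
#include <stddef.h>

/* Bit numbers for the bitmaps below (only two events modelled here). */
enum kvm_sdei_event {
	KVM_SDEI_EVENT_SW_SIGNALED = 0,
	KVM_SDEI_EVENT_ASYNC_PF,
	NR_KVM_SDEI_EVENTS,
};

struct kvm_sdei_event_handler {
	unsigned long ep_addr;
	unsigned long ep_arg;
};

struct kvm_sdei_vcpu {
	unsigned long registered;
	unsigned long enabled;
	unsigned long pending;

	struct kvm_sdei_event_handler handlers[NR_KVM_SDEI_EVENTS];
};

/* Hypothetical helper: record the handler and mark the event registered. */
static int sdei_register(struct kvm_sdei_vcpu *s, unsigned int ev,
			 unsigned long ep_addr, unsigned long ep_arg)
{
	if (ev >= NR_KVM_SDEI_EVENTS || (s->registered & (1UL << ev)))
		return -1;

	s->handlers[ev].ep_addr = ep_addr;
	s->handlers[ev].ep_arg = ep_arg;
	s->registered |= 1UL << ev;
	return 0;
}

/* Hypothetical helper: only a registered event can be enabled. */
static int sdei_enable(struct kvm_sdei_vcpu *s, unsigned int ev)
{
	if (ev >= NR_KVM_SDEI_EVENTS || !(s->registered & (1UL << ev)))
		return -1;

	s->enabled |= 1UL << ev;
	return 0;
}

/*
 * Hypothetical helper: an event is deliverable when all three bits are
 * set. Returns the lowest-numbered deliverable event, or -1 if none.
 */
static int sdei_next_event(const struct kvm_sdei_vcpu *s)
{
	unsigned long ready = s->registered & s->enabled & s->pending;
	unsigned int ev;

	for (ev = 0; ev < NR_KVM_SDEI_EVENTS; ev++) {
		if (ready & (1UL << ev))
			return (int)ev;
	}
	return -1;
}
```

The point is just that "which event do we deliver next" becomes a single
AND of three words rather than a walk over a list of event objects.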

> > > > Do we need this if we disallow nesting events?
> > > >
> > >
> > > Yes, we need this. "event == NULL" is used as indication of invalid
> > > context. @event is the associated SDEI event when the context is
> > > valid.
> >
> > What if we use some other plumbing to indicate the state of the vCPU? MP
> > state comes to mind, for example.
> >
>
> Even if the indication is done by another state, kvm_sdei_vcpu_context still
> needs to be linked (associated) with the event. Once the vCPU context becomes
> valid after the event is delivered, we still need to know the associated
> event when some of the hypercalls are triggered. SDEI_1_0_FN_SDEI_EVENT_COMPLETE
> is one example: we need to decrease struct kvm_sdei_event::event_count
> for the hypercall.

Why do we need to keep track of how many times an event has been
signaled? Nothing in SDEI seems to suggest that the number of event
signals corresponds to the number of times the handler is invoked. In
fact, the documentation on SDEI_EVENT_SIGNAL corroborates this:

"""
The event has edge-triggered semantics and the number of event signals
may not correspond to the number of times the handler is invoked in the
target PE.
"""

DEN0054C 5.1.16.1

So perhaps we queue at most 1 pending event for the guest.
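
To make the coalescing concrete, here is a small userspace model. It is
a sketch under assumptions, not kernel code: the flag names and helpers
are invented for illustration, and the masked/active bits stand in for
whatever ends up in kvm_vcpu_arch::flags. Signaling only sets a pending
bit, so repeated signals before delivery collapse into one handler
invocation and no counter is needed.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU flag bits (stand-ins for kvm_vcpu_arch::flags). */
#define SDEI_VCPU_MASKED		(1UL << 0)
#define SDEI_VCPU_HANDLER_ACTIVE	(1UL << 1)

struct sdei_vcpu_model {
	unsigned long pending;		/* one bit per event */
	unsigned long flags;
	unsigned int nr_delivered;	/* instrumentation for the example only */
};

/* Signaling is idempotent: a second signal finds the bit already set. */
static void sdei_signal(struct sdei_vcpu_model *v, unsigned int ev)
{
	v->pending |= 1UL << ev;
}

/* Enter the handler if the vCPU is unmasked, idle, and the event is pending. */
static bool sdei_deliver(struct sdei_vcpu_model *v, unsigned int ev)
{
	if (v->flags & (SDEI_VCPU_MASKED | SDEI_VCPU_HANDLER_ACTIVE))
		return false;
	if (!(v->pending & (1UL << ev)))
		return false;

	v->pending &= ~(1UL << ev);
	v->flags |= SDEI_VCPU_HANDLER_ACTIVE;
	v->nr_delivered++;
	return true;
}

/* Models the guest completing the handler: just drop the active bit. */
static void sdei_complete(struct sdei_vcpu_model *v)
{
	v->flags &= ~SDEI_VCPU_HANDLER_ACTIVE;
}
```

With this model, two back-to-back sdei_signal() calls followed by one
deliver/complete cycle leave nothing pending, which matches the
edge-triggered wording quoted above.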

I'd also like to see if anyone else has thoughts on the topic, as I'd
hate for you to go back to the whiteboard again in the next spin.

--
Thanks,
Oliver