Re: [PATCH 3/3] x86/kvm/hyper-v: inject #GP only when invalid SINTx vector is unmasked

From: Roman Kagan
Date: Wed Feb 28 2018 - 11:32:51 EST


On Wed, Feb 28, 2018 at 04:35:59PM +0100, Vitaly Kuznetsov wrote:
> Roman Kagan <rkagan@xxxxxxxxxxxxx> writes:
>
> > On Wed, Feb 28, 2018 at 02:44:01PM +0100, Vitaly Kuznetsov wrote:
> >> Hyper-V 2016 on KVM with SynIC enabled doesn't boot with the following
> >> trace:
> >>
> >> kvm_entry: vcpu 0
> >> kvm_exit: reason MSR_WRITE rip 0xfffff8000131c1e5 info 0 0
> >> kvm_hv_synic_set_msr: vcpu_id 0 msr 0x40000090 data 0x10000 host 0
> >> kvm_msr: msr_write 40000090 = 0x10000 (#GP)
> >> kvm_inj_exception: #GP (0x0)
> >
> > I don't remember having seen this... Does this happen with the mainline
> > QEMU, which doesn't set the SintPollingModeAvailable (17) bit in cpuid
> > 0x40000003:edx?
>
> Yes, you need to have Hyper-V role enabled, kvm-intel modules needs to
> be loaded with 'nesting' support enabled.

Thanks, reproduced now.

> >>
> >> KVM acts according to the following statement from TLFS:
> >>
> >> "
> >> 11.8.4 SINTx Registers
> >> ...
> >> Valid values for vector are 16-255 inclusive. Specifying an invalid
> >> vector number results in #GP.
> >> "
> >>
> >> However, I checked and genuine Hyper-V doesn't #GP when we write 0x10000
> >> to SINTx. I checked with Microsoft and they confirmed that if either the
> >> Masked bit (bit 16) or the Polling bit (bit 18) is set to 1, then they
> >> ignore the value of Vector. Make KVM act accordingly.
> >
> > I wonder if that cpuid setting affects this behavior? Also curious what
> > exactly the guest is trying to achieve writing this bogus value?
>
> The value is actually the default value which is supposed to be there:
>
> "At virtual processor creation time, the default value of all SINTx
> (synthetic interrupt source) registers is 0x0000000000010000." so I
> guess this is just an intialization procedure.

Oh, sorry, I trapped on that polling thing throughout the patch so I
missed that the value you observe has actually only the masked bit set,
which is indeed the initial value (no idea why the guest *writes* it
though, it's the hypervisor's responsibility to put it there on vCPU
reset).

> >>
> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> >> ---
> >> arch/x86/include/uapi/asm/hyperv.h | 1 +
> >> arch/x86/kvm/hyperv.c | 7 ++++++-
> >> 2 files changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
> >> index 62c778a303a1..a492dc357bd7 100644
> >> --- a/arch/x86/include/uapi/asm/hyperv.h
> >> +++ b/arch/x86/include/uapi/asm/hyperv.h
> >> @@ -326,6 +326,7 @@ typedef struct _HV_REFERENCE_TSC_PAGE {
> >> #define HV_SYNIC_SIEFP_ENABLE (1ULL << 0)
> >> #define HV_SYNIC_SINT_MASKED (1ULL << 16)
> >> #define HV_SYNIC_SINT_AUTO_EOI (1ULL << 17)
> >> +#define HV_SYNIC_SINT_POLLING (1ULL << 18)
> >> #define HV_SYNIC_SINT_VECTOR_MASK (0xFF)
> >>
> >> #define HV_SYNIC_STIMER_COUNT (4)
> >> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> >> index 6d14f808145d..d3d866c32976 100644
> >> --- a/arch/x86/kvm/hyperv.c
> >> +++ b/arch/x86/kvm/hyperv.c
> >> @@ -95,9 +95,14 @@ static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
> >> u64 data, bool host)
> >> {
> >> int vector, old_vector;
> >> + bool masked, polling;
> >>
> >> vector = data & HV_SYNIC_SINT_VECTOR_MASK;
> >> - if (vector < HV_SYNIC_FIRST_VALID_VECTOR && !host)
> >> + masked = data & HV_SYNIC_SINT_MASKED;
> >> + polling = data & HV_SYNIC_SINT_POLLING;
> >> +
> >> + if (vector < HV_SYNIC_FIRST_VALID_VECTOR &&
> >> + !host && !masked && !polling)
> >> return 1;
> >> /*
> >> * Guest may configure multiple SINTs to use the same vector, so
> >
> > I'm not sure this is enough to implement the polling mode: per spec,
> >
>
> Oh, no, I wasn't trying to -- and by the way we don't currently announce
> SintPollingModeAvailable so guests are not supposed to do that. This is
> rather a future proof to 'not forget'.
>
> >> Setting the polling bit will have the effect of unmasking an interrupt
> >> source, except that an actual interrupt is not generated.
> >
> > However, if the guest sets a valid vector and the masked bit cleared,
> > we'll consider it a usual SINT and add to masks and inject interrupts,
> > etc, regardless of the polling bit.
> >
> > I must admit I'm confused by the above quote from the spec: is the
> > polling bit supposed to come together with the masked bit? If so, then
> > we probably should validate it here (but your logs indicate otherwise).
> > In general I'm missing the utility of this mode: why should an interrupt
> > controller be involved in polling at all?
>
> "Setting the polling bit will have the effect of unmasking an interrupt
> source, except that an actual interrupt is not generated."
>
> So, as I understand it, setting polling bit makes Vector value
> irrelevant - the interrupt is not generated so I *assume* we may see
> writes with zero Vector and polling bit set. But again, we're not
> implementing polling mode for now, I can just drop it from the patch if
> you think it is confusing.

I'd suggest indeed to leave polling out for now, as it seems to be a
different matter from the one being resolved by this patch (unless, of
course, there *are* known cases with invalid vector && !masked &&
polling).

Thanks,
Roman.