Re: [PATCH RFC] KVM: x86: vmx: throttle immediate exit through preemtion timer to assist buggy guests

From: Liran Alon
Date: Mon Apr 01 2019 - 06:08:50 EST




> On 1 Apr 2019, at 11:39, Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
>
> Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:
>
>> On 29/03/19 16:32, Liran Alon wrote:
>>> Paolo I am not sure this is the case here. Please read my other
>>> replies in this email thread.
>>>
>>> I think this is just a standard issue of a level-triggered interrupt
>>> handler in L1 (Hyper-V) that performs EOI before it lowers the
>>> irq-line. I don't think vector 96 is even related to the issue at
>>> hand here. This is why after it was already handled, the loop of
>>> EXTERNAL_INTERRUPT happens on vector 80 and not vector 96.
>>
>> Hmm... Vitaly, what machine were you testing on---does it have APIC-v?
>> If not, then you should have seen either an EOI for irq 96 or a TPR
>> below threshold vmexit. However, if it has APIC-v then you wouldn't
>> have seen any of this (you only see the EOI for irq 80 because it's
>> level triggered) and Liran is probably right.
>>
>
> It does, however, the issue is reproducible with and without
> it. Moreover, I think the second simultaneous IRQ is just a red herring;
> Here is another trace (enable_apicv). Posting it non-stripped and hope
> your eyes will catch something I'm missing:
>
> [001] 513675.736316: kvm_exit: reason VMRESUME rip 0xfffff80002cae115 info 0 0
> [001] 513675.736321: kvm_entry: vcpu 0
> [001] 513675.736565: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xfffff80362dcd26d info 0 800000ec
> [001] 513675.736566: kvm_nested_vmexit: rip fffff80362dcd26d reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 800000ec int_info_err 0
> [001] 513675.736568: kvm_entry: vcpu 0
> [001] 513675.736650: kvm_exit: reason EPT_VIOLATION rip 0xfffff80362dcd230 info 182 0
> [001] 513675.736651: kvm_nested_vmexit: rip fffff80362dcd230 reason EPT_VIOLATION info1 182 info2 0 int_info 0 int_info_err 0
> [001] 513675.736651: kvm_page_fault: address 261200000 error_code 182
>
> -> injecting
>
> [008] 513675.737059: kvm_set_irq: gsi 23 level 1 source 0
> [008] 513675.737061: kvm_msi_set_irq: dst 0 vec 80 (Fixed|physical|level)
> [008] 513675.737062: kvm_apic_accept_irq: apicid 0 vec 80 (Fixed|edge)
> [001] 513675.737233: kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000050 int_info_err 0
> [001] 513675.737239: kvm_entry: vcpu 0
> [001] 513675.737243: kvm_exit: reason EOI_INDUCED rip 0xfffff80002c85e1a info 50 0
>
> -> immediate EOI causing re-injection (even preemption timer is not
> involved here).
>
> [001] 513675.737244: kvm_eoi: apicid 0 vector 80
> [001] 513675.737245: kvm_fpu: unload
> [001] 513675.737246: kvm_userspace_exit: reason KVM_EXIT_IOAPIC_EOI (26)
> [001] 513675.737256: kvm_set_irq: gsi 23 level 1 source 0
> [001] 513675.737259: kvm_msi_set_irq: dst 0 vec 80 (Fixed|physical|level)
> [001] 513675.737260: kvm_apic_accept_irq: apicid 0 vec 80 (Fixed|edge)
> [001] 513675.737264: kvm_fpu: load
> [001] 513675.737265: kvm_entry: vcpu 0
> [001] 513675.737271: kvm_exit: reason VMRESUME rip 0xfffff80002cae115 info 0 0
> [001] 513675.737278: kvm_entry: vcpu 0
> [001] 513675.737282: kvm_exit: reason PREEMPTION_TIMER rip 0xfffff80362dcc2d0 info 0 0
> [001] 513675.737283: kvm_nested_vmexit: rip fffff80362dcc2d0 reason PREEMPTION_TIMER info1 0 info2 0 int_info 0 int_info_err 0
> [001] 513675.737285: kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000050 int_info_err 0
> [001] 513675.737289: kvm_entry: vcpu 0
> [001] 513675.737293: kvm_exit: reason EOI_INDUCED rip 0xfffff80002c85e1a info 50 0
> [001] 513675.737293: kvm_eoi: apicid 0 vector 80
> [001] 513675.737294: kvm_fpu: unload
> [001] 513675.737295: kvm_userspace_exit: reason KVM_EXIT_IOAPIC_EOI (26)
> [001] 513675.737299: kvm_set_irq: gsi 23 level 1 source 0
> [001] 513675.737299: kvm_msi_set_irq: dst 0 vec 80 (Fixed|physical|level)
> [001] 513675.737300: kvm_apic_accept_irq: apicid 0 vec 80 (Fixed|edge)
> [001] 513675.737302: kvm_fpu: load
> [001] 513675.737303: kvm_entry: vcpu 0
> [001] 513675.737307: kvm_exit: reason VMRESUME rip 0xfffff80002cae115 info 0 0
>
> ...
>
> --
> Vitaly

So to sum up: this matches what I described in my previous emails, right?
Vector 96 is unrelated, and the only issue here is that the level-triggered interrupt handler for vector 80 performs EOI before lowering the irq-line.
Since the line is still asserted at EOI time, vector 80 is immediately re-injected, which results in an infinite loop.
This is not even specific to a nested virtualization workload. It's just an issue in the Hyper-V (L1) interrupt handler for vector 80.

Therefore the only action items are:
1) Microsoft to fix the Hyper-V vector 80 interrupt handler so that it lowers the irq-line before EOI.
2) Patch the QEMU IOAPIC implementation to add a mechanism, similar to the one in KVM, that delays injection of a level-triggered interrupt
when the same interrupt has been injected X times in a row.
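
To make item 2 more concrete, here is a minimal, self-contained sketch of the kind of throttling I have in mind. It is not actual QEMU or KVM code: the struct, the function names and the threshold value are all hypothetical, and the "delayed" action stands in for whatever timer mechanism the real implementation would use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: the "X times in a row" from item 2. */
#define SUCCESSIVE_IRQ_MAX_COUNT 10000

/* Hypothetical per-pin state for one level-triggered IOAPIC input. */
struct level_irq_state {
    bool line_asserted;   /* current level of the irq-line */
    uint32_t successive;  /* re-injections on EOI without a deassert in between */
};

enum eoi_action {
    EOI_NONE,             /* line is low, nothing to re-inject */
    EOI_REINJECT_NOW,     /* line still high, re-inject immediately */
    EOI_REINJECT_DELAYED  /* line still high too many times, defer re-injection */
};

/* Device (or userspace) changes the level of the irq-line. */
static void irq_set_level(struct level_irq_state *s, bool level)
{
    s->line_asserted = level;
    if (!level)
        s->successive = 0; /* a proper deassert ends the streak */
}

/* Guest wrote EOI for this vector: decide how to re-inject. */
static enum eoi_action irq_handle_eoi(struct level_irq_state *s)
{
    if (!s->line_asserted) {
        s->successive = 0;
        return EOI_NONE;
    }
    if (++s->successive >= SUCCESSIVE_IRQ_MAX_COUNT) {
        /* Give the guest handler time to run and lower the line. */
        s->successive = 0;
        return EOI_REINJECT_DELAYED;
    }
    return EOI_REINJECT_NOW;
}
```

With a handler that does EOI before lowering the irq-line, every EOI finds the line still asserted, so the counter eventually reaches the threshold and the next injection is deferred instead of looping forever. A handler that lowers the line first never builds up a streak.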

-Liran