Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling

From: David Matlack
Date: Tue Sep 01 2015 - 19:24:58 EST


On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
> On 9/2/15 6:34 AM, David Matlack wrote:
>>
>> On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
>>>
>>> On 9/2/15 5:45 AM, David Matlack wrote:
>>>>
>>>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> v3 -> v4:
>>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>>> shrinks
>>>>> when idle VCPU is detected
>>>>>
>>>>> v2 -> v3:
>>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>>> /halt_poll_ns_shrink
>>>>> * drop the macros and hard coding the numbers in the param
>>>>> definitions
>>>>> * update the comments "5-7 us"
>>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>>> halt_poll_ns
>>>>> time,
>>>>> vcpu->halt_poll_ns start at zero
>>>>> * drop the wrappers
>>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>
>>>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>>>> think
>>>> this gives you a good place to build your dynamic growth patch on top.
>>>> The
>>>> toggling patch has close to zero overhead for idle VMs and equivalent
>>>> performance VMs doing message passing as always-poll. It's a patch
>>>> that's
>>>> been
>>>> in my queue for a few weeks but just haven't had the time to send out.
>>>> We
>>>> can
>>>> win even more with your patchset by only polling as much as we need (via
>>>> dynamic growth/shrink). It also gives us a better place to stand for
>>>> choosing
>>>> a default for halt_poll_ns. (We can run experiments and see how high
>>>> vcpu->halt_poll_ns tends to grow.)
>>>>
>>>> The reason I posted a separate patch for toggling is because it adds
>>>> timers
>>>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can
>>>> get
>>>> called multiple times for one halt). To do dynamic poll adjustment
>
>
> Why this can happen?

Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
of TSC deadline timer interrupt). I don't think the edge case exists in
the latest kernel.

>
>
>>>> correctly,
>>>> we have to time the length of each halt. Otherwise we hit some bad edge
>>>> cases:
>>>>
>>>> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns
>>>> grew
>>>> every
>>>> time we had a long halt. So idle VMs looked like: 0 us -> 500 us ->
>>>> 1
>>>> ms ->
>>>> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at
>>>> 0
>>>> when
>>>> the halts are long.
>>>>
>>>> v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>>> message
>>>> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns
>>>> would
>>>> grow.
>>>> That means vcpu->halt_poll_ns will always be maxed out, even when
>>>> the
>>>> halt
>>>> time is much less than the max.
>>>>
>>>> I think we can fix both edge cases if we make grow/shrink decisions
>>>> based
>>>> on
>>>> the length of kvm_vcpu_block rather than the arrival of a guest
>>>> interrupt
>>>> during polling.
>>>>
>>>> Some thoughts for dynamic growth:
>>>> * Given Windows 10 timer tick (1 ms), let's set the maximum poll
>>>> time
>>>> to
>>>> less than 1ms. 200 us has been a good value for always-poll. We
>>>> can
>>>> probably go a bit higher once we have your patch. Maybe 500 us?
>
>
> Did you test your patch against a windows guest?

I have not. I tested against a 250HZ linux guest to check how it performs
against a ticking guest. Presumably, windows should be the same, but at a
higher tick rate. Do you have a test for Windows?

>
>>>>
>>>> * The base case of dynamic growth (the first grow() after being at
>>>> 0)
>>>> should
>>>> be small. 500 us is too big. When I run TCP_RR in my guest I see
>>>> poll
>>>> times
>>>> of < 10 us. TCP_RR is on the lower-end of message passing workload
>>>> latency,
>>>> so 10 us would be a good base case.
>>>
>>>
>>> How to get your TCP_RR benchmark?
>>>
>>> Regards,
>>> Wanpeng Li
>>
>> Install the netperf package, or build from here:
>> http://www.netperf.org/netperf/DownloadNetperf.html
>>
>> In the vm:
>>
>> # ./netserver
>> # ./netperf -t TCP_RR
>>
>> Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
>> passing workload in order to test halt-polling).
>
>
> Ah, ok, I use the same benchmark as yours.
>
> Regards,
> Wanpeng Li
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/