Re: [PATCH v2 01/14] KVM: x86: change PIT discard tick policy

From: Radim Krčmář
Date: Fri Feb 19 2016 - 09:44:32 EST


[Cc'd Peter, the last guy that touched timers in libvirt, because he
might know what tick policies are supposed to be.]

2016-02-18 18:55+0100, Paolo Bonzini:
> On 18/02/2016 18:33, Paolo Bonzini wrote:
>> On 18/02/2016 17:56, Radim Krčmář wrote:
>>> 2016-02-18 17:13+0100, Paolo Bonzini:
>>>> On 17/02/2016 20:14, Radim Krčmář wrote:
>>>>> Discard policy uses ack_notifiers to prevent injection of PIT interrupts
>>>>> before EOI from the last one.
>>>>>
>>>>> This patch changes the policy to always try to deliver the interrupt,
>>>>> which makes a difference when its vector is in ISR.
>>>>> Old implementation would drop the interrupt, but proposed one injects to
>>>>> IRR, like real hardware would.
>>>>
>>>> This seems like what libvirt calls the "merge" policy:
>>>
>>> Oops, I never looked beyond QEMU after seeing that the naming in libvirt
>>> doesn't even match ...
>>>
>>> I think the policy that KVM implements (which I call discard) is "delay"
>>> in libvirt. (https://libvirt.org/formatdomain.html#elementsTime)

(I looked at libvirt code, but couldn't find any use of merge or discard
policies, so please bear with me as I disagree wherever it's possible.)

>> Suppose the scheduled ticks are at times 0, 20, 40, 60, 80. The EOI for
>> time 0 is only delivered at time 42, other EOIs are timely.
>>
>> The resulting injections are:
>> - for catchup, which QEMU calls slew: 0, 42, 51, 60, 80.
>>
>> - for merge: 0, 20 (in IRR, delivered at 42), 60, 80.
>>
>> For delay I *think* it would be 0, 42, 62, 82, 102.

I could call this "delay".

  Continue to deliver ticks at the normal rate. The guest time will be
  delayed due to the late tick

At 82 time units, the guest thinks it's 60, so the guest will do
everything late. (Leading us to call it delayed?!)

A few examples of "delay" that I find easier to accept:
0, 60, 80.
0, 42, 60, 80. Because we haven't missed the tick at 20, it just took
a while to be delivered. (Semantics ...)

> Wrong: for delay it is something like 0, 42, 43, 60, 80.

Aargh! One KVM policy does this and QEMU calls it 'delay'. I think
that libvirt would call it "catchup".

  Deliver ticks at a higher rate to catch up with the missed tick. The
  guest time should not be delayed once catchup is complete.

At 80, the guest time is 80; no signs of delay.
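
The guest-time arithmetic for these two timelines can be checked with a
small sketch. It is illustrative Python, not part of the patch: the name
guest_time is made up here, and it assumes the guest keeps time purely by
counting PIT ticks, one 20-unit period per delivered tick.

```python
PERIOD = 20  # tick period from the example (ticks scheduled at 0, 20, 40, ...)


def guest_time(deliveries, now, period=PERIOD):
    """Time a tick-counting guest believes it is at wall time `now`.

    Assumption: the guest advances its clock by exactly one period per
    delivered tick and has no other time source.
    """
    seen = sum(1 for t in deliveries if t <= now)
    # The first tick marks time 0, so n ticks mean (n - 1) elapsed periods.
    return max(seen - 1, 0) * period


every_tick_late = [0, 42, 62, 82, 102]  # the "0, 42, 62, 82, 102" timeline
catch_up = [0, 42, 43, 60, 80]          # the "0, 42, 43, 60, 80" timeline

print(guest_time(every_tick_late, 82))  # 60: the guest is 22 units behind
print(guest_time(catch_up, 80))         # 80: no sign of delay
```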

> Your patch does the right thing, QEMU is wrong in calling the policy
> "discard" where it should have been "merge". In fact both i8254 and RTC
> use the same wrong nomenclature.

Terminology does suck. (Maybe it stems from the fact that QEMU talks
about lost ticks, but libvirt about ticks?)
Nevertheless, I don't think that libvirt "merge" covers what PIT does in
KVM or real hardware.

  Merge the missed tick(s) into one tick and inject. The guest time may
  be delayed, depending on how the OS reacts to the merging of ticks

No merging is happening in KVM or real hardware: every tick is exactly
one tick, so the guest cannot tell that we missed some ticks and the
time is delayed. If a tick made it into clear IRR, it's not missed.

In the example:

>> - for merge: 0, 20 (in IRR, delivered at 42), 60, 80.

at 80, the guest thinks it's 60.

I think that merge might do: 0, 42, 60, 80.
But the tick at 42 is counted as two ticks (20, 40) in the guest.

The main problem of this interpretation is that discard is a subset of
merge:

>> - for discard: 0, 60, 80.

The tick at 60 has to be counted as three ticks (20, 40, 60).
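
To make the "counted as N ticks" reading concrete, here is a minimal
sketch (illustrative Python; the function name and the per-tick weight
are hypothetical, nothing in KVM, QEMU or libvirt exposes such a value).
Each delivery carries the number of periods the guest is assumed to
credit for it:

```python
def guest_time_weighted(ticks, now, period=20):
    """Guest-perceived time when one delivery may stand for several ticks.

    `ticks` is a list of (delivery_time, weight) pairs; the weight is a
    hypothetical count of periods the guest credits for that delivery.
    """
    credited = sum(weight for t, weight in ticks if t <= now)
    # The first tick marks time 0, hence the "- 1".
    return max(credited - 1, 0) * period


# The quoted merge example: the tick raised at 20 waits in IRR until 42,
# and every delivery counts as exactly one tick.
one_tick_each = [(0, 1), (42, 1), (60, 1), (80, 1)]
# Merge as read above: the delivery at 42 stands for ticks 20 and 40.
merge = [(0, 1), (42, 2), (60, 1), (80, 1)]
# Discard under the same reading: the delivery at 60 stands for 20, 40, 60.
discard = [(0, 1), (60, 3), (80, 1)]

print(guest_time_weighted(one_tick_each, 80))  # 60
print(guest_time_weighted(merge, 80))          # 80
print(guest_time_weighted(discard, 80))        # 80
```

Under this reading merge and discard end up with the same guest time at
80; they differ only in when the combined tick is delivered.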

*throws hands into the air and runs in circles*