Re: [PATCH 3/4] KVM: x86: introduce get_kvmclock_ns

From: Roman Kagan
Date: Fri Sep 02 2016 - 11:25:04 EST


On Fri, Sep 02, 2016 at 04:09:42PM +0200, Paolo Bonzini wrote:
> On 02/09/2016 15:52, Roman Kagan wrote:
> > On Thu, Sep 01, 2016 at 05:26:14PM +0200, Paolo Bonzini wrote:
> >> --- a/arch/x86/kvm/hyperv.c
> >> +++ b/arch/x86/kvm/hyperv.c
> >> @@ -386,7 +386,7 @@ static void synic_init(struct kvm_vcpu_hv_synic *synic)
> >>
> >> static u64 get_time_ref_counter(struct kvm *kvm)
> >> {
> >> - return div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100);
> >> + return div_u64(get_kvmclock_ns(kvm), 100);
> >
> > Since this does slightly different calculation than the real hyperv tsc
> > ref page clock is supposed to, I wonder if we are safe WRT precision
> > errors leading to occasional monotonicity violations?
>
> The Hyper-V scale is
>
> tsc_to_system_mul * 2^(32+tsc_shift) / 100
>
> and the only source of error could be from doing here
>
> (tsc * tsc_to_system_mul >> (32-tsc_shift)) / 100
>
> vs
>
> tsc * ((tsc_to_system_mul >> (32-tsc_shift)) / 100))
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> this is scale / 2^64
>
> in the TSC ref page clock. If my reasoning is correct the error will be
> at most 100 units of the scale value, which is a relative error of
> around 1 parts per 2^49.
>
> Likewise for the offset, the improvement from
>
> (tsc - base_tsc) * tsc_to_system_mul >> (32-tsc_shift)
> + base_system_time
>
> vs. using a single offset as in the TSC ref page is one nanosecond---and
> the ref page only has a resolution of 100 ns.

AFAICS it's not a matter of resolution. If one calculation flips from
value T to T+1 at tsc1 while the other flips at tsc2, then during the
window between tsc1 and tsc2 we can have a monotonicity violation. If
the window is a few cycles (i.e. shorter than a vmexit) we're probably
safe, but if it's not, this may be a problem.
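
To make the concern concrete, here is a minimal userspace sketch of the
two calculations. It is not the actual KVM or Hyper-V code: MUL, the
function names, and the simplifications (tsc_shift == 0, no base/offset
terms) are all assumptions for illustration, and it relies on the
GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;	/* GCC/Clang extension */

/*
 * Hypothetical multiplier: floor(2^32 * 1000 / 2400), i.e. roughly a
 * 2.4 GHz TSC with tsc_shift == 0.  Not taken from any real guest.
 */
#define MUL 1789569706ULL

/* kvmclock-style: scale TSC to nanoseconds first, then divide by 100 */
static uint64_t ref_via_kvmclock(uint64_t tsc)
{
	uint64_t ns = (uint64_t)(((u128)tsc * MUL) >> 32);

	return ns / 100;
}

/* ref-page-style: one precomputed scale, result is (tsc * scale) >> 64 */
static uint64_t ref_via_scale(uint64_t tsc)
{
	uint64_t scale = (uint64_t)(((u128)MUL << 32) / 100);

	return (uint64_t)(((u128)tsc * scale) >> 64);
}

/* Count TSC values in [start, start + n) where the two results differ */
static uint64_t count_disagreements(uint64_t start, uint64_t n)
{
	uint64_t i, bad = 0;

	for (i = 0; i < n; i++) {
		uint64_t a = ref_via_kvmclock(start + i);
		uint64_t b = ref_via_scale(start + i);

		/* scale is rounded down, so b can only lag a, by one unit */
		assert(b <= a && a - b <= 1);
		bad += (a != b);
	}
	return bad;
}
```

In this model, every TSC value counted by count_disagreements() lies in
a window where one clock has already flipped to T+1 and the other still
reads T. Whether such windows stay below a vmexit with real parameters
is exactly what I'm unsure about.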

Roman.