Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct

From: Marcelo Tosatti
Date: Fri Feb 19 2016 - 09:12:28 EST


On Tue, Feb 16, 2016 at 05:59:57PM +0100, Paolo Bonzini wrote:
>
>
> On 16/02/2016 15:25, Marcelo Tosatti wrote:
> > On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
> >> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
> >>> When an NTP server is running, it may adjust the time substantially
> >>> compared to the "official" frequency of the TSC. A 12 ppm change
> >>> sums up to one second per day.
> >>>
> >>> This already shows up if the guest compares kvmclock with e.g. the
> >>> PM timer. It shows up even more once we add support for the Hyper-V
> >>> TSC page, because the guest expects it to be in sync with the time
> >>> reference counter; effectively the time reference counter is just a
> >>> slow path to access the same clock that is in the TSC page.
> >>>
> >>> Therefore, we want kvmclock to provide the host kernel's
> >>> ktime_get_boot_ns() value, at least if the master clock is active.
> >>> To do so, reverse-compute the host's "actual" TSC frequency from
> >>> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
> >>
> >> Paolo,
> >>
> >> You'd have to generate an update to the guest structures as well,
> >> to reflect the new {mult,shift} values calculated by the host.
> >> Here:
> >>
> >> 	/* disable master clock if host does not trust, or does not
> >> 	 * use, TSC clocksource
> >> 	 */
> >> 	if (gtod->clock.vclock_mode != VCLOCK_TSC &&
> >> 	    atomic_read(&kvm_guest_has_master_clock) != 0)
> >> 		queue_work(system_long_wq, &pvclock_gtod_work);
> >>
> >> No?
> >>
> >> At first glance, I'm afraid this might be heavy, so it might be worth
> >> rate limiting the update operation.
> >>
> >
> > Paolo,
> >
> > I suppose it's not sufficient:
> >
> > 500ppm of 300 seconds = .0005*300 = 0.15 seconds.
> >
> > We should aim at avoiding a time-backwards event in the following situation:
> >
> >
> > T1) t1_kvmclock_read = get_nanoseconds();
> > /* NTP correction to kernel clock = 500ppm */
> > /* TSC correction via mult,shift = 0ppm */
> >
> > VM-exit, update kvmclock (or Hyper-V) clock data with
> > new values
> >
> > T2) t2_kvmclock_read = get_nanoseconds();
> > /* NTP correction to kernel clock = 500ppm */
> > /* TSC correction via mult,shift = 500ppm */
> >
> >
> > So we should not allow the host clock (or system_timestamp) to diverge
> > from the TSC-based calculation by more than the duration of the event:
> >
> > VM-exit, update kvmclock (or Hyper-V) with new data.
> >
> > To avoid t2_kvmclock_read < t1_kvmclock_read
>
> If I don't do rate limiting, that would not be a problem I think.

Correct.
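
To put numbers on the rate-limited case quoted above, here is a rough
userspace sketch. It is not the KVM code: struct clock_snapshot,
snapshot_read() and jump_at_update_ns() are made-up names, and the
"ns = base + ((cycles * mult) >> shift)" model is a simplification of
what the guest actually does with the pvclock fields. It assumes the
host clock and the stale guest parameters start from the same base.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the parameters a guest reads. */
struct clock_snapshot {
	uint64_t base_tsc;	/* TSC value when the snapshot was taken */
	uint64_t base_ns;	/* clock value in ns at base_tsc */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/* Simplified guest-side read: base plus scaled TSC delta. */
uint64_t snapshot_read(const struct clock_snapshot *s, uint64_t tsc)
{
	return s->base_ns + (((tsc - s->base_tsc) * (uint64_t)s->mult) >> s->shift);
}

/*
 * The guest runs for 'cycles' TSC cycles on stale_mult while the host
 * clock advances with host_mult. A new snapshot is then published with
 * base_ns sampled from the host clock. Returns t2 - t1 in ns for reads
 * taken right before and right after the update; a negative value means
 * the guest saw time go backwards.
 */
int64_t jump_at_update_ns(uint64_t cycles, uint32_t stale_mult,
			  uint32_t host_mult, uint32_t shift)
{
	struct clock_snapshot stale = { 0, 0, stale_mult, shift };
	struct clock_snapshot host = { 0, 0, host_mult, shift };
	struct clock_snapshot fresh;
	uint64_t t1, t2;

	t1 = snapshot_read(&stale, cycles);	/* last read on old data */

	fresh.base_tsc = cycles;
	fresh.base_ns = snapshot_read(&host, cycles);
	fresh.mult = host_mult;
	fresh.shift = shift;

	t2 = snapshot_read(&fresh, cycles);	/* first read on new data */

	return (int64_t)t2 - (int64_t)t1;
}
```

With a 1 GHz TSC (mult = 1 << 22, shift = 22), a host clock that NTP
slowed down by about 500ppm (mult = 4192207) and 300 seconds between
updates (300000000000 cycles), the sketch gives a backwards step of
roughly 0.15 seconds, matching the estimate above. With equal mult
values the step is zero.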

> The
> host timekeeper code should take care of updating the base timestamps
> (TSC and nanoseconds) in a way that doesn't cause a clock-goes-backwards
> event?

Yes.

> I need to check how often the timekeeper updates the parameters.

I'd assume the function (the notifier) is called once every tick.

But you can optimize that away by only updating the TSC frequency
when mult/shift are updated, which should be much rarer.
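
Something along these lines, as a sketch only. struct freq_params,
struct freq_tracker and freq_tracker_notify() are hypothetical names,
not existing kernel symbols; the idea is just to remember the last
{mult,shift} pair seen by the notifier and skip the expensive guest
update when nothing changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical copy of the timekeeper's scaling parameters. */
struct freq_params {
	uint32_t mult;
	uint32_t shift;
};

/* Hypothetical per-host state remembered between notifier calls. */
struct freq_tracker {
	struct freq_params last;	/* last pair pushed to the guests */
	bool valid;			/* false until the first notification */
	unsigned long updates;		/* number of guest updates triggered */
};

/*
 * Meant to be called on every timekeeper notification (assumed to be
 * about once per tick). Returns true only when the guest-visible
 * frequency data needs to be regenerated.
 */
bool freq_tracker_notify(struct freq_tracker *t, struct freq_params now)
{
	if (t->valid && t->last.mult == now.mult && t->last.shift == now.shift)
		return false;	/* unchanged: skip the guest update */

	t->last = now;
	t->valid = true;
	t->updates++;
	return true;
}
```

The caller would only queue the frequency update when this returns
true, so the per-tick cost is two integer compares.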

(Note that this is also a problem for Linux-based kvmclock today.)

>
> Paolo