Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE

From: Marcelo Tosatti
Date: Wed Dec 09 2020 - 13:56:41 EST


On Tue, Dec 08, 2020 at 10:33:15PM +0100, Thomas Gleixner wrote:
> On Tue, Dec 08 2020 at 15:11, Marcelo Tosatti wrote:
> > On Tue, Dec 08, 2020 at 05:02:07PM +0100, Thomas Gleixner wrote:
> >> On Tue, Dec 08 2020 at 16:50, Maxim Levitsky wrote:
> >> > On Mon, 2020-12-07 at 20:29 -0300, Marcelo Tosatti wrote:
> >> >> > +This ioctl allows to reconstruct the guest's IA32_TSC and TSC_ADJUST value
> >> >> > +from the state obtained in the past by KVM_GET_TSC_STATE on the same vCPU.
> >> >> > +
> >> >> > +If 'KVM_TSC_STATE_TIMESTAMP_VALID' is set in flags,
> >> >> > +KVM will adjust the guest TSC value by the time that passed since the moment
> >> >> > +CLOCK_REALTIME timestamp was saved in the struct and current value of
> >> >> > +CLOCK_REALTIME, and set the guest's TSC to the new value.
> >> >>
> >> >> This introduces the wraparound bug in Linux timekeeping, doesnt it?
> >>
> >> Which bug?
> >
> > max_cycles overflow. Sent a message to Maxim describing it.
>
> Truly helpful. Why the hell did you not talk to me when you ran into
> that the first time?

Because

1) Users wanted CLOCK_BOOTTIME to stop counting while the VM
is paused (so we wanted to stop guest clock when VM is paused anyway).

2) The solution to inject NMIs to the guest seemed overly
complicated.

> >> For one I have no idea which bug you are talking about and if the bug is
> >> caused by the VMM then why would you "fix" it in the guest kernel.
> >
> > 1) Stop guest, save TSC value of cpu-0 = V.
> > 2) Wait for some amount of time = W.
> > 3) Start guest, load TSC value with V+W.
> >
> > Can cause an overflow on Linux timekeeping.
>
> Yes, because you violate the basic assumption which Linux timekeeping
> makes. See the other mail in this thread.
>
> Thanks,
>
> tglx