Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE

From: Marcelo Tosatti
Date: Tue Dec 08 2020 - 09:25:19 EST


On Sun, Dec 06, 2020 at 05:19:16PM +0100, Thomas Gleixner wrote:
> On Thu, Dec 03 2020 at 19:11, Maxim Levitsky wrote:
> > +	case KVM_SET_TSC_STATE: {
> > +		struct kvm_tsc_state __user *user_tsc_state = argp;
> > +		struct kvm_tsc_state tsc_state;
> > +		u64 host_tsc, wall_nsec;
> > +
> > +		u64 new_guest_tsc, new_guest_tsc_offset;
> > +
> > +		r = -EFAULT;
> > +		if (copy_from_user(&tsc_state, user_tsc_state, sizeof(tsc_state)))
> > +			goto out;
> > +
> > +		kvm_get_walltime(&wall_nsec, &host_tsc);
> > +		new_guest_tsc = tsc_state.tsc;
> > +
> > +		if (tsc_state.flags & KVM_TSC_STATE_TIMESTAMP_VALID) {
> > +			s64 diff = wall_nsec - tsc_state.nsec;
> > +			if (diff >= 0)
> > +				new_guest_tsc += nsec_to_cycles(vcpu, diff);
> > +			else
> > +				new_guest_tsc -= nsec_to_cycles(vcpu, -diff);
> > +		}
> > +
> > +		new_guest_tsc_offset = new_guest_tsc - kvm_scale_tsc(vcpu, host_tsc);
> > +		kvm_vcpu_write_tsc_offset(vcpu, new_guest_tsc_offset);
>
> From a timekeeping POV and the guest's expectation of TSC this is
> fundamentally wrong:
>
> tscguest = scaled(hosttsc) + offset
>
> The TSC has to be viewed systemwide and not per CPU. It's systemwide
> used for timekeeping and for that to work it has to be synchronized.
>
> Why would this be different on virt? Just because it's virt or what?
>
> Migration is a guest wide thing and you're not migrating single vCPUs.
>
> This hackery just papers over the underlying design fail that KVM looks
> at the TSC per vCPU which is the root cause and that needs to be fixed.

KVM already does this: the unified TSC offset is kept in kvm->arch.cur_tsc_offset.
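
For reference, the arithmetic in the quoted hunk can be sketched in
standalone C. This is only an illustration under stated assumptions, not
kernel code: the sketch_* names are hypothetical, the nanosecond-to-cycle
conversion is a plain kHz multiplication rather than the kernel's
nsec_to_cycles(), and the host TSC is assumed to be already scaled.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for nsec_to_cycles(): convert a nanosecond
 * interval to TSC cycles at tsc_khz. The interval is split so that the
 * intermediate product does not overflow for realistic inputs.
 */
static uint64_t sketch_nsec_to_cycles(uint64_t tsc_khz, uint64_t nsec)
{
	return (nsec / 1000000ULL) * tsc_khz +
	       (nsec % 1000000ULL) * tsc_khz / 1000000ULL;
}

/*
 * Advance (or rewind) the saved guest TSC by the wall-clock time that
 * passed between the save and the restore, as the quoted hunk does when
 * KVM_TSC_STATE_TIMESTAMP_VALID is set.
 */
static uint64_t sketch_new_guest_tsc(uint64_t saved_tsc, uint64_t saved_nsec,
				     uint64_t now_nsec, uint64_t tsc_khz)
{
	int64_t diff = (int64_t)(now_nsec - saved_nsec);

	if (diff >= 0)
		return saved_tsc + sketch_nsec_to_cycles(tsc_khz, (uint64_t)diff);
	return saved_tsc - sketch_nsec_to_cycles(tsc_khz, (uint64_t)-diff);
}

/*
 * tscguest = scaled(hosttsc) + offset, so
 * offset = tscguest - scaled(hosttsc), with unsigned wraparound mod 2^64.
 */
static uint64_t sketch_tsc_offset(uint64_t new_guest_tsc,
				  uint64_t scaled_host_tsc)
{
	return new_guest_tsc - scaled_host_tsc;
}
```

In the quoted hunk this computation runs per vCPU, each call sampling the
host clock on its own; a guest-wide offset would instead be computed once
and applied to every vCPU.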