Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE

From: Andy Lutomirski
Date: Thu Dec 10 2020 - 18:21:13 EST



> On Dec 10, 2020, at 2:28 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 10 2020 at 14:01, Andy Lutomirski wrote:
>>> On Thu, Dec 10, 2020 at 1:25 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>> I'm still convinced that a notification about 'we take a nap' will be
>>> more robust, less complex and more trivial to backport.
>>
>> What do you have in mind? Suppose the host kernel sends the guest an
>> interrupt on all vCPUs saying "I'm about to take a nap". What happens
>> if the guest is busy with IRQs off for a little bit? Does the host
>> guarantee the guest a certain amount of time to try to get the
>> interrupt delivered before allowing the host to enter S3? How about
>> if the host wants to reboot for a security fix -- how long is a guest
>> allowed to delay the process?
>>
>> I'm sure this can all be made to work 99% of the time, but I'm a bit
>> concerned about that last 1%.
>
> Seriously?
>
> If the guest has interrupts disabled for ages, i.e. it went out for
> lunch on its own, then surely the hypervisor can just pull the plug and
> wreckage it. It's like you hit the reset button or pull the powerplug of
> the machine which is not responding anymore.
>
> Reboot waits already today for guests to shut down/hibernate/suspend or
> whatever they are supposed to do. systemd sits there and waits for
> minutes until it decides to kill them. Just crash a guest kernel and
> forget to reset or force power off the guest before you reboot the
> host. Twiddle thumbs for a while and watch the incomprehensible time
> display.
>
> If your security fix reboot is so urgent that it can't wait then just
> pull the plug and be done with it, i.e. kill the guest which makes it
> start from a known state which is a gazillion times better than bringing
> it into a state which it can't handle anymore.
>
> Again, that's not any different than hitting the reset button on the
> host or pulling and reinserting the host powerplug which you would do
> anyway in an emergency case.
>
> Can we please focus on real problems instead of making up new ones?
>
> Correctness of time is a real problem despite the belief of virt folks
> that it can be ignored or duct taped to death.
>

I’m fine with this as long as it’s intentional. If we say “guest timekeeping across host suspend is correct because we notify the guest”, then we have a hole. But if we say “the host will try to notify the guest, and if the guest is out to lunch then the host reserves the right to suspend without waiting, and the guest should deal with this”, then okay.
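
To make the second reading concrete, here is a rough sketch of the "best-effort notify, then suspend anyway" policy. It is a toy model only: Guest, try_deliver, resume, host_suspend and max_attempts are hypothetical names invented for illustration, not anything in KVM or in this patch series.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    """Hypothetical guest model; none of these names come from KVM."""
    responsive: bool                 # False models a guest stuck with IRQs off
    saw_notification: bool = False
    must_resync_clock: bool = False

    def try_deliver(self) -> bool:
        # The "about to take a nap" interrupt only lands if the guest
        # is actually taking interrupts.
        if self.responsive:
            self.saw_notification = True
        return self.saw_notification

    def resume(self) -> None:
        # A guest that never saw the notification cannot trust its
        # timekeeping and has to recover on its own after the fact.
        self.must_resync_clock = not self.saw_notification


def host_suspend(guests, max_attempts: int = 3) -> list:
    """Notify on a best-effort basis, then suspend regardless.

    Returns the guests that never acknowledged the notification.
    """
    pending = list(guests)
    for _ in range(max_attempts):
        pending = [g for g in pending if not g.try_deliver()]
        if not pending:
            break
    # The host does not wait any longer: suspend happens here either way,
    # and every guest later resumes, notified or not.
    for g in guests:
        g.resume()
    return pending
```

The point of the sketch is that host_suspend never blocks on an unresponsive guest, so the "guest missed the notification" path has to be a supported case rather than an impossible one.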