Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp

From: Wanpeng Li
Date: Mon Nov 06 2017 - 05:49:42 EST


2017-11-06 18:21 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> On 06/11/2017 11:06, Wanpeng Li wrote:
>> 2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>>> On 06/11/2017 01:55, Wanpeng Li wrote:
>>>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>>>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>>>> results in softlockup.
>>>
>>> Apart from the pr_debug, kvm_get_time_scale should take less than a
>>> microsecond. The patch is fine, but can you confirm that pr_debug is
>>> the culprit?
>>
>> I can still encounter softlockup after removing the pr_debug.
>
> Is kvm_get_time_scale getting into an infinite loop then? That would be
> the actual bug.

I think so; kvm_get_time_scale takes up almost 100% of the CPU time:

: static void kvm_get_time_scale(uint64_t scaled_hz, uint64_t base_hz,
: s8 *pshift, u32 *pmultiplier)
: {
8.97 : 21a34: test %rdx,%rbx
0.00 : 21a37: jne 21a86 <kvm_get_time_scale+0xb6>
: uint64_t tps64;
: uint32_t tps32;
:
: tps64 = base_hz;
: scaled64 = scaled_hz;
: while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
28.01 : 21a39: test %r13d,%r13d
: tps64 >>= 1;
0.00 : 21a3c: js 21a86 <kvm_get_time_scale+0xb6>
: shift--;
6.16 : 21a3e: add %r13d,%r13d
: uint64_t tps64;
: uint32_t tps32;
:
: tps64 = base_hz;
: scaled64 = scaled_hz;
: while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
7.22 : 21a41: mov %r13d,%eax
17.86 : 21a44: add $0x1,%r14d
0.00 : 21a48: cmp %rax,%rbx
: tps64 >>= 1;
: shift--;
: }
:
: tps32 = (uint32_t)tps64;
: while (tps32 <= scaled64 || scaled64 & 0xffffffff00000000ULL) {
31.78 : 21a4b: jae 21a34 <kvm_get_time_scale+0x64>
0.00 : 21a4d: mov %r15,%rdi
: while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
: tps64 >>= 1;
: shift--;
: }
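
The hot instructions above (add %r13d,%r13d; add $0x1,%r14d; cmp %rax,%rbx; jae) look like the "tps32 <<= 1; shift++" path of the second while loop. Below is a minimal user-space sketch of the two loops, to show one way that loop could fail to terminate. It assumes the loop body matches what I remember from arch/x86/kvm/x86.c, and it assumes base_hz can be 0, which has not been confirmed here. second_loop_iterations() and MAX_ITERATIONS are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical cap so the demo itself cannot hang; not in the kernel. */
#define MAX_ITERATIONS 1000

/*
 * User-space copy of the two normalisation loops in kvm_get_time_scale().
 * Returns how many times the second loop ran, or -1 if the cap was hit
 * (which stands in for "would spin forever").
 */
static int second_loop_iterations(uint64_t scaled_hz, uint64_t base_hz)
{
	uint64_t scaled64 = scaled_hz;
	uint64_t tps64 = base_hz;
	uint32_t tps32;
	int iterations = 0;

	/* First loop: always terminates, since tps64 only shrinks. */
	while (tps64 > scaled64 * 2 || tps64 & 0xffffffff00000000ULL)
		tps64 >>= 1;

	tps32 = (uint32_t)tps64;

	/* Second loop: relies on tps32 growing until it exceeds scaled64. */
	while (tps32 <= scaled64 || scaled64 & 0xffffffff00000000ULL) {
		if (scaled64 & 0xffffffff00000000ULL || tps32 & 0x80000000)
			scaled64 >>= 1;
		else
			tps32 <<= 1;	/* 0 << 1 is still 0: no progress */

		if (++iterations >= MAX_ITERATIONS)
			return -1;
	}

	return iterations;
}
```

With scaled_hz = 1000000000 and base_hz = 0 this returns -1, because tps32 stays 0 and "tps32 <= scaled64" never becomes false. With a non-zero base_hz such as 1000000 the second loop finishes after a handful of shifts.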

Regards,
Wanpeng Li