Re: [PATCH v4 04/10] KVM/x86: intel_pmu_lbr_enable

From: Wei Wang
Date: Fri Jan 04 2019 - 05:03:55 EST


On 01/03/2019 11:34 PM, Jim Mattson wrote:
> On Wed, Jan 2, 2019 at 11:16 PM Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
>> On 01/03/2019 07:26 AM, Jim Mattson wrote:
>>> On Wed, Dec 26, 2018 at 2:01 AM Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
>>>> The lbr stack is model specific; for example, SKX has 32 lbr
>>>> stack entries while HSW has 16, so an HSW guest running on an SKX
>>>> machine may not get accurate perf results. Currently, we forbid
>>>> guest lbr enabling when the guest and host see a different number
>>>> of lbr stack entries.
>>> How do you handle live migration?
>> This feature is gated by the QEMU "lbr=true" option.
>> So if lbr cannot work on the destination machine,
>> the destination-side QEMU will fail to boot,
>> and the migration will not happen.
> Yes, but then what happens?
>
> Fast forward to, say, 2021. You're decommissioning all Broadwell
> servers in your data center. You have to migrate the running VMs off
> of those Broadwell systems onto newer hardware. But, with the current
> implementation, the migration cannot happen. So, what do you do? I
> suppose you just never enable the feature in the first place. Right?

I'm not sure that's how people would manage their data centers.
What would be the point of decommissioning all the BDW machines
while important BDW VMs are still running on them?

The "lbr=true" option can also be disabled via QMP, which will disable the
kvm side lbr support. So if you really want to deal with the above case,
you could first disable the lbr feature on the source side, and then boot the
destination side QEMU without "lbr=true". The lbr feature will not be available
to use by the guest at the time you decide to migrate the guest to a
non-compatible physical machine.

The point of this patch is: if we can't offer our users accurate
lbr results, we'd better disable the feature rather than offer
wrong results that confuse them.
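
For reference, the gating check amounts to something like the sketch
below. This is a minimal illustration only: the helper name and
parameters are made up for this mail, not the exact code in the patch,
and the entry counts are the HSW/SKX examples from above.

    /*
     * Illustrative sketch of the gating policy, not the code in the
     * patch itself.  The lbr stack is a model-specific number of
     * FROM/TO MSR pairs (e.g. 16 on HSW, 32 on SKX), so a guest CPU
     * model that expects a different stack size than the host
     * provides cannot be given accurate results.
     */
    #include <stdbool.h>

    static bool guest_lbr_allowed(int host_lbr_nr, int guest_lbr_nr)
    {
            /* e.g. host SKX: 32 entries; guest HSW model: 16 */
            return host_lbr_nr == guest_lbr_nr;
    }

So with an HSW guest model (16 entries) on an SKX host (32 entries),
the feature is simply refused rather than exposed with wrong semantics.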


Best,
Wei