Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state

From: Jim Mattson
Date: Tue Jul 13 2021 - 13:12:15 EST


On Tue, Jul 13, 2021 at 3:16 AM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 13/7/2021 5:47 pm, Yang Weijiang wrote:
> > On Mon, Jul 12, 2021 at 10:23:02AM -0700, Jim Mattson wrote:
> >> On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@xxxxxxxxx> wrote:
> >>>
> >>> On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
> >>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@xxxxxxxxx> wrote:
> >>>>>
> >>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> >>>>> and reload it after vm-exit.
> >>>>
> >>>> I don't see anything being done here "before VM-entry" or "after
> >>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >>>>
> >>>> In any case, I don't see why this one MSR is special. It seems that if
> >>>> the host is using the architectural LBR MSRs, then *all* of the host
> >>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> >>>> vcpu_put. Shouldn't kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> >>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> >>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> >>> I looked back on the discussion thread:
> >>> https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@xxxxxxxxxxxxxxx/
> >>> not sure why this code was added, but IMO, although the fpu save/restore in the outer loop
> >>> covers this LBR MSR, the operation points are far away from the vm-entry/exit
> >>> point, i.e., the guest MSR setting could leak to the host side for a significantly
> >>> long time, which may affect host-side profiling accuracy. If we save/restore it
> >>> manually, it'll mitigate the issue significantly.
> >>
> >> I'll be interested to see how you distinguish the intermingled branch
> >> streams, if you allow the host to record LBRs while the LBR MSRs
> >> contain guest values!
>
> The guest is fine with the real LBR MSRs containing the guest values
> even after vm-exit if there is no other LBR user in the current thread.
>
> (The perf subsystem makes this data visible only to the current thread)
>
> Except for MSR_ARCH_LBR_CTL, we don't want to add MSR switch overhead to
> the vmx transition (just think about {from, to, info} * 32 entries).
>
> If we have another LBR user (such as a "perf kvm") in the current thread,
> the host/guest LBR users will create separate LBR events to compete for
> who can use the LBR in the current thread.
>
> The final arbiter is the host perf scheduler. The host perf will
> save/restore the contents of the LBR when switching between two
> LBR events.
>
> Indeed, if the LBR hardware is assigned to the host LBR event before
> vm-entry, then the guest LBR feature will be broken and a warning
> will be triggered on the host.

Are you saying that the guest LBR feature only works some of the time?
How are failures communicated to the guest? If this feature doesn't
follow the architectural specification, perhaps you should consider
offering a paravirtual feature instead.

Warnings on the host, by the way, are almost completely useless. How
do I surface such a warning to a customer who has a misbehaving VM? At
the very least, user space should be notified of KVM emulation errors,
so I can get an appropriate message to the customer.

> LBR is an exclusive hardware resource and cannot be shared
> by different host/guest lbr_select configurations.

In that case, it definitely sounds like guest architectural LBRs
should be a paravirtual feature, since you can't actually virtualize
the hardware.

> > I'll check whether an inner simplified xsave/restore of guest/host LBR MSRs is meaningful;
> > the worst case is to drop this patch, since it's not correct to enable only the host lbr ctl
> > while still leaving guest LBR data in the MSRs. Thanks for the reminder!
> >