Re: [PATCH] perf/core: fix the bug in the event multiplexing

From: Shijie Huang
Date: Wed Aug 09 2023 - 05:38:10 EST

Next message: Breno Leitao: "Re: [PATCH v2 1/8] net: expose sock_use_custom_sol_socket"
Previous message: OGAWA Hirofumi: "Re: [PATCH v7 05/13] fat: make fat_update_time get its own timestamp"
In reply to: Mark Rutland: "Re: [PATCH] perf/core: fix the bug in the event multiplexing"
Next in thread: Marc Zyngier: "Re: [PATCH] perf/core: fix the bug in the event multiplexing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Mark,

在 2023/8/9 17:22, Mark Rutland 写道:

On Wed, Aug 09, 2023 at 08:25:07AM +0000, Oliver Upton wrote:

Hi Huang,

On Wed, Aug 09, 2023 at 09:39:53AM +0800, Huang Shijie wrote:

2.) Root cause.
There is only 7 counters in my arm64 platform:
(one cycle counter) + (6 normal counters)

In 1.3 above, we will use 10 event counters.
Since we only have 7 counters, the perf core will trigger
event multiplexing in hrtimer:
merge_sched_in() -->perf_mux_hrtimer_restart() -->
perf_rotate_context().

In the perf_rotate_context(), it does not restore some PMU registers
as context_switch() does. In context_switch():
kvm_sched_in() --> kvm_vcpu_pmu_restore_guest()
kvm_sched_out() --> kvm_vcpu_pmu_restore_host()

So we got wrong result.

This is a rather vague description of the problem. AFAICT, the
issue here is on VHE systems we wind up getting the EL0 count
enable/disable bits backwards when entering the guest, which is
corroborated by the data you have below.

Yep; IIUC the issue here is that when we take an IRQ from a guest and reprogram
the PMU in the IRQ handler, the IRQ handler will program the PMU with
appropriate host/guest/user/etc filters for a *host* context, and then we'll
return back into the guest without reconfigurign the event filtering for a
*guest* context.

Yes.

That can happen for perf_rotate_context(), or when we install an event into a
running context, as that'll happen via an IPI.

+void arch_perf_rotate_pmu_set(void)
+{
+ if (is_guest())
+ kvm_vcpu_pmu_restore_guest(NULL);
+ else
+ kvm_vcpu_pmu_restore_host(NULL);
+}
+

This sort of hook is rather nasty, and I'd strongly prefer a solution
that's confined to KVM. I don't think the !is_guest() branch is
necessary at all. Regardless of how the pmu context is changed, we need
to go through vcpu_put() before getting back out to userspace.

We can check for a running vCPU (ick) from kvm_set_pmu_events() and either
do the EL0 bit flip there or make a request on the vCPU to call
kvm_vcpu_pmu_restore_guest() immediately before reentering the guest.
I'm slightly leaning towards the latter, unless anyone has a better idea
here.

The latter sounds reasonable to me.

okay. I prefer the latter one now. :)

Thanks

Huang Shijie

I suspect we need to take special care here to make sure we leave *all* events
in a good state when re-entering the guest or if we get to kvm_sched_out()
after *removing* an event via an IPI -- it'd be easy to mess either case up and
leave some events in a bad state.

Thanks,
Mark.

Next message: Breno Leitao: "Re: [PATCH v2 1/8] net: expose sock_use_custom_sol_socket"
Previous message: OGAWA Hirofumi: "Re: [PATCH v7 05/13] fat: make fat_update_time get its own timestamp"
In reply to: Mark Rutland: "Re: [PATCH] perf/core: fix the bug in the event multiplexing"
Next in thread: Marc Zyngier: "Re: [PATCH] perf/core: fix the bug in the event multiplexing"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]