Re: [PATCH 1/3] x86, perf: Use a new PMU ack sequence

From: Peter Zijlstra
Date: Wed Oct 21 2015 - 16:36:20 EST


On Wed, Oct 21, 2015 at 01:16:06PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
> The SKL PMU code had a problem with LBR freezing. When a counter
> overflows while already in the PMI handler, the LBRs would be frozen
> early and not be unfrozen until the next PMI. This means we would
> get stale LBR information.
>
> Depending on the workload this could happen for a few percent
> of the PMIs for cycles in adaptive frequency mode, because the frequency
> algorithm regularly goes down to very low periods.
>
> This patch implements a new PMU ack sequence that avoids this problem.
> The new sequence is:
>
> - (counters are disabled with GLOBAL_CTRL)
> There should be no further increments of the counters by later instructions; and
> thus no additional PMI (and thus no additional freezing).
>
> - Ack the APIC
>
> Clear the APIC PMI LVT entry so that any later interrupt is delivered and is
> not lost due to the PMI LVT entry being masked. A lost PMI interrupt could lead to
> LBRs staying frozen without entering the PMI handler.
>
> - Ack the PMU counters. This unfreezes the LBRs on Skylake (but not
> on earlier CPUs which rely on DEBUGCTL writes for this)
>
> - Reenable counters
>
> The WRMSR will start the counters counting again (and will be ordered after the
> APIC LVT PMI entry write since WRMSR is architecturally serializing). Since the
> APIC PMI LVT is unmasked, any PMI which is caused by these perfmon counters
> will trigger an NMI (but the delivery may be delayed until after the next
> IRET).
>
> One side effect is that the old retry loop is not possible anymore,
> as the counters stay unacked for the majority of the PMI handler,
> but that is not a big loss, as "profiling" the PMI was always
> a bit dubious. The old ack sequence still supports the retry loop.
>
> The new sequence is now used unconditionally on all Intel Core/Atom
> CPUs. The old irq loop check is removed. Instead we rely on the
> generic NMI maximum duration check in the NMI code, together with the perf
> enforcement of the maximum sampling rate, to stop any runaway
> counters. We also assume that any counter can be stopped by setting
> a sufficiently large overflow value.
>
> v2:
> Use new ack sequence unconditionally. Remove PMU reset code.

So this is not something we can easily revert if things go bad. Esp.
since you build on it with the next patches.

Also, teach your editor to wrap at 72 chars for Changelogs.
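
For anyone following the thread, here is a rough userspace model of
the ordering described in the changelog above: disable, APIC ack, PMU
ack, re-enable. It is only a sketch, not the patch itself. struct
mock_pmu, mock_wrmsr(), mock_apic_write(), mock_raise_pmi() and
pmi_handler_sketch() are made-up names; the MSR and APIC constants
follow the values in the Linux headers. The model assumes the Skylake
behaviour from the changelog, where acking the counters unfreezes the
LBRs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers/values as found in the Linux headers. */
#define MSR_CORE_PERF_GLOBAL_STATUS	0x38e
#define MSR_CORE_PERF_GLOBAL_CTRL	0x38f
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL	0x390
#define APIC_LVTPC			0x340
#define APIC_DM_NMI			0x00400
#define APIC_LVT_MASKED			(1 << 16)

enum step { STEP_DISABLE, STEP_APIC_ACK, STEP_PMU_ACK, STEP_ENABLE };

/* Hypothetical toy state, standing in for the real hardware. */
struct mock_pmu {
	uint64_t global_ctrl;
	uint64_t global_status;
	uint32_t lvtpc;
	int lbr_frozen;
	enum step log[8];
	size_t nlog;
};

static void log_step(struct mock_pmu *p, enum step s)
{
	if (p->nlog < sizeof(p->log) / sizeof(p->log[0]))
		p->log[p->nlog++] = s;
}

/* Toy WRMSR: only the two MSRs used by the sequence are modelled. */
static void mock_wrmsr(struct mock_pmu *p, uint32_t msr, uint64_t val)
{
	switch (msr) {
	case MSR_CORE_PERF_GLOBAL_CTRL:
		p->global_ctrl = val;
		log_step(p, val ? STEP_ENABLE : STEP_DISABLE);
		break;
	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
		p->global_status &= ~val;
		p->lbr_frozen = 0;	/* assumed Skylake behaviour */
		log_step(p, STEP_PMU_ACK);
		break;
	}
}

/* Toy APIC write: only the PMI LVT entry is modelled. */
static void mock_apic_write(struct mock_pmu *p, uint32_t reg,
			    uint32_t val)
{
	if (reg == APIC_LVTPC) {
		p->lvtpc = val;
		log_step(p, STEP_APIC_ACK);
	}
}

/* Overflow on one counter: status set, LBRs frozen, LVT masked. */
void mock_raise_pmi(struct mock_pmu *p, unsigned int counter)
{
	p->global_status |= 1ULL << counter;
	p->lbr_frozen = 1;
	p->lvtpc |= APIC_LVT_MASKED;
}

/* The four steps, in the order the changelog lists them. */
void pmi_handler_sketch(struct mock_pmu *p, uint64_t enabled_mask)
{
	uint64_t status;

	/* 1. Disable counters: no further increments, no new PMI. */
	mock_wrmsr(p, MSR_CORE_PERF_GLOBAL_CTRL, 0);
	status = p->global_status;	/* stands in for an RDMSR */

	/* ... sample processing would go here ... */

	/* 2. Ack the APIC: rewrite the LVT entry without the mask. */
	mock_apic_write(p, APIC_LVTPC, APIC_DM_NMI);

	/* 3. Ack the PMU counters (unfreezes LBRs in this model). */
	mock_wrmsr(p, MSR_CORE_PERF_GLOBAL_OVF_CTRL, status);

	/* 4. Re-enable the counters. */
	mock_wrmsr(p, MSR_CORE_PERF_GLOBAL_CTRL, enabled_mask);
}
```

Calling mock_raise_pmi(&pmu, 0) and then pmi_handler_sketch(&pmu, 0x1)
on a zeroed struct mock_pmu should leave the LVT entry unmasked, the
status clear and the LBRs unfrozen, with the four steps logged in the
order listed.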
