Re: [PATCH v4 8/9] KVM: selftests: Test Intel supported fixed counters bit mask

From: Sean Christopherson
Date: Fri Oct 20 2023 - 15:07:02 EST


On Mon, Sep 11, 2023, Jinrong Liang wrote:
> From: Jinrong Liang <cloudliang@xxxxxxxxxxx>
>
> Add a test to check that fixed counters enabled via guest
> CPUID.0xA.ECX (instead of EDX[04:00]) work as normal as usual.
>
> Co-developed-by: Like Xu <likexu@xxxxxxxxxxx>
> Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
> Signed-off-by: Jinrong Liang <cloudliang@xxxxxxxxxxx>
> ---
> .../selftests/kvm/x86_64/pmu_counters_test.c | 54 +++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> index df76f0f2bfd0..12c00bf94683 100644
> --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -301,6 +301,59 @@ static void test_intel_counters_num(void)
> test_oob_fixed_ctr(nr_fixed_counters + 1);
> }
>
> +static void fixed_counters_guest_code(void)
> +{
> + uint64_t supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);
> + uint32_t nr_fixed_counter = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
> + uint64_t msr_val;
> + unsigned int i;
> + bool expected;
> +
> + for (i = 0; i < nr_fixed_counter; i++) {
> + expected = supported_bitmask & BIT_ULL(i) || i < nr_fixed_counter;
> +
> + wrmsr_safe(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
> + wrmsr_safe(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * i));
> + wrmsr_safe(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(PMC_IDX_FIXED + i));
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> + wrmsr_safe(MSR_CORE_PERF_GLOBAL_CTRL, 0);
> + rdmsr_safe(MSR_CORE_PERF_FIXED_CTR0 + i, &msr_val);

Y'all are making this way harder than it needs to be. The previous patch already
created a testcase to verify fixed counters, just use that! Then test case verify
that trying to enable unsupported fixed counters results in #GP, as opposed to the
above which doesn't do any actual checking, e.g. KVM could completely botch the
{RD,WR}MSR emulation but pass the test by not programming up a counter in perf.

I.e. rather than have a separate test for the supported bitmask goofiness, have
the fixed counters test iterate over the bitmask. And then add a patch to verify
the counters can be enabled and actually count.

And peeking ahead at the vPMU version test, it's the exact same story there.
Instead of hardcoding one-off tests, iterate on the version. The end result is
that the test provides _more_ coverage with _less_ code. And without any of the
hardcoded magic that takes a crystal ball to understand.

*sigh*

And even more importantly, this test is complete garbage. The SDM clearly states
that

With Architectural Performance Monitoring Version 5, register CPUID.0AH.ECX
indicates Fixed Counter enumeration. It is a bit mask which enumerates the
supported Fixed Counters in a processor. If bit 'i' is set, it implies that
Fixed Counter 'i' is supported.

*sigh*

The test passes because it only iterates over counters < nr_fixed_counter. So
as written, the test worse than useless. It provides no meaningful value and is
actively misleading.

for (i = 0; i < nr_fixed_counter; i++) {

Maybe I haven't been explicit enough: the point of writing tests is to find and
prevent bugs, not to get the tests passing. That isn't to say we don't want a
clean testgrid, but writing a "test" that doesn't actually test anything is a
waste of everyone's time.

I appreciate that the PMU is subtle and complex (understatement), but things like
this, where observing that the result of "supported_bitmask & BIT_ULL(i)" doesn't
actually affect anything, doesn't require PMU knowledge.

for (i = 0; i < nr_fixed_counter; i++) {
expected = supported_bitmask & BIT_ULL(i) || i < nr_fixed_counter;

A concrete suggestion for writing tests: introduce bugs in what you're testing
and verify that the test actually detects the bugs. If you tried to do that for
the above bitmask test you would have discovered you can't break KVM because KVM
doesn't support this! And if your test doesn't detect the bugs, that should also
be a big clue that something isn't quite right.