Re: [PATCH kvm-unit-tests] x86/pmu: Add Intel Guest Transactional (commited) cycles testcase

From: Sean Christopherson
Date: Wed Dec 07 2022 - 12:33:00 EST


On Wed, Dec 07, 2022, Like Xu wrote:
> From: Like Xu <likexu@xxxxxxxxxxx>
>
> On Intel platforms with TSX feature, pmu users in guest can collect
> the commited or total transactional cycles for a tsx-enabled workload,
> adding new test cases to cover them, as they are not strictly the same
> as normal hardware events from the KVM implementation point of view.
>
> Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
> ---
> x86/pmu.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 72 insertions(+), 1 deletion(-)
>
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 72c2c9c..d4c6813 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -20,7 +20,7 @@
>
> typedef struct {
> uint32_t ctr;
> - uint32_t config;
> + uint64_t config;
> uint64_t count;
> int idx;
> } pmu_counter_t;
> @@ -547,6 +547,76 @@ static void check_emulated_instr(void)
> report_prefix_pop();
> }
>
> +#define _XBEGIN_STARTED (~0u)
> +
> +static inline int

This should be "unsigned int". EAX can yield a negative value, e.g. via "XABORT
0xff", which is why the compiler instrinsics use this explicit, unsigned value
(that relies on reserved bits in the abort status).

> _xbegin(void)

These having nothing to do with the PMU, i.e. belong in processor.h.

The naming is also non-stanard, i.e. drop the underscore. I assume you're
trying to match the compiler instrinsics, but that's bound to do more harm than
good. Either use the instrinsics or write code that aligns with KUT's style.
Using instrinsics is probably a bad idea because eventually we'll want to do
something weird, e.g. provide a bogus fallback address. And having to go lookup
gcc/clang documentation is rather annoything.

> +{
> + int ret = _XBEGIN_STARTED;

Newline after declarations.

> + asm volatile(".byte 0xc7,0xf8 ; .long 0" : "+a" (ret) :: "memory");

This is just mean.

unsigned int ret = XBEGIN_STARTED;

asm volatile("xbegin 1f\n\t"
"1:\n\t"
: "+a" (ret) :: "memory");
return ret;

> + return ret;
> +}
> +
> +static inline void _xend(void)
> +{
> + asm volatile(".byte 0x0f,0x01,0xd5" ::: "memory");

Like XBEGIN, use the mnemonic.

> +}
> +
> +int *ptr;

I'm honestly at a loss for words.

> +static void tsx_fault(void)

s/fault/abort. Yes, a fault causes an abort, but no fault is ever observed by
software. Though I don't quite understand why helpers are needed in the first
place.

> +{
> + int value = 0;
> +
> + ptr = NULL;
> + if(_xbegin() == _XBEGIN_STARTED) {

Space after the "if".

> + value++;
> + // causes abort
> + *ptr = value;

/* Generate a non-canonical #GP to trigger ABORT. */
(int *)NONCANONICAL) = 0;

> + _xend();

Why bother with XEND?

> + }
> +}
> +
> +static void tsx_normal(void)
> +{
> + int value = 0;
> +
> + if(_xbegin() == _XBEGIN_STARTED) {
> + value++;

What's the purpose of incrementing an arbitrary value?

> + _xend();

Does this test rely on the region being successfully committed? If so, the test
is guaranteed to be flaky, e.g. due to a host IRQ at the "wrong" time. Assuming
success is not required, please add a comment describing the requirements.

> + }
> +}
> +
> +static void check_tsx_cycles(void)
> +{
> + pmu_counter_t cnt;
> + int i;
> +
> + if (!this_cpu_has(X86_FEATURE_RTM) || !this_cpu_has(X86_FEATURE_HLE))
> + return;
> +
> + report_prefix_push("TSX cycles");
> +
> + for (i = 0; i < pmu.nr_gp_counters; i++) {
> + cnt.ctr = MSR_GP_COUNTERx(i);
> +
> + if (i == 2)
> + /* Transactional cycles commited only on gp counter 2 */
> + cnt.config = EVNTSEL_OS | EVNTSEL_USR | 0x30000003c;
> + else
> + /* Transactional cycles */
> + cnt.config = EVNTSEL_OS | EVNTSEL_USR | 0x10000003c;
> +
> + start_event(&cnt);
> + tsx_fault();
> + tsx_normal();

As a above, why bother with helpers? Can't this just be:


start_event(&cnt);

/* Generate a non-canonical #GP to trigger ABORT. */
if (xbegin() == XBEGIN_STARTED)
*(int *)NONCANONICAL = 0;

/* <comment about what requirements of this code> */
if (xbegin() == XBEGIN_STARTED)
xend();

stop_event(&cnt);

> + stop_event(&cnt);
> +
> + report(cnt.count > 0, "gp cntr-%d", i);
> + }
> +
> + report_prefix_pop();
> +}
> +
> static void check_counters(void)
> {
> if (is_fep_available())
> @@ -559,6 +629,7 @@ static void check_counters(void)
> check_counter_overflow();
> check_gp_counter_cmask();
> check_running_counter_wrmsr();
> + check_tsx_cycles();
> }
>
> static void do_unsupported_width_counter_write(void *index)
> --
> 2.38.1
>