Re: [PATCH] KVM: x86: vPMU: truncate counter value to allowed width

From: Sean Christopherson
Date: Mon Jun 05 2023 - 20:51:58 EST


On Thu, May 04, 2023, Roman Kagan wrote:
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 5c7bbf03b599..6a91e1afef5a 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -60,6 +60,12 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
> return counter & pmc_bitmask(pmc);
> }
>
> +static inline void pmc_set_counter(struct kvm_pmc *pmc, u64 val)
> +{
> + pmc->counter += val - pmc_read_counter(pmc);

Ugh, not your code, but I don't see how the current code can possibly be correct.

The above unpacks to

	counter = pmc->counter;
	if (pmc->perf_event && !pmc->is_paused)
		counter += perf_event_read_value(pmc->perf_event,
						 &enabled, &running);
	pmc->counter += val - (counter & pmc_bitmask(pmc));

which distills down to

	counter = 0;
	if (pmc->perf_event && !pmc->is_paused)
		counter += perf_event_read_value(pmc->perf_event,
						 &enabled, &running);
	pmc->counter = val - (counter & pmc_bitmask(pmc));

or more succinctly

	if (pmc->perf_event && !pmc->is_paused)
		val -= perf_event_read_value(pmc->perf_event, &enabled, &running);

	pmc->counter = val;

which is obviously wrong. E.g. if the guest writes '0' to an active counter, the
adjustment will cause pmc->counter to be loaded with a large (in unsigned terms)
value, and thus quickly overflow after a write of '0'.

I assume the intent is to purge any accumulated counts that occurred since the
last read, but *before* the write, but then shouldn't this just be:

	/* Purge any events that were accumulated before the write. */
	if (pmc->perf_event && !pmc->is_paused)
		(void)perf_event_read_value(pmc->perf_event, &enabled, &running);

	pmc->counter = val & pmc_bitmask(pmc);

Am I missing something?

I'd like to get this sorted out before applying this patch, because I really want
to document what on earth this new helper is doing. Seeing a bizarre partial
RMW operation in a helper with "set" as the verb is super weird.