Re: [BUG] Guest OSes die simultaneously (bisected)

From: Sean Christopherson
Date: Thu Jan 04 2024 - 12:26:29 EST


On Thu, Jan 04, 2024, Paolo Bonzini wrote:
> On 1/4/24 17:06, Paul E. McKenney wrote:
> > Instead, the point I am trying to make is that carefully
> > constructed tests can serve as tireless and accurate code reviewers.
> > This won't ever replace actual code review, but my experience indicates
> > that it will help find more bugs more quickly and more easily.
>
> TBH this (conflict between virtual addresses on the host and the guest
> leading to corruption of the guest) is probably not the kind of adversarial
> test that one would have written or suggested right off the bat.

I disagree. The flaws with PEBS using a virtual address are blatantly obvious to
anyone who has spent any time dealing with the cross-section of PMU and VMX.
Intel even explicitly added "isolation" functionality to ensure PEBS can't overrun
VM-Enter and generate host records in the guest. Not to mention that Intel
specifically addressed the virtual addressing issue in the design of Processor
Trace (PT, a.k.a. RTIT).

In other words, we *knew* exactly what would break *and* there had been breakage
in the past. Chalk it up to messed-up priorities, poor test infrastructure, or
anything along those lines. But we shouldn't pretend that this was some obscure
edge case that didn't warrant a dedicated test from the get-go.