Re: [PATCH V2 1/5] para virt interface of perf to support kvm guest os statistics collection in guest os

From: Zhang, Yanmin
Date: Tue Jun 22 2010 - 03:47:40 EST


On Tue, 2010-06-22 at 09:14 +0200, Jes Sorensen wrote:
> On 06/22/10 03:49, Zhang, Yanmin wrote:
> > On Mon, 2010-06-21 at 14:45 +0300, Avi Kivity wrote:
> >> Since the guest can use NMI to read the
> >> counter, it should have the highest possible priority, and thus it
> >> shouldn't see any overflow unless it configured the threshold really low.
> >>
> >> If we drop overflow, we can use the RDPMC instruction instead of
> >> KVM_PERF_OP_READ. This allows the guest to allow userspace to read a
> >> counter, or prevent userspace from reading the counter, by setting cr4.pce.
> > 1) para virt perf interface is to hide PMU hardware in host os. Guest os shouldn't
> > access PMU hardware directly. We could expose PMU hardware to guest os directly, but
> > that would be another guest os PMU support method. It shouldn't be a part of para virt
> > interface.
> > 2) Consider below scenario: PMU counter overflows and NMI causes guest os vmexit to
> > host kernel. Host kernel schedules the vcpu thread to another physical cpu before
> > vmenter the guest os again. So later on, guest os just RDPMC the counter on another
> > cpu.
> >
> > So I think above discussion is around how to expose PMU hardware to guest os. I will
> > also check this method after the para virt interface is done.
>
> You should be able to expose the counters as read-only to the guest. KVM
> allows you to specify whether or not a guest has read, write or
> read/write access. If you allowed read access of the counters that would
> save a fair bit of hypercalls.
Thanks. KVM is good at configuring register access permissions. But things are not that
simple in a real running environment. The host kernel might schedule the guest os vcpu
thread to other cpus, or other non-kvm processes might preempt the vcpu thread on this
cpu.

To support the capability you describe, we would eventually have to implement direct
exposure of PMU hardware to the guest os.
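
A toy user-space model of the scheduling problem, only as a sketch. The names below
(struct machine, struct vcpu, vcpu_sched_out, vcpu_sched_in and so on) are hypothetical
and are not KVM or perf code. It assumes one event counter per physical cpu and shows
that a raw read after migration returns the new cpu's count unless the host saves and
restores the guest's value.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2

/* Toy model (hypothetical): one event counter per physical cpu. */
struct machine {
	uint64_t hw_counter[NR_CPUS];
};

/* Hypothetical host-side state for one vcpu thread. */
struct vcpu {
	int cpu;          /* physical cpu the vcpu thread currently runs on */
	uint64_t saved;   /* guest counter value saved at sched-out */
};

/* Count n events on a physical cpu, whoever happens to run there. */
static void run_events(struct machine *m, int cpu, uint64_t n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->hw_counter[cpu] += n;
}

/* Raw read in the spirit of RDPMC: whatever the current cpu's counter holds. */
static uint64_t raw_read(const struct machine *m, const struct vcpu *v)
{
	return m->hw_counter[v->cpu];
}

/* Host saves the guest's counter when the vcpu thread is scheduled out... */
static void vcpu_sched_out(const struct machine *m, struct vcpu *v)
{
	v->saved = m->hw_counter[v->cpu];
}

/* ...and reloads it on whichever cpu the thread is scheduled in next.
 * In this toy model that overwrites the other task's count on that cpu. */
static void vcpu_sched_in(struct machine *m, struct vcpu *v, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	v->cpu = cpu;
	m->hw_counter[cpu] = v->saved;
}

/* Guest counts 100 events on cpu0, unrelated work counts 7 on cpu1, then
 * the vcpu thread moves to cpu1 with no host involvement. */
static uint64_t read_after_naive_migration(void)
{
	struct machine m = { { 0 } };
	struct vcpu v = { 0, 0 };

	run_events(&m, 0, 100);
	run_events(&m, 1, 7);
	v.cpu = 1;
	return raw_read(&m, &v);
}

/* Same scenario, but the host saves and restores around the migration. */
static uint64_t read_after_managed_migration(void)
{
	struct machine m = { { 0 } };
	struct vcpu v = { 0, 0 };

	run_events(&m, 0, 100);
	run_events(&m, 1, 7);
	vcpu_sched_out(&m, &v);
	vcpu_sched_in(&m, &v, 1);
	return raw_read(&m, &v);
}
```

In this model read_after_naive_migration() returns 7, the unrelated count on cpu1, while
read_after_managed_migration() returns the guest's 100.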

>
> Question is if it is safe to drop overflow support?
Not safe. One of the PMU hardware design objectives is to use an interrupt or NMI to
notify software when an event counter overflows. Without overflow support, software
needs to poll the PMU registers in a loop. That is not good and consumes more cpu
resources.
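
To make the cost difference concrete, here is a small sketch in plain C. It is a toy
model, not PMU or perf code: struct counter, overflow_handler, sample_with_overflow and
sample_with_polling are hypothetical names, and the polling variant assumes the worst
case where software reads the register after every event.

```c
#include <assert.h>
#include <stdint.h>

/* Toy counter (hypothetical): counts up and overflows every `period` events. */
struct counter {
	uint64_t value;
	uint64_t period;
	unsigned long overflows;   /* times the overflow handler ran */
};

/* Stand-in for the overflow interrupt/NMI handler: take a sample, re-arm. */
static void overflow_handler(struct counter *c)
{
	c->overflows++;
	c->value = 0;
}

/* One hardware event; software is involved only when the period is reached. */
static void count_event(struct counter *c)
{
	if (++c->value == c->period)
		overflow_handler(c);
}

/* Overflow-driven sampling. Returns how many times software had to run. */
static unsigned long sample_with_overflow(uint64_t events, uint64_t period)
{
	struct counter c = { 0, period, 0 };

	assert(period > 0);
	for (uint64_t i = 0; i < events; i++)
		count_event(&c);
	return c.overflows;
}

/* Polling without overflow support: software reads the counter after every
 * event to notice the threshold. Returns the number of register reads and
 * stores the number of samples taken in *samples. */
static unsigned long sample_with_polling(uint64_t events, uint64_t period,
					 unsigned long *samples)
{
	uint64_t value = 0;
	unsigned long reads = 0;

	assert(period > 0);
	*samples = 0;
	for (uint64_t i = 0; i < events; i++) {
		value++;                 /* hardware counts the event */
		reads++;                 /* software polls the register */
		if (value >= period) {
			(*samples)++;
			value = 0;
		}
	}
	return reads;
}
```

With 1000 events and a period of 100, both variants take 10 samples, but the
overflow-driven one runs software 10 times while the polling one does 1000 register
reads. Polling less often would cut the reads, at the price of noticing the threshold
late.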

Besides the para virt perf interface, I'm also considering direct exposure of PMU
hardware to the guest os. But that would be a very different implementation. We should
not combine it with the pv interface. Perhaps our target is to implement both, so that
an unmodified guest os could also get perf statistics support.

Yanmin

