Re: [PATCH V2 1/1] kvm/vmx: Add a tracepoint write_tsc_offset

From: Marcelo Tosatti
Date: Fri Jun 07 2013 - 16:42:22 EST


On Thu, Jun 06, 2013 at 02:33:06PM +0300, Gleb Natapov wrote:
> On Wed, Jun 05, 2013 at 09:23:22PM -0300, Marcelo Tosatti wrote:
> > On Tue, Jun 04, 2013 at 05:36:19PM +0900, Yoshihiro YUNOMAE wrote:
> > > Add a tracepoint write_tsc_offset for tracing TSC offset changes.
> > > We want to merge ftrace's trace data of guest OSs and the host OS in
> > > chronological order, using TSC for timestamps. We need the "TSC offset"
> > > value for each guest when merging them, because the TSC value on a guest
> > > is always the host TSC plus the guest's TSC offset. If we get the TSC
> > > offset values, we can calculate the host TSC value for each guest event
> > > from the TSC offset and the event TSC value. The host TSC values of the
> > > guest events are used when we want to merge trace data of guests and the
> > > host in chronological order.
> > > (Note: the trace_clock of both the host and the guest must be set to
> > > x86-tsc in this case)
> > >
> > > The TSC offset is stored in the VMCS by vmx_write_tsc_offset() or
> > > vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots.
> > > The latter function is executed when kvm clock is updated. Only the host can
> > > read the TSC offset value from the VMCS, so the host needs to output the TSC
> > > offset value when the TSC offset is changed.
> > >
> > > Since the TSC offset does not change often, its event could be overwritten
> > > by other, more frequent events while tracing. To avoid that, I recommend
> > > using a separate instance for capturing this event:
> > >
> > > 1. set up an instance before booting a guest
> > > # cd /sys/kernel/debug/tracing/instances
> > > # mkdir tsc_offset
> > > # cd tsc_offset
> > > # echo x86-tsc > trace_clock
> > > # echo 1 > events/kvm/kvm_write_tsc_offset/enable
> > >
> > > 2. boot a guest
> > >
> > > Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@xxxxxxxxxxx>
> > > Cc: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> > > Cc: Gleb Natapov <gleb@xxxxxxxxxx>
> > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > > ---
> > > arch/x86/kvm/trace.h | 18 ++++++++++++++++++
> > > arch/x86/kvm/vmx.c | 3 +++
> > > arch/x86/kvm/x86.c | 1 +
> > > 3 files changed, 22 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> > > index fe5e00e..9c22e39 100644
> > > --- a/arch/x86/kvm/trace.h
> > > +++ b/arch/x86/kvm/trace.h
> > > @@ -815,6 +815,24 @@ TRACE_EVENT(kvm_track_tsc,
> > > __print_symbolic(__entry->host_clock, host_clocks))
> > > );
> > >
> > > +TRACE_EVENT(kvm_write_tsc_offset,
> > > + TP_PROTO(__u64 previous_tsc_offset, __u64 next_tsc_offset),
> > > + TP_ARGS(previous_tsc_offset, next_tsc_offset),
> > > +
> > > + TP_STRUCT__entry(
> > > + __field( __u64, previous_tsc_offset )
> > > + __field( __u64, next_tsc_offset )
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + __entry->previous_tsc_offset = previous_tsc_offset;
> > > + __entry->next_tsc_offset = next_tsc_offset;
> > > + ),
> > > +
> > > + TP_printk("previous=%llu next=%llu",
> > > + __entry->previous_tsc_offset, __entry->next_tsc_offset)
> > > +);
> > > +
> >
> > Yoshihiro YUNOMAE,
> >
> > 1) Why is previous_tsc_offset necessary?
> >
> > 2) The TSC offset traces should include the vcpu number, so that it's
> > possible to correlate traces of SMP guests (the tool should use
> > the individual vcpu tsc offsets when converting guest traces).
> >
> Why is PID not enough? No other trace, except kvm_entry, outputs vcpu id.

The guest trace contains the CPU ID.

If PID is exported, it is necessary to perform an additional PID->VCPU
translation, which might not be available by the time the trace is
correlated by the tool (because the guest is down).

So, PID is equivalent to VCPU as long as the translation can be performed.
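
To make the use case concrete, here is a rough sketch of the conversion a
merge tool might do if the event carried the vcpu id directly. This is only
an illustration, not code from the patch: struct vcpu_tsc_offset,
guest_to_host_tsc() and convert_guest_event() are hypothetical names, and it
assumes the CPU ID in the guest trace matches the vcpu id. The arithmetic
follows from guest TSC = host TSC + TSC offset, taken modulo 2^64 since the
offset is stored as an unsigned 64-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-vcpu record the tool would build from
 * kvm_write_tsc_offset events. The vcpu_id field assumes the
 * tracepoint is extended to report it.
 */
struct vcpu_tsc_offset {
	unsigned int vcpu_id;
	uint64_t tsc_offset;	/* value written to the VMCS */
};

/*
 * guest_tsc = host_tsc + tsc_offset (mod 2^64), so the host TSC is
 * recovered by unsigned subtraction, which wraps as required.
 */
uint64_t guest_to_host_tsc(uint64_t guest_tsc, uint64_t tsc_offset)
{
	return guest_tsc - tsc_offset;
}

/*
 * Convert one guest event timestamp using the offset of the vcpu it
 * was recorded on. Returns 0 on success, -1 if the vcpu is unknown.
 */
int convert_guest_event(const struct vcpu_tsc_offset *tbl, size_t n,
			unsigned int vcpu_id, uint64_t guest_tsc,
			uint64_t *host_tsc)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].vcpu_id == vcpu_id) {
			*host_tsc = guest_to_host_tsc(guest_tsc,
						      tbl[i].tsc_offset);
			return 0;
		}
	}
	return -1;
}
```

With only the PID in the event, the lookup key above would have to come from
a separate PID->VCPU table, which is the extra translation step in question.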

What is the preference?


