Re: [PATCH -tip v3 2/5] perf: change perf_event_header.misc to PERF_RECORD_MISC_USER for BTS

From: Akihiro Nagai
Date: Tue Aug 16 2011 - 22:19:56 EST


(2011/08/11 21:18), Peter Zijlstra wrote:
On Thu, 2011-08-11 at 21:06 +0900, Akihiro Nagai wrote:
Change perf_event_header.misc to PERF_RECORD_MISC_USER for
BTS records, because BTS traces both kernel and user space
even though perf requests tracing of only kernel or only user space.

Now I'm confused..

If BTS traces both kernel and user, the MISC bit should reflect the
right state per-sample, on x86 that's easy enough to do by the address.
Yes.
However, PERF_RECORD_MISC_KERNEL can be set only when both from_addr and
to_addr are in kernel space. Since current perf always sets the
IA32_DEBUGCTL_MSR.BTS_OFF_OS flag when it uses BTS, such BTS records are
never output. So it is enough to set only PERF_RECORD_MISC_USER.
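
To make the rule above concrete, here is a minimal user-space sketch of a per-record classification. It is not the patch itself. The PERF_RECORD_MISC_* values are the ones from include/linux/perf_event.h; the helper names (bts_is_kernel_address, bts_cpumode, bts_set_cpumode) are hypothetical, and the address test assumes x86-64, where kernel addresses occupy the upper canonical half.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/linux/perf_event.h. */
#define PERF_RECORD_MISC_CPUMODE_MASK	(7 << 0)
#define PERF_RECORD_MISC_KERNEL		(1 << 0)
#define PERF_RECORD_MISC_USER		(2 << 0)

/*
 * Assumption: x86-64, kernel addresses live in the upper canonical half.
 * Hypothetical helper for illustration, not a kernel API.
 */
static int bts_is_kernel_address(uint64_t addr)
{
	return addr >= 0xffff800000000000ULL;
}

/* Report KERNEL only when both ends of the branch are kernel addresses. */
static uint16_t bts_cpumode(uint64_t from, uint64_t to)
{
	if (bts_is_kernel_address(from) && bts_is_kernel_address(to))
		return PERF_RECORD_MISC_KERNEL;
	return PERF_RECORD_MISC_USER;
}

/* Replace the cpumode bits of misc and leave all other bits untouched. */
static uint16_t bts_set_cpumode(uint16_t misc, uint16_t mode)
{
	assert((mode & ~PERF_RECORD_MISC_CPUMODE_MASK) == 0);
	misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
	misc |= mode;
	return misc;
}
```

With BTS_OFF_OS set, the kernel-to-kernel case never occurs, so bts_cpumode() would return PERF_RECORD_MISC_USER for every record that is actually output.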



---

arch/x86/kernel/cpu/perf_event_intel_ds.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 1b1ef3a..323f3f0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -340,6 +340,14 @@ static int intel_pmu_drain_bts_buffer(void)
*/
perf_prepare_sample(&header, &data, event, &regs);

+ /*
+ * Since BTS can not trace kernel and user space separately, set

Uhm, IA32_DEBUGCTL_MSR.BTS_OFF_{OS,USR} seem to suggest it can?!
Current perf always sets IA32_DEBUGCTL_MSR.BTS_OFF_OS when it traces with BTS.
However, BTS still records branches that jump from kernel (irq_return) to user,
because the flag only stops tracing while in Ring 0. That is different from
"kernel space" in the strict sense.
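
For reference, a small sketch of how the two filter flags combine in the IA32_DEBUGCTL value. The bit positions follow the Intel SDM and the macro names mirror the kernel's msr-index.h; bts_debugctl() is a hypothetical helper written for illustration, not the code perf uses.

```c
#include <stdint.h>

/* IA32_DEBUGCTL bit positions as documented in the Intel SDM. */
#define DEBUGCTLMSR_TR			(1ULL << 6)	/* send branch records as trace messages */
#define DEBUGCTLMSR_BTS			(1ULL << 7)	/* store them in the BTS buffer */
#define DEBUGCTLMSR_BTS_OFF_OS		(1ULL << 9)	/* skip branches taken at CPL 0 */
#define DEBUGCTLMSR_BTS_OFF_USR		(1ULL << 10)	/* skip branches taken at CPL > 0 */

/*
 * Hypothetical helper: build a DEBUGCTL value that enables BTS and
 * filters out whichever privilege level the caller does not want.
 */
static uint64_t bts_debugctl(int trace_kernel, int trace_user)
{
	uint64_t val = DEBUGCTLMSR_TR | DEBUGCTLMSR_BTS;

	if (!trace_kernel)
		val |= DEBUGCTLMSR_BTS_OFF_OS;
	if (!trace_user)
		val |= DEBUGCTLMSR_BTS_OFF_USR;
	return val;
}
```

The filter is applied by privilege level at the time of the branch, not by the branch addresses, which is why the address-based view and the flag-based view can disagree at the kernel/user boundary.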


+ * PERF_RECORD_MISC_USER in header.misc to resolve both kernel and
+ * user DSOs and symbols.
+ */
+ header.misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
+ header.misc |= PERF_RECORD_MISC_USER;


So what's wrong with something like:

header.misc |= is_kernel_address(at->from) ?
PERF_RECORD_MISC_KERNEL :
PERF_RECORD_MISC_USER;

It looks good.
However, current perf doesn't output "kernel to kernel" BTS records,
so it is not necessary yet.

Thank you.

if (perf_output_begin(&handle, event, header.size * (top - at)))
return 1;




