Re: [PATCH RFC 04/10] perf: Introduce deferred user callchains

From: Namhyung Kim
Date: Wed Nov 15 2023 - 11:13:40 EST


On Mon, Nov 13, 2023 at 08:56:39AM -0800, Namhyung Kim wrote:
> On Sat, Nov 11, 2023 at 10:49 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 10, 2023 at 10:57:58PM -0800, Namhyung Kim wrote:
> > > Anyway I'm not sure it can support these additional samples for
> > > deferred callchains without breaking the existing perf tools.
> > > Anyway it doesn't know PERF_CONTEXT_USER_DEFERRED at least.
> > > I think this should be controlled by a new feature bit in the
> > > perf_event_attr.
> > >
> > > Then we might add a breaking change to have a special
> > > sample record for the deferred callchains and sample ID only.
> >
> > Sounds like a good idea. I'll need to study the code to figure out how
> > to do that on the perf tool side. Or would you care to write a patch?
>
> Sure, I'd be happy to write one.

I think we can start with something like the patch below.
To enable defer_callchain, the sample id (attr.sample_type) needs to
include PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_TID | PERF_SAMPLE_TIME,
so that each deferred callchain record can be matched to its sample.
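
As a rough illustration of that requirement (this is not part of the
patch; the helper name and the SKETCH_ macros are made up for this
example, only the bit values are taken from the PERF_SAMPLE_*
definitions in perf_event.h), the check could look something like:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values mirror PERF_SAMPLE_TID, PERF_SAMPLE_TIME and
 * PERF_SAMPLE_IDENTIFIER in include/uapi/linux/perf_event.h.
 * The SKETCH_ names are hypothetical and exist only in this example. */
#define SKETCH_SAMPLE_TID		(1ULL << 1)
#define SKETCH_SAMPLE_TIME		(1ULL << 2)
#define SKETCH_SAMPLE_IDENTIFIER	(1ULL << 16)

#define SKETCH_DEFER_REQUIRED \
	(SKETCH_SAMPLE_IDENTIFIER | SKETCH_SAMPLE_TID | SKETCH_SAMPLE_TIME)

/* Hypothetical helper: true only if every bit needed to match a sample
 * with its deferred callchain record is set in sample_type. */
static bool sketch_defer_callchain_ok(uint64_t sample_type)
{
	return (sample_type & SKETCH_DEFER_REQUIRED) == SKETCH_DEFER_REQUIRED;
}
```

Where such a check would live (kernel side at event creation, tool side
before setting the bit, or both) is left open here.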

Thanks,
Namhyung


---8<---
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 39c6a250dd1b..a3765ff59798 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -456,7 +456,8 @@ struct perf_event_attr {
 				inherit_thread :  1, /* children only inherit if cloned with CLONE_THREAD */
 				remove_on_exec :  1, /* event is removed from task on exec */
 				sigtrap        :  1, /* send synchronous SIGTRAP on event */
-				__reserved_1   : 26;
+				defer_callchain:  1, /* generate DEFERRED_CALLCHAINS records for userspace */
+				__reserved_1   : 25;

 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -1207,6 +1208,20 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_AUX_OUTPUT_HW_ID		= 21,

+	/*
+	 * Deferred user stack callchains (for SFrame).  Previous samples would
+	 * have kernel callchains only and they need to be stitched with this
+	 * to make full callchains.
+	 *
+	 * struct {
+	 *	struct perf_event_header	header;
+	 *	u64				nr;
+	 *	u64				ips[nr];
+	 *	struct sample_id		sample_id;
+	 * };
+	 */
+	PERF_RECORD_DEFERRED_CALLCHAINS		= 22,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
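
For reference, the record layout in the comment above could be walked on
the tool side roughly as follows. This is only a sketch under stated
assumptions, not what the perf tool does: every sketch_ name is
hypothetical, the record type value 22 is the one proposed in this RFC
and is not in any released header, and the trailing sample_id is left
opaque because its contents depend on attr.sample_type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value proposed in this RFC; hypothetical name for this example. */
#define SKETCH_RECORD_DEFERRED_CALLCHAINS 22

/* Same field layout as struct perf_event_header. */
struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/* Hypothetical parsed view; the pointers alias the input buffer. */
struct sketch_deferred_callchain {
	uint64_t nr;
	const unsigned char *ips;	/* nr entries of 8 bytes each */
	const unsigned char *sample_id;	/* opaque trailer */
	size_t sample_id_len;
};

/* Returns 0 on success, -1 if the buffer does not hold a well-formed
 * record of the proposed type. */
static int sketch_parse_deferred(const unsigned char *buf, size_t len,
				 struct sketch_deferred_callchain *out)
{
	struct sketch_event_header hdr;
	uint64_t nr;
	size_t payload;

	if (len < sizeof(hdr) + sizeof(nr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.type != SKETCH_RECORD_DEFERRED_CALLCHAINS)
		return -1;
	if (hdr.size > len || hdr.size < sizeof(hdr) + sizeof(nr))
		return -1;

	memcpy(&nr, buf + sizeof(hdr), sizeof(nr));
	payload = hdr.size - sizeof(hdr) - sizeof(nr);
	if (nr > payload / sizeof(uint64_t))
		return -1;	/* ips[] would run past the record */

	out->nr = nr;
	out->ips = buf + sizeof(hdr) + sizeof(nr);
	out->sample_id = out->ips + (size_t)nr * sizeof(uint64_t);
	out->sample_id_len = payload - (size_t)nr * sizeof(uint64_t);
	return 0;
}

/* Copy out one entry; memcpy avoids unaligned loads from the buffer. */
static uint64_t sketch_ip_at(const struct sketch_deferred_callchain *cc,
			     uint64_t i)
{
	uint64_t ip;

	memcpy(&ip, cc->ips + (size_t)i * sizeof(ip), sizeof(ip));
	return ip;
}
```

With IDENTIFIER | TID | TIME in sample_type, the trailer would be
expected to carry pid/tid, time and the identifier, which is what would
let the tool find the earlier kernel-only sample to stitch with.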