Re: [PATCH 2/4] tracing/user_events: Introduce multi-format events

From: Beau Belgrave
Date: Tue Jan 30 2024 - 13:05:40 EST


On Mon, Jan 29, 2024 at 09:24:07PM -0500, Steven Rostedt wrote:
> On Mon, 29 Jan 2024 09:29:07 -0800
> Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > Thanks, yeah ideally we wouldn't use special characters.
> >
> > I'm not picky about this. However, I did want something that clearly
> > allowed a glob pattern to find all versions of a given register name of
> > user_events by user programs that record. The dot notation will pull in
> > more than expected if dotted namespace style names are used.
> >
> > An example is "Asserts" and "Asserts.Verbose" from different programs.
> > If we tried to find all versions of "Asserts" via glob of "Asserts.*" it
> > will pull in "Asserts.Verbose.1" in addition to "Asserts.0".
>
> Do you prevent brackets in names?
>

No. However, since brackets have distinct start and end tokens, finding
all versions of your event is trivial compared to a single dot.

Imagine two events:
Asserts
Asserts[MyCoolIndex]

Resolves to tracepoints of:
Asserts:[0]
Asserts[MyCoolIndex]:[1]

Regardless of brackets in the names, a simple glob of Asserts:\[*\] only
finds Asserts:[0]. This works because the glob spells out the full event
name followed by ":[" and also requires the end bracket.

If I register another "version" of Asserts, then I'll have:
Asserts:[0]
Asserts[MyCoolIndex]:[1]
Asserts:[2]

The glob of Asserts:\[*\] will return both:
Asserts:[0]
Asserts:[2]

At this point the program can either record all versions or scan further
to find which version of Asserts is wanted.
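
To make the lookup concrete, here is a rough sketch of how a recording
program might do it. It is only an illustration: versions_of() is a
made-up helper, the list stands in for a scan of tracefs, and Python's
fnmatch has no backslash escape, so the literal brackets are written as
[[] and []] instead of \[ and \].

```python
import fnmatch
import glob

# Tracepoint names from the example above (stand-in for a tracefs scan).
tracepoints = ["Asserts:[0]", "Asserts[MyCoolIndex]:[1]", "Asserts:[2]"]


def versions_of(name, names):
    """Return every tracepoint in names that is a version of name.

    Hypothetical helper: glob.escape() neutralizes any brackets in the
    registered name, then ":[[]*[]]" is the fnmatch spelling of the
    shell glob ":\\[*\\]".
    """
    pattern = glob.escape(name) + ":[[]*[]]"
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


print(versions_of("Asserts", tracepoints))
print(versions_of("Asserts[MyCoolIndex]", tracepoints))
```

The first call should return only Asserts:[0] and Asserts:[2], and the
second only Asserts[MyCoolIndex]:[1], even though that name contains
brackets itself.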

> >
> > While a glob of "Asserts.[0-9]" works when the unique ID is 0-9, it
> > doesn't work if the number is higher, like 128. If we ever decide to
> > change the ID from an integer to say hex to save space, these globs
> > would break.
> >
> > Is there some scheme that fits the C-variable name that addresses the
> > above scenarios? Brackets gave me a simple glob that seemed to prevent a
> > lot of this ("Asserts.\[*\]" in this case).
>
> Prevent a lot of what? I'm not sure what your example here is.
>

I'll try again :)

We have 2 events registered via user_events:
Asserts
Asserts.Verbose

Using dot notation these would result in tracepoints of:
user_events_multi/Asserts.0
user_events_multi/Asserts.Verbose.1

Using bracket notation these would result in tracepoints of:
user_events_multi/Asserts:[0]
user_events_multi/Asserts.Verbose:[1]

A recording program only wants to enable the Asserts tracepoint. It does
not want to record the Asserts.Verbose tracepoint.

The program must find the right tracepoint by scanning tracefs under the
user_events_multi system.

A single dot suffix does not allow a simple glob to be used. The glob
Asserts.* will return both Asserts.0 and Asserts.Verbose.1.

A simple glob of Asserts:\[*\] will only find Asserts:[0]; it will not
find Asserts.Verbose:[1].
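
The two notations can be compared side by side with a small sketch. The
lists below are just the names from this example rather than a real
tracefs listing, and the bracket glob is written in Python's fnmatch
syntax ([[] and []] for literal brackets), which is an assumption about
the tooling and not something the series requires.

```python
import fnmatch

# The same two events under each naming scheme.
dot_names = ["Asserts.0", "Asserts.Verbose.1"]
bracket_names = ["Asserts:[0]", "Asserts.Verbose:[1]"]

# Dot suffix: the glob cannot tell a version suffix from a namespace.
dot_hits = [n for n in dot_names if fnmatch.fnmatchcase(n, "Asserts.*")]

# Bracket suffix: equivalent of the shell glob Asserts:\[*\].
bracket_hits = [
    n for n in bracket_names if fnmatch.fnmatchcase(n, "Asserts:[[]*[]]")
]

print(dot_hits)      # both names match
print(bracket_hits)  # only the Asserts event matches
```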

We could just use brackets and not have the colon (Asserts[0] in this
case). But brackets are still special for bash.

> >
> > Are we confident that we always want to represent the ID as a base-10
> > integer vs a base-16 integer? The suffix will be ABI to ensure recording
> > programs can find their events easily.
>
> Is there a difference to what we choose?
>

If a simple glob of event_name:\[*\] cannot be used, then we must document
what the suffix format is, so an appropriate regex can be created. If we
start with base-10 then later move to base-16 we will break existing regex
patterns on the recording side.
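
As a sketch of that breakage, assume a recording program hard-codes a
decimal-only regex for the suffix. The ID "1f" and the lowercase,
unprefixed hex formatting below are hypothetical, picked only to show
the failure mode.

```python
import re

# A recorder written against a base-10 suffix.
decimal_suffix = re.compile(r"^Asserts:\[[0-9]+\]$")
# The same recorder if the documentation had promised base-16.
hex_suffix = re.compile(r"^Asserts:\[[0-9a-fA-F]+\]$")

# Hypothetical names: one ID that looks the same in both bases, and one
# that only exists as hex.
names = ["Asserts:[9]", "Asserts:[1f]"]

decimal_hits = [n for n in names if decimal_suffix.match(n)]
hex_hits = [n for n in names if hex_suffix.match(n)]

print(decimal_hits)  # misses Asserts:[1f]
print(hex_hits)      # finds both
```

The decimal pattern silently drops the second event, which is the kind
of mistake a documented suffix format would help avoid.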

I prefer, and have in this series, a base-16 output since it saves on
the tracepoint name size.

Either way we go, we need to define how recording programs should find
the events they care about. So we must be very clear, IMHO, about the
format of the tracepoint names in our documentation.

I personally think recording programs are likely to get this wrong
without proper guidance.

Thanks,
-Beau

> -- Steve