Re: [PATCH 2/4] tracing/user_events: Introduce multi-format events

From: Steven Rostedt
Date: Tue Jan 30 2024 - 13:52:26 EST


On Tue, 30 Jan 2024 10:05:15 -0800
Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, Jan 29, 2024 at 09:24:07PM -0500, Steven Rostedt wrote:
> > On Mon, 29 Jan 2024 09:29:07 -0800
> > Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Thanks, yeah ideally we wouldn't use special characters.
> > >
> > > I'm not picky about this. However, I did want something that clearly
> > > allowed a glob pattern to find all versions of a given register name of
> > > user_events by user programs that record. The dot notation will pull in
> > > more than expected if dotted namespace style names are used.
> > >
> > > An example is "Asserts" and "Asserts.Verbose" from different programs.
> > > If we tried to find all versions of "Asserts" via glob of "Asserts.*" it
> > > will pull in "Asserts.Verbose.1" in addition to "Asserts.0".
> >
> > Do you prevent brackets in names?
> >
>
> No. However, since brackets have a start and end token that are distinct
> finding all versions of your event is trivial compared to a single dot.
>
> Imagine two events:
> Asserts
> Asserts[MyCoolIndex]
>
> Resolves to tracepoints of:
> Asserts:[0]
> Asserts[MyCoolIndex]:[1]
>
> Regardless of brackets in the names, a simple glob of Asserts:\[*\] only
> finds Asserts:[0]. This is because we have that end bracket in the glob
> and the full event name including the start bracket.
>
> If I register another "version" of Asserts, thne I'll have:
> Asserts:[0]
> Asserts[MyCoolIndex]:[1]
> Asserts:[2]
>
> The glob of Asserts:\[*\] will return both:
> Asserts:[0]
> Asserts:[2]

But what if you had registered "Asserts:[MyCoolIndex]:[1]"

Do you prevent colons?

>
> At this point the program can either record all versions or scan further
> to find which version of Asserts is wanted.
>
> > >
> > > While a glob of "Asserts.[0-9]" works when the unique ID is 0-9, it
> > > doesn't work if the number is higher, like 128. If we ever decide to
> > > change the ID from an integer to say hex to save space, these globs
> > > would break.
> > >
> > > Is there some scheme that fits the C-variable name that addresses the
> > > above scenarios? Brackets gave me a simple glob that seemed to prevent a
> > > lot of this ("Asserts.\[*\]" in this case).
> >
> > Prevent a lot of what? I'm not sure what your example here is.
> >
>
> I'll try again :)
>
> We have 2 events registered via user_events:
> Asserts
> Asserts.Verbose
>
> Using dot notation these would result in tracepoints of:
> user_events_multi/Asserts.0
> user_events_multi/Asserts.Verbose.1
>
> Using bracket notation these would result in tracepoints of:
> user_events_multi/Asserts:[0]
> user_events_multi/Asserts.Verbose:[1]
>
> A recording program only wants to enable the Asserts tracepoint. It does
> not want to record the Asserts.Verbose tracepoint.
>
> The program must find the right tracepoint by scanning tracefs under the
> user_events_multi system.
>
> A single dot suffix does not allow a simple glob to be used. The glob
> Asserts.* will return both Asserts.0 and Asserts.Verbose.1.
>
> A simple glob of Asserts:\[*\] will only find Asserts:[0], it will not
> find Asserts.Verbose:[1].
>
> We could just use brackets and not have the colon (Asserts[0] in this
> case). But brackets are still special for bash.

Are these shell scripts or programs. I use regex in programs all the time.
And if you have shell scripts, use awk or something.

Unless you prevent something from being added, I don't see the protection.

>
> > >
> > > Are we confident that we always want to represent the ID as a base-10
> > > integer vs a base-16 integer? The suffix will be ABI to ensure recording
> > > programs can find their events easily.
> >
> > Is there a difference to what we choose?
> >
>
> If a simple glob of event_name:\[*\] cannot be used, then we must document
> what the suffix format is, so an appropriate regex can be created. If we
> start with base-10 then later move to base-16 we will break existing regex
> patterns on the recording side.
>
> I prefer, and have in this series, a base-16 output since it saves on
> the tracepoint name size.

I honestly don't care which base you use. So if you want to use base 16,
I'm fine with that.

>
> Either way we go, we need to define how recording programs should find
> the events they care about. So we must be very clear, IMHO, about the
> format of the tracepoint names in our documentation.
>
> I personally think recording programs are likely to get this wrong
> without proper guidance.
>

Agreed.

-- Steve