Re: [PATCH v13 3/3] binder: add transaction latency tracer

From: Greg Kroah-Hartman
Date: Wed Nov 11 2020 - 10:11:22 EST


On Wed, Nov 11, 2020 at 11:03:06PM +0800, Frankie Chang wrote:
> On Wed, 2020-11-11 at 08:34 +0100, Greg Kroah-Hartman wrote:
> > > - The reason why printing the related information to
> > > kernel information log but not trace buffer is that
> > > some abnormal transactions may be pending for a long
> > > time ago, they could not be recorded due to buffer
> > > limited.
> >
> > Don't abuse the kernel information log for stuff that is just normal
> > operations. What is wrong with using the trace buffers here? That's
> > what they are designed for from what I can tell.
> >
> As mentioned before, time limitation of recording is the reason why we
> don't just use trace here.

What limitation?

> In some long time stability test, such as MTBF,

What is "MTBF"?

> the exception is caused by a series of transactions interaction.
> Some abnormal transactions may be pending for a long time ago, they
> could not be recorded due to buffer limited.

How long of a time is this? If they are pending, only when the timeout
happens is the trace logged, right?

Again, please do not abuse the kernel log for this, that is not what it
is for.

> > > +config BINDER_TRANSACTION_LATENCY_TRACKING
> > > + tristate "Android Binder transaction tracking"
> > > + help
> > > + Used for track abnormal binder transaction which is over threshold,
> > > + when the transaction is done or be free, this transaction would be
> > > + checked whether it executed overtime.
> > > + If yes, printing out the detailed info.
> >
> > Why is this a separate module? Who will ever want this split out?
> >
> The reason we split out a separate module is that we adopted the
> previously discussed recommendations in PATCH v1.
>
> This way all of this tracing code is in-kernel but outside of binder.c.

Putting it in a single file is fine, but what does this benifit doing it
in a separate file? Doesn't it waste more codespace this way?

> > > +/*
> > > + * The reason setting the binder_txn_latency_threshold to 2 sec
> > > + * is that most of timeout abort is greater or equal to 2 sec.
> > > + * Making it configurable to let all users determine which
> > > + * threshold is more suitable.
> > > + */
> > > +static uint32_t binder_txn_latency_threshold = 2;
> > > +module_param_named(threshold, binder_txn_latency_threshold,
> > > + uint, 0644);
> >
> > Again, this isn't the 1990's, please do not add module parameters if at
> > all possible.
> >
>
> Is any recommended method here?
> Because we refer to the method in binder.c, we don't know if this method
> is not suitable.

Look at the individual binder instances. That is what trace should be
on/off for, not for all binder instances in the system at the same time.

thanks,

greg k-h