Re: [PATCH] Binder: Add timestamp and async from pid/tid to transaction record

From: Carlos Llamas
Date: Thu Apr 20 2023 - 10:28:46 EST


On Mon, Apr 17, 2023 at 03:14:55PM +0800, Chuang Zhang wrote:
>
> [chuang] Android's ANR and Watchdog problems are often caused by calling
> a Binder server interface and waiting synchronously for too long. To
> confirm this root cause, we need system_server to read the relevant
> binderfs nodes and obtain the transaction information at the time of
> the failure, including how long each transaction has been pending.
> This will help many Android application and system engineers quickly
> analyze related faults.
> Because we need to obtain this timing information in real time when an
> ANR or Watchdog occurs, and this happens mostly on consumer devices where
> an atrace cannot be captured effectively, Perfetto cannot be applied.

Fair enough, this sounds good to me then.

>
> [chuang] As you can see below, we only need to print the PID and TID of
> "from" when printing binder transaction records in
> print_binder_transaction_ilocked, so that they are printed correctly
> for both asynchronous and synchronous transactions.
> The issue is that the PID and TID of "from" can be obtained through
> t->from in synchronous mode, but not in asynchronous mode, because
> t->from is not populated there.
> So can I directly add new variables from_pid and from_tid to record them
> for all transactions? Does it matter if the naming includes the pid? Greg
> has expressed some concerns about this before.

Right, let's populate t->from_pid and t->from_tid for all transactions
and use those only in print_binder_transaction() instead of t->from.
As Greg mentioned, please add these in a separate commit.
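
To illustrate the idea, here is a minimal user-space sketch, not the
actual driver code. The mock_* types and helpers are hypothetical
stand-ins; only the from, from_pid and from_tid names come from this
thread. It assumes the sender is known when the transaction is created,
so both IDs can be recorded even when "from" stays NULL for an async
transaction:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a sending thread; not the kernel's binder_thread. */
struct mock_thread {
	int pid;
	int tid;
};

/* Hypothetical stand-in for a transaction record. */
struct mock_transaction {
	int debug_id;
	const struct mock_thread *from;	/* left NULL for async (one-way) */
	int from_pid;			/* recorded for every transaction */
	int from_tid;			/* recorded for every transaction */
};

/* Record the sender's IDs unconditionally; only sync keeps the pointer. */
static void mock_transaction_init(struct mock_transaction *t, int debug_id,
				  const struct mock_thread *sender, int oneway)
{
	assert(t && sender);
	memset(t, 0, sizeof(*t));
	t->debug_id = debug_id;
	t->from = oneway ? NULL : sender;
	t->from_pid = sender->pid;
	t->from_tid = sender->tid;
}

/* Format a record using only the stored IDs, never dereferencing t->from. */
static int mock_print_transaction(char *buf, size_t len,
				  const struct mock_transaction *t)
{
	assert(buf && t);
	return snprintf(buf, len, "%d: from %d:%d",
			t->debug_id, t->from_pid, t->from_tid);
}
```

With this layout the print path is the same for sync and async
transactions, since it no longer depends on whether "from" is set.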

About the namespaces, Greg's comments seem accurate. Binder is using
task->pid directly, so the PIDs would not match those seen from within a
PID namespace. This points to a larger issue in binder, as we log these
"raw" pid values everywhere. I'll look further into this and come up
with a binder-wide fix in a separate commit, so just ignore this for now.

Thanks,
--
Carlos Llamas