Re: [PATCH v1 1/2] thunderbolt: add tracefs support to tb_* logging helpers

From: Łukasz Bartosik
Date: Fri Jul 28 2023 - 10:50:31 EST


czw., 27 lip 2023 o 16:47 Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> napisał(a):
>
> On Thu, Jul 27, 2023 at 03:13:25PM +0200, Łukasz Bartosik wrote:
> > Thunderbolt tracing is a lightweight alternative to traditional
> > thunderbolt debug logging. While thunderbolt debug logging is quite
> > convenient when reproducing a specific issue, it doesn't help when
> > something goes wrong unexpectedly. There are a couple of reasons why
> > one does not want to enable thunderbolt debug logging at all times:
> >
> > 1. We don't want to overwhelm kernel log with thunderbolt spam, others
> > want to use it too
>
> But doesn't the dynamic debug infrastructure allow this today?
>

Dynamic debug allows only partially what we would like to achieve.

Our goal is to be able to gather thunderbolt debug logs from ChromeOS
end users when an issue happens. Especially that would be very useful
for us in case of issues which reproduction is difficult. We would
write thunderbolt debug logs to a separate wrap around buffer provided
by trace subsystem. When an issue happens and a user sends a feedback
then thunderbolt logs would be attached to the feedback providing
more insight into what happened.

Dynamic debug sends all debug messages to dmesg and with significant
number of dynamic dev_dbg prints enabled dmesg may be quickly overwritten
and other important logs lost. Also enabling dynamic debug for the
entire kernel might
not be appropriate for production kernels due to impact on kernel size and perf.

> > 2. Console logging is slow
>
> Slow how?
>

I meant slow compared to writing messages to trace array instance in memory.

> > Thunderbolt tracing aims to solve both these problems. Use of
> > the thunderbolt tracefs instance allows to enable thunderbolt
> > logging in production without impacting performance or spamming
> > the system logs.
> >
> > To use thunderbolt tracing, set the thunderbolt.trace module parameter
> > (via cmdline or sysfs) to true:
> > ::
> > eg: echo true > /sys/module/thunderbolt/parameters/trace
>
> Why yet-another module parameter? Why is this required?
>

This is to enable/disable write of thunderbolt debug logs to thunderbolt
trace array instance.


> > Once active, all log messages will be written to the thunderbolt trace.
> > Once at capacity, the trace will overwrite old messages with new ones.
> > At any point, one can read the trace file to extract the previous
> > thunderbolt messages:
> > ::
> > eg: cat /sys/kernel/tracing/instances/thunderbolt/trace
> >
> > The thunderbolt trace instance is subsystem wide, so if you have multiple
> > devices active, they will be adding logs to the same trace.
>
> This just uses the existing logging functionality and ties it into the
> trace functionality, right?
>

Yes, it writes log messages (including debug level) to memory buffer provided
by trace framework.

> If so, why not do this for all printk messages, why should this be
> unique to thunderbolt?
>

I agree it would be better to come up with a general solution that can be
used by any part/subsystem of a kernel. I think it makes sense for me to look
more into dynamic debug and see how it could be extended to cover our use case.

> Normally with tracing, you enable specific tracepoints that you add, not
> just "all dev_*() messages", are you sure this will actually help?
>

Yes, it would be helpful, here is an example snippet from logs
captured into thunderbolt trace array instance:
<...>-177 [000] ..... 217.635100: tb_trace_printf: thunderbolt
0000:00:0d.3: 0: resuming switch
<...>-177 [000] ..... 217.635101: tb_trace_printf: thunderbolt
0000:00:0d.3: restoring Switch at 0x0 (depth: 0, up port: 7)
<...>-177 [000] ..... 217.635400: tb_trace_printf: thunderbolt
0000:00:0d.3: 0:1: USB4 wake: no, connection wake: no, disconnection
wake: yes

Thanks,
Lukasz

> thanks,
>
> greg k-h