Re: [PATCH v3 00/22] dyndbg: add support for writing debug logs to trace

From: jim . cromie
Date: Tue Dec 26 2023 - 23:23:34 EST


re-adding gregkh, who fell off the cc's again.
he is one of our upstreams.

On Fri, Dec 22, 2023 at 6:51 PM Łukasz Bartosik <lb@xxxxxxxxxxxx> wrote:
>
> Add support for writing debug logs to trace events and trace instances.
> The rationale behing this feature is to be able to redirect debug logs
> (per each callsite individually) to trace to aid in debugging. The debug
> logs output to trace can be enabled with T flag. Additionally the T flag
> accepts trace destination which can be provided to the T flag after ":".
> The trace destination field is used to determine where debug logs will be
> written.
>
> In order to redirect callsite debug log to a trace instance it has to be
> initialized first with a newly introduced "open" command. For example,
> "usb" trace instance in <debugfs>/tracing/instances/ can be initialized
> by issuing the command:
>
> localhost ~ # echo "open usb" >
> <debugfs>/dynamic_debug/control
>
> If the trace instance <debugfs>/tracing/instances/usb already exists
> then it will be reused otherwise new usbcore instance will be created.
> Maximum 63 trace instances can be "opened". The trace instance name can
> include any of ^\w+ and underscore characters. There is also a special
> name "0" reserved for writing debug logs to trace prdbg and devdbg events.
> The reserved name "0" does not require to be "opened" before use.

theres implication here that p only works when :0.
Is that intended ?
it seems wrong, or at least unnecessary.

In fact, theres no specific reason to exclude +p:trcbuf.mf,
esp if we keep default-trace-buf
since +pTfml could have that meaning implicitly.
We can call it allowed and "unimplemented until proven useful" on +p

+p completely independent of +T is a simplicity,
lets keep that until proven useful otherwise.

>
> To enable writing usbcore module debug logs to the "usb" trace instance
> opened above a user can issue the command:
>
> localhost ~ # echo "module usbcore =pT:usb" >
> <debugfs>/dynamic_debug/control
>
> Please note that output of debug logs to syslog (p flag) and trace (T flag)
> can be independently enabled/disabled for each callsite. Therefore the
> above command will enable writing of debug logs not only to the trace
> instance "usb" but also to syslog.
>
> When trace destination is followed by another flag the next flag has to
> be preeceded with ".", for example
>
> localhost ~ # echo "module usbcore =pT:usb.l" >
> <debugfs>/dynamic_debug/control
>
> To simplify processing trace destination can be omitted when default
> trace destination is set, for example the command
>
> localhost ~ # echo "module usbcore =pTl" >
> <debugfs>/dynamic_debug/control
>
> will use default trace destination. If default trace destination is not
> set the above command will fail. The default trace destination is set
> when a command with any of the flags [-+=] and explicitly provided trace
> destination matches at least one callsite, for example the command
>
> localhost ~ # echo "module usbcore -T:usb" >
> <debugfs>/dynamic_debug/control
>
> will disable trace for all usbcore module callsites (if any was enabled)
> and will set "usb" instance as default trace destination. To enable writing
> thunderbolt module debug logs to the "usb" trace instance as well a user
> can issue the command:
>
> localhost ~ # echo "module thunderbolt =T" >
> <debugfs>/dynamic_debug/control
>
> Since default trace destination was previously set therefore "usb" instance
> name does not have to be explicitly provided.
>
> When no callsite writes debug logs to a trace instance then it can be
> released (its reference counter decrased) with the "close" command.
decreased

> Closing a trace instance make it unavailable for dynamic debug and also
makes
> allows a user to delete such a trace instance at convenient time later
> with rmdir command. For example when "usb" instance is not used anymore
> a user can issue the command:
>
> localhost ~ # echo "close usb" >
> <debugfs>/dynamic_debug/control
> and then to delete it with:
>
> localhost ~ # rmdir <debugfs>/tracing/instances/
>
> To enable writing usbcore module debug logs to trace dyndbg:prdbg and
> dyndbg:devdbg events user can issue the command:
>
> localhost ~ # echo "module usbcore =T:0" >
> <debugfs>/dynamic_debug/control
>
> Then dyndbg trace events can be for example captured with the command:
>
> localhost ~ # trace-cmd start -e dyndbg
>
> And displayed with the command:
>
> localhost ~ # trace-cmd show
>

is there a recipe to show private traces ? prolly worth a mention here.


>
>
> Jim, I made a few changes to the test script dd-selftest.rc you created
> and I also added new test scenarios. You can find the entire patchset
> with the test script changes on top here
> https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/refs/sandbox/ukaszb/dyndbg_trace_v3
> I wonder where would be the proper place to store the test script
> (upstream, your github.com:jimc/linux.git)
> Do you have a suggestion for that ?
>

I got it (your HEAD), renamed and put it in:
tools/testing/selftests/dynamic_debug/
and copied & modified Makefile and config from ../users/
I added your SOB, that seems proper.

nice additions. and I like the colors. maybe they belong in lib.mk,
but thats 4 later.

and tweaked the modules / DUTs to *hopefully* get stable callsite counts
from the tests on them, in virtually any usable config.
DUTs: file init/main, module mm_init, and module test_dynamic_debug (a
config constraint)
The ref-counts are taken from a default virtme-ng -k config on my x86-64 box

I now pass tests up to:

[root@v6 wk-proof]# ./tools/testing/selftests/dynamic_debug/dyndbg_selftest.sh
...
# TEST_PRIVATE_TRACE_6
[ 1009.296557] dyndbg: read 3 bytes from userspace
[ 1009.296876] dyndbg: query 0: <=_> on module: <*>
[ 1009.297171] dyndbg: processed 1 queries, with 1559 matches, 0 errs
[ 1009.311621] dyndbg: read 39 bytes from userspace
[ 1009.312032] dyndbg: query 0: <open
A_bit_lengthy_trace_instance_name> on module: <*>
[ 1009.312569] dyndbg: instance is already opened
name:A_bit_lengthy_trace_instance_name
[ 1009.313063] dyndbg: processed 1 queries, with 0 matches, 1 errs
: ./tools/testing/selftests/dynamic_debug/dyndbg_selftest.sh:454
ddcmd() expected to exit with code 0
Error: 'File exists'

it seems invocation dependent
I suspect 2 possibilities:
1- invoking as shell-script, repeatedly vs vng -e script-name
(one-shot per boot)
2- virtme-ng's minimal-default config has no CONFIG_USB.
this changes post-conditions of the script, affecting preconditions
of the next run.

2 means that we should replace usbcore with some other always *builtin* module.

I chose module params, on the idea that its builtin, but also invoked later,
when modules are modprobed with params, so its callsites can be enabled &
then invoked via do_prints, to test for writes to the private trace-buf.

I've pushed it to lukas-v3.0 at jimc.github.com/linux

Would you look at err on _6, above,
and figure out if its insufficiently robust to test preconditions, or
something else ?
and also:
rename test_private_trace_N() to better names
- and/or comments to intent & pattern of use
more tweaks invited in its commit-msg

since cycle_test_problem() is not actually a problem, whats a better name ?
_not_best_practices ?

I think the final test script enhancement needed is (maybe separately):
private-buffer-expected-write-verification after do_prints,
and around modprobe test_dynamic_debug dyndbg=$multicmd
subject to 2 below.

> TODOs:
> - update dynamic debug documentation,
> - finalize discussion related to default trace destination
> (whether we want it at all, if yes then conclude on the details)
>

good list. before docs;

wrt default-trace, at least in some details, it seems extra rules to know.
for example, if the default were set on successful open,
rather than successful rule application (iirc), then it would read a lot
like "with" or "using" blocks, esp with <<EOX form

with that new form, it might be pretty expressive:

echo <<YMMV > /proc/dynamic_debug/control
open sesame # ie with
module magic_carpet function forward +Tfml
module magic_lamp function rub_vigorously +Tfmt
open voicetrace
module voice2ai function text2cmdprompt +T
module undo function oh_crap +T
YMMV

I think set-default-on-good-rule-applied is flawed:
it relies on the most complex expression in the grammar to be correct
and without spelling errors, where module unknown is
OK grammatically, but nonsense.
set-default-on-open is simpler to explain and test separately.

We should probably consider modifying the continue-looping behavior
of ddebug_exec_queries, depending upon the return-code.
open, close errors could reasonably warrant different treatment,
like terminating the mult-cmd loop prematurely on open/close errs.
Esp if we keep the set-default-on-good-rule-applied
What do you think ?

I'm inclined to reword a commit-msg or 2, drop RFC, obsolete comments etc.

I can do that in a lukas-v3.1, or as in-thread responses,
do you have a preference ?
some inline below..

> Changes:
> V3 -> V2
> - squash "dyndbg: export _print_hex_dump" with "dyndbg: add support for hex_dump output to trace",
> - squash "dyndbg: tweak pr_info format s/trace dest/trace_dest/" with "dyndbg: add processing of T(race) flag argument",
> - squash "dyndbg: change +T:name_terminator to dot" with "dyndbg: add processing of T(race) flag argument",
> - fix setting default trace destination,
> - decrease use count when callsite is being disabled as a part of module removal,
> - fix parsing of T flag argument,
> - update __get_str_strip_nl macro per Steve's comment,
> - drop commit "dyndbg: move lock,unlock into ddebug_change, drop goto" and add comment explaining why ddebug_parse_flags
> and ddebug_change have to be run in critical section,
> - add "depends on TRACING" for "DYNAMIC_DEBUG",
> - update DYNAMIC_DEBUG_BRANCH macro which is used when CONFIG_JUMP_LABEL is not set,
> - change structure name ddebug_trace_inst to dd_private_tracebuf,
> - change structure name ddebug_trace to dd_tracebuf_tbl_info,
> - rename is_ddebug_cmd to is_dd_trace_cmd,
> - rename validate_tr_name to dd_good_trace_name,
> - rename handle_tr_opend_cmd to handle_trace_opend_cmd and handle_tr_close_cmd to handle_trace_close_cmd,
> - refactor ddebug_parse_cmd,
> - add pr_err when trace_array_get_by_name or trace_array_init_printk fails in trace_array_init_printk,
> - add static prefix to show_T_args,
> - rename TRACE_DST_MAX to TRACE_DST_LAST and change its value to 64, reserve index 0 for trace events in dd_tracebuf_tbl_info (index 0 will be wasted),
> - add "#: " prefix to "Default trace destination" and "Opened trace instances" and print all opened trace instances on the same line,
>
> V2->V1
> Major rework after receiving feedback in
> https://lore.kernel.org/all/20231103131011.1316396-1-lb@xxxxxxxxxxxx/
>
> This includes among other things:
> - addittion of open/close commands,
> - use names instead of numbers for trace destinations,
> - change prdbg and devdbg trace events to strip newline
> on the slow path (read side),
> - change prdbg and devdbg trace events to stores actual values
> instead of pointers because if a pointer becomes invalid then
> the TP_printk call will cause a crash,
> - add support for default trace destination.
>
> V1
> Major rework after receiving feedback in
> https://lore.kernel.org/all/20230915154856.1896062-1-lb@xxxxxxxxxxxx/
>
> Jim Cromie (14):
> dyndbg: add _DPRINTK_FLAGS_ENABLED
> dyndbg: add _DPRINTK_FLAGS_TRACE
> dyndbg: add write events to tracefs code
> dyndbg: add 2 trace-events: prdbg, devdbg
> tracefs: add __get_str_strip_nl - RFC
> dyndbg: use __get_str_strip_nl in prdbg and devdbg
> dyndbg: repack _ddebug structure
> dyndbg: disambiguate quoting in a debug msg
> dyndbg: fix old BUG_ON in >control parser
> dyndbg: treat comma as a token separator
> dyndbg: add skip_spaces_and_coma()

probly just squash this with prev, its not separately tested,
and not worth isolating a test for it.
s/coma/comma/ otherwise.

> dyndbg: split multi-query strings with %
> dyndbg: reduce verbose/debug clutter
> dyndbg: id the bad word in parse-flags err msg
>
> Łukasz Bartosik (8):
> dyndbg: move flags field to a new structure
> dyndbg: add trace destination field to _ddebug
> dyndbg: add open and close commands for trace
> dyndbg: don't close trace instance when in use
> dyndbg: add processing of T(race) flag argument

tweak subject s/T(race)/+T:prv_trcbuf_name./ - might as well use
legal flag syntax.


> dyndbg: add support for default trace destination
> dyndbg: write debug logs to trace instance
> dyndbg: add support for hex_dump output to trace
>
> .../admin-guide/dynamic-debug-howto.rst | 5 +-
> MAINTAINERS | 1 +
> include/linux/dynamic_debug.h | 57 +-
> include/trace/events/dyndbg.h | 54 ++
> include/trace/stages/stage3_trace_output.h | 13 +
> lib/Kconfig.debug | 2 +
> lib/dynamic_debug.c | 736 ++++++++++++++++--
> 7 files changed, 775 insertions(+), 93 deletions(-)
> create mode 100644 include/trace/events/dyndbg.h
>
> --
> 2.43.0.472.g3155946c3a-goog
>