Re: [PATCH 00/30] tracing: 'hist' triggers

From: Namhyung Kim
Date: Sun Nov 29 2015 - 10:22:46 EST


On Mon, Nov 23, 2015 at 01:51:15PM -0600, Tom Zanussi wrote:
> This is v12 of the 'hist triggers' patchset, incorporating suggestions
> and fixing problems pointed out by Namhyung.

Thanks for doing this work! This version looks good to me although I
have a small nitpick in the 10/30. With that fixed, you can add my

Reviewed-by: Namhyung Kim <namhyung@xxxxxxxxxx>

to this patchset.

Thanks,
Namhyung


>
> Changes from v11:
>
> - Added tracing_map_lookup().
> - Changed tracing_map_insert() to incorporate the hits/drops counter
> updating. The changes allow clients to continue calling
> tracing_map_insert() and update existing entries if desired, or
> stop on the first drop, since the return value is still NULL in
> that case.
> - Changed the hist trigger code to continue calling
> tracing_map_insert() rather than stopping on the first drop. The
> idea is that there's no harm in continuing after the first drop -
> if the user can't tolerate any drops, they'll still see a non-zero
> drop count and can retry with a larger table, but if the user
> doesn't care, the hit and drop counts are an accurate reflection
> of how many events were tallied vs dropped.
> - Added a new unreg_all() callback to event_command, and code to
> call it when the trigger file is opened for truncate. Implemented
> unreg_all() for hist triggers, which allows them to honor the
> truncate flag while leaving existing triggers intact. This means
> that '>>' should now be used for adding multiple hist triggers as
> well as to clear/pause/cont an existing hist trigger.
> - Included Namhyung's log2 patch.
> - Fixed one of Masami's test cases which now requires using '>>' due
> to the truncate changes.
> - 'cont' and 'clear' no longer create a trigger if one doesn't exist.
> - Rebased to latest trace/for-next.
> - Various other smaller fixes and Documentation updates as noted by
> Namhyung.
>
> Changes from v10:
>
> - Rebased to latest trace/for-next.
> - Added Masami's ftrace/hist triggers kselftest patches.
> - Added Masami's Tested-by: to the series.
>
> Changes from v9:
>
> v10 adds some major new features - 'named' hist triggers and support
> for multiple hist triggers on a single event - basically all of the
> features and suggestions suggested by Masami, except for one: I
> decided after all not to implement the 'pause/continue/clear' commands
> on the hist file. I've gotten so used to the habit of calling up the
> previous trigger-setting command in the command history and simply
> appending a 'pause/cont/clear' command to it (and later removing it by
> prepending a '!') that making the user change the command doesn't seem
> as naturally usable. Masami, if you disagree, let me know and I'll
> reconsider it but at this point I'd rather not make that change.
>
> In addition, I think I've fixed all the other various smaller problems
> mentioned in the previous review, please let me know if I missed any...
>
> - Added support for 'named' hist triggers
> - Added support for multiple hist triggers on a single trace event
> - Added documentation and examples for the above new features
> - Streamlined the README to only include the essentials for hist triggers
> - Fixed the 'hex' modifier for values, which was previously ignored
> - Replaced the kmalloc of large arrays with a simple page-based scheme
> - Fixed the tracing_map_insert() case where jhash returns 0
> - Fixed various other small problems noted along the way
>
> Changes from v8:
>
> Same as v8, but with the RFC patch [ftrace: Add function_hist tracer]
> removed, and rebased to latest trace/for-next.
>
> Changes from v7:
>
> This version refactors the commits as suggested by Masami. There are
> now more commits, but the result should be much more reviewable. The
> ending code is the same as before, modulo a couple minor bug fixes I
> discovered while refactoring and testing.
>
> I've also reviewed and fixed a number of shortcomings and errors in
> the comments, and have added a new discussion of the tracing_map data
> structures after Steve mentioned he found them confusing and/or
> insufficiently documented.
>
> Also, I kept Namhyung's string patch [tracing: Support string type key
> properly] as submitted, but added a follow-on patch that refactors it
> and fixes a problem I found with it that enabled static string keys to
> contain random chars and therefore incorrect map insertions.
>
> Changes from v6:
>
> This version adds a new 'sym-offset' modifier as requested by Masami.
> I implemented it as a modifier rather than using the trace option as
> suggested, in part because I wanted to keep it all self-contained and
> it seemed more consistent to just add it alongside the 'sym' modifier.
> Also, hist triggers arent't really a tracer and therefore don't
> directly tie into the option update/callback mechanism so making use
> of it isn't as simple as a normal tracer.
>
> I also changed the sort key specification to be stricter and signal an
> error if the specified sort key wasn't found (rather than defaulting
> to hitcount in those cases), also suggested by Masami. Thanks,
> Masami, for your input!
>
> Also updated the Documentation and tracing/README to reflect the
> changes.
>
> Changes from v5:
>
> This version adds support for compound keys, along with the related
> ability to sort using primary and secondary keys. This was mentioned
> in previous versions as the last important piece that remained
> unimplemented, and is now implemented. (I didn't have time to get to
> the couple of enhancements suggested by Masami, but I expect to be
> able to add those later on top of these.)
>
> Because we now support compound keys and it's not immediately clear in
> the output exactly which fields correspond to keys, the key(s),
> compound or not, are now enclosed by curly braces.
>
> The Documentation and README have been updated to reflect the changes,
> and several new examples have been added to illustrate how to use
> compound keys.
>
> Also, the code was updated to work with the new ftrace_event_file,
> etc, renaming in tracing/for-next.
>
> Changes from v4:
>
> This version addresses some problems and suggestions made by Daniel
> Wagner - a lot of the code was reworked to get rid of the distinction
> between keys and values, and as a result, both keys and values can be
> used as sort keys. As suggested, it also allows 'val=' to be absent
> in a trigger command - if no 'val' is specified, hitcount is assumed
> and automatically used as the only val.
>
> The map code was also separated out into a separate file,
> tracing_map.c, allowing it to be reused. It also adds a second tracer
> called function_hist that actually does reuse the code, as an RFC
> patch.
>
> Patch 01/10 [tracing: Update cond flag when enabling or disabling..]
> is a fix for a problem noticed by Daniel and that fixes a problem in
> existing trigger code and should be applied regardless of whether the
> rest of the patchset is merged.
>
> As mentioned, patch 10/10 is an RFC patch implementing a new tracer
> based on the function tracer code. It's a fun little tool and is
> useful for a specific problem I'm working on (and is also a nice test
> of the tracing_map code), but is an RFC because first, I'm not sure it
> would really be of general interest and secondly, it's POC-level
> quality and I'd need to spend more time fixing it up to make it
> upstreamable, but I don't want to waste my time if not.
>
> There are a couple of important bits of functionality that were
> present in v1 but not yet reimplemented in v5.
>
> The first is support for compound keys. Currently, maps can only be
> keyed on a single event field, whereas in v1 they could be keyed on
> multiple keys. With support for compound keys, you can create much
> more interesting output, such as for example per-pid lists of
> syscalls or read counts e.g.:
>
> # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
> /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
>
> # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
>
> key: common_pid:bash[3112], id:sys_write vals: count:69
> key: common_pid:bash[3112], id:sys_rt_sigprocmask vals: count:218
>
> key: common_pid:update-notifier[3164], id:sys_poll vals: count:37
> key: common_pid:update-notifier[3164], id:sys_recvfrom vals: count:118
>
> key: common_pid:deja-dup-monito[3194], id:sys_sendto vals: count:1
> key: common_pid:deja-dup-monito[3194], id:sys_read vals: count:4
> key: common_pid:deja-dup-monito[3194], id:sys_poll vals: count:8
> key: common_pid:deja-dup-monito[3194], id:sys_recvmsg vals: count:8
> key: common_pid:deja-dup-monito[3194], id:sys_getegid vals: count:8
>
> key: common_pid:emacs[3275], id:sys_fsync vals: count:1
> key: common_pid:emacs[3275], id:sys_open vals: count:1
> key: common_pid:emacs[3275], id:sys_symlink vals: count:2
> key: common_pid:emacs[3275], id:sys_poll vals: count:23
> key: common_pid:emacs[3275], id:sys_select vals: count:23
> key: common_pid:emacs[3275], id:unknown_syscall vals: count:34
> key: common_pid:emacs[3275], id:sys_ioctl vals: count:60
> key: common_pid:emacs[3275], id:sys_rt_sigprocmask vals: count:116
>
> key: common_pid:cat[3323], id:sys_munmap vals: count:1
> key: common_pid:cat[3323], id:sys_fadvise64 vals: count:1
>
> Related to that is support for sorting on multiple fields. Currently,
> you can sort using only a primary key. Being able to sort on multiple
> or at least a secondary key is indispensible for seeing trends when
> displaying multiple values.
>
> Changes from v3:
>
> v4 fixes the race in tracing_map_insert() noted in v3, where
> map.val.key could be checked even if map.val wasn't yet set. The
> simple fix for that in tracing_map_insert() introduces the possibility
> of duplicates in the map, which though rare, need to be accounted for
> in the output. To address that, duplicate-merging code was added to
> the map-printing code.
>
> It was also pointed out that it didn't seem correct to include
> module.h, but the fix for that has deeper roots and is being addressed
> by a separate patchset; for now we need to continue including
> module.h, though prompted by that I did some other header include
> cleanup.
>
> The functionality remains the same as v2, but this version no longer
> tries to export and use bpf_maps, and more importantly removes the
> associated GFP_NOTRACE/trace event hacks and kmem macros required to
> work around the bpf_map implementation.
>
> The tracing_map functionality is instead built on top of a simple
> lock-free map algorithm originated by Dr. Cliff Click (see references
> in the code for more details), which though too restrictive to be
> general-purpose in its current form, functions nicely as a
> special-purpose tracing map.
>
> v3 also moves the hist triggers code into a separate file and puts it
> all behind a new config option, CONFIG_HIST_TRIGGERS. It also merges
> in the sorting code rather than keeping it as a separate patch.
>
> This patchset also includes a couple other new and related triggers,
> enable_hist and disable_hist, very similar to the existing
> enable_event/disable_event triggers used to automatically enable and
> disable events based on a triggering condition, but in this case
> allowing hist triggers to be enabled and disabled in the same way.
>
> - Added an insert check for val before checking the key associated with val
> - Added code to merge possible duplicates in the map
>
> Changes from v2:
> - reimplemented tracing_map, replacing bpf_map with nmi-safe/lock-free map
> - removed GPF_NOTRACE, kmalloc/free macros and event hacks needed by bpf_maps
> - moved hist triggers from trace_events_trigger.c to trace_events_hist.c
> - added CONFIG_HIST_TRIGGERS config option
> - consolidated sorting code with main patch
>
> Changes from v1:
> - completely rewritten on top of tracing_map (renamed and exported bpf_map)
> - added map clearing and client ops to tracing_map
> - changed the name from 'hash' triggers to 'hist' triggers
> - added new trigger 'pause' feature
> - added new enable_hist and disable_hist triggers
> - added usage for hist/enable_hist/disable hist to tracing/README
> - moved examples into Documentation/trace/event.txt
> - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints
>
> The following changes since commit e428abbbf616cd8fdd1162e4a624ad1d47b47544:
>
> tracing: #ifdef out uses of max trace when CONFIG_TRACER_MAX_TRACE is not set (2015-11-10 10:16:05 -0500)
>
> are available in the git repository at:
>
> git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v12
> http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v12
>
> Masami Hiramatsu (2):
> kselftests/ftrace : Add event trigger testcases
> kselftests/ftrace: Add hist trigger testcases
>
> Namhyung Kim (2):
> tracing: Support string type key properly
> tracing: Add hist trigger 'log2' modifier
>
> Tom Zanussi (26):
> tracing: Update cond flag when enabling or disabling a trigger
> tracing: Make ftrace_event_field checking functions available
> tracing: Make event trigger functions available
> tracing: Add event record param to trigger_ops.func()
> tracing: Add get_syscall_name()
> tracing: Add a per-event-trigger 'paused' field
> tracing: Add needs_rec flag to event triggers
> tracing: Add an unreg_all() callback to trigger commands
> tracing: Add lock-free tracing_map
> tracing: Add 'hist' event trigger command
> tracing: Add hist trigger support for multiple values ('vals=' param)
> tracing: Add hist trigger support for compound keys
> tracing: Add hist trigger support for user-defined sorting ('sort='
> param)
> tracing: Add hist trigger support for pausing and continuing a trace
> tracing: Add hist trigger support for clearing a trace
> tracing: Add hist trigger 'hex' modifier for displaying numeric fields
> tracing: Add hist trigger 'sym' and 'sym-offset' modifiers
> tracing: Add hist trigger 'execname' modifier
> tracing: Add hist trigger 'syscall' modifier
> tracing: Add hist trigger support for stacktraces as keys
> tracing: Remove restriction on string position in hist trigger keys
> tracing: Add enable_hist/disable_hist triggers
> tracing: Add 'hist' trigger Documentation
> tracing: Add support for multiple hist triggers per event
> tracing: Add support for named triggers
> tracing: Add support for named hist triggers
>
> Documentation/trace/events.txt | 1555 ++++++++++++++++++
> include/linux/trace_events.h | 9 +-
> kernel/trace/Kconfig | 27 +
> kernel/trace/Makefile | 2 +
> kernel/trace/trace.c | 57 +
> kernel/trace/trace.h | 106 +-
> kernel/trace/trace_events.c | 4 +
> kernel/trace/trace_events_filter.c | 12 -
> kernel/trace/trace_events_hist.c | 1731 ++++++++++++++++++++
> kernel/trace/trace_events_trigger.c | 309 +++-
> kernel/trace/trace_syscalls.c | 11 +
> kernel/trace/tracing_map.c | 1058 ++++++++++++
> kernel/trace/tracing_map.h | 282 ++++
> tools/testing/selftests/ftrace/test.d/functions | 9 +
> .../ftrace/test.d/trigger/trigger-eventonoff.tc | 64 +
> .../ftrace/test.d/trigger/trigger-filter.tc | 59 +
> .../ftrace/test.d/trigger/trigger-hist-mod.tc | 65 +
> .../ftrace/test.d/trigger/trigger-hist.tc | 83 +
> .../ftrace/test.d/trigger/trigger-multihist.tc | 73 +
> .../ftrace/test.d/trigger/trigger-snapshot.tc | 56 +
> .../ftrace/test.d/trigger/trigger-stacktrace.tc | 53 +
> .../ftrace/test.d/trigger/trigger-traceonoff.tc | 58 +
> 22 files changed, 5596 insertions(+), 87 deletions(-)
> create mode 100644 kernel/trace/trace_events_hist.c
> create mode 100644 kernel/trace/tracing_map.c
> create mode 100644 kernel/trace/tracing_map.h
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-eventonoff.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-filter.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-snapshot.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-stacktrace.tc
> create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/trigger-traceonoff.tc
>
> --
> 1.9.3
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/