[PATCHv2] tracing - avoid soft lockup in trace_pipe

From: Jiri Olsa
Date: Mon Mar 14 2011 - 14:06:30 EST


On Mon, Mar 14, 2011 at 11:20:57AM -0400, Steven Rostedt wrote:
> On Mon, 2011-03-14 at 16:07 +0100, Jiri Olsa wrote:
> > So we get back the the while loop with no data added and with
> > TRACE_TYPE_HANDLED return code, which will keep us in the loop.
> >
> > Not sure this is bug or feature :)
> >
> >
> Most bugs are features ;)
>
> honestly, I'd love to rip out all the "bin", "hex" etc formats as they
> are pretty much deprecated.
>
> But as you pointed out that this can keep us in the inner loop, we need
> to fix that. I prefer to fix it within the nop print function. Perhaps
> we need to fill in something. "type: %d", iter->ent->type ?

sounds ok :) attaching v2 patch

thanks,
jirka


---
running following commands:

# enable the binary option
echo 1 > ./options/bin
# disable context info option
echo 0 > ./options/context-info
# tracing only events
echo 1 > ./events/enable
cat trace_pipe

is causing lockup (in NON preemptive kernels) inside
tracing_read_pipe function.

The reason are:
- bin/hex/raw output functions for events are set to
trace_nop_print function, which prints nothing and
returns TRACE_TYPE_HANDLED value
- LOST EVENT trace do not handle trace_seq overflow

These reasons force the while loop in tracing_read_pipe
function never to break.

The attached patch fixies handling of lost event trace, and
changes trace_nop_print to print minimal info, which is needed
for the correct tracing_read_pipe processing.

v2 changes:
- omit the cond_resched changes by trace_nop_print changes
- WARN changed to WARN_ONCE and added info to be able
to find out the culprit

wbr,
jirka


Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
kernel/trace/trace.c | 15 ++++++++++++---
kernel/trace/trace_output.c | 3 +++
2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9541c27..5af42f4 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2013,9 +2013,10 @@ enum print_line_t print_trace_line(struct trace_iterator *iter)
{
enum print_line_t ret;

- if (iter->lost_events)
- trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
- iter->cpu, iter->lost_events);
+ if (iter->lost_events &&
+ !trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
+ iter->cpu, iter->lost_events))
+ return TRACE_TYPE_PARTIAL_LINE;

if (iter->trace && iter->trace->print_line) {
ret = iter->trace->print_line(iter);
@@ -3229,6 +3230,14 @@ waitagain:

if (iter->seq.len >= cnt)
break;
+
+ /*
+ * Setting the full flag means we reached the trace_seq buffer
+ * size and we should leave by partial output condition above.
+ * One of the trace_seq_* functions is not used properly.
+ */
+ WARN_ONCE(iter->seq.full, "full flag set for trace type %d",
+ iter->ent->type);
}
trace_access_unlock(iter->cpu_file);
trace_event_read_unlock();
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 456be90..cf535cc 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -830,6 +830,9 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_event);
enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
struct trace_event *event)
{
+ if (!trace_seq_printf(&iter->seq, "type: %d\n", iter->ent->type))
+ return TRACE_TYPE_PARTIAL_LINE;
+
return TRACE_TYPE_HANDLED;
}

--
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/