Re: [PATCH 2/6] tracing: increase size of number of possible events

From: Li Zefan
Date: Mon May 04 2009 - 21:32:21 EST


Steven Rostedt wrote:
> On Mon, 4 May 2009, Li Zefan wrote:
>>>>>>> # cat /debug/tracing/trace_pipe
>>>>>>>
>>>>>>> console 2:
>>>>>>> while (1) {
>>>>>>> insmod trace-events-sample.ko
>>>>>>> echo foo_bar > /debug/tracing/set_event
>>>>>>> rmmod trace-events-sample.ko
>>>>>>> }
>>>>>>>
>>>>>>> I got this immediately:
>>>>>>>
>>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000006f
>>>>>>> IP: [<c05210f3>] bstr_printf+0x2ce/0x302
>>>>>>> ...
>>>>>>> Call Trace:
>>>>>>> [<c0476d12>] ? trace_seq_bprintf+0x28/0x41
>>>>>>> [<c0477569>] ? trace_bprint_print+0x58/0x6c
>>>>>>> [<c0472ffc>] ? print_trace_line+0x2c5/0x2df
>>>>>>> [<c0428a41>] ? sub_preempt_count+0x85/0xa0
>>>>>>> [<c04758cf>] ? tracing_read_pipe+0x118/0x191
>>>>>>> [<c04757b7>] ? tracing_read_pipe+0x0/0x191
>>>>>>> [<c04b09f9>] ? vfs_read+0x8f/0x136
>>>>>>> [<c04b0da3>] ? sys_read+0x40/0x65
>>>>>>> [<c0402a68>] ? sysenter_do_call+0x12/0x36
>>>>>>>
>>>>>>> (We can even get other crashes..)
>>>>>>>
>>>>>> Can you send me your full bootlog and config. I'm not able to reproduce
>>>>>> this.
>>>>> And the SHA1 of the HEAD of the git repo you are using.
>>>>>
>>>>> Thanks,
>>>>>
>>>> the HEAD: 9ce5424d75e56891905b77d1589924765e62059a + this patch
>>>>
>>>> (I commented out "hi" from the sample module)
>>> I found that commenting out the "hi" too helped in causing races.
>>>
>>>> I think it's because "type" is wrapped back to 0. I changed the test script
>>>> like this:
>>>>
>>>> while (foo_bar.id < 65536)
>>>> {
>>>> insmod trace-events-sample.ko
>>>> rmmod trace-events-sample.ko
>>>> }
>>>>
>>>> for ((; ;))
>>>> {
>>>> insmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko
>>>> echo trace-events-sample:foo_bar > /mnt/tracing/set_event
>>>> sleep 1
>>>> cat /mnt/tracing/trace
>>>> rmmod /home/lizf/linux-2.6-tip/samples/trace_events/*.ko
>>>> }
>>>>
>>> Could you try my git repo with my latest changes?
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
>>>
>>> branch: rfc/debugfs
>>>
>> The bug won't be triggered with your latest changes, but it has nothing
>> to do with the debugfs patch, it's because this patch:
>> "tracing/events: reuse trace event ids after overflow"
>>
>> But that patch just hides the bug not fixes it.. I thought I explained
>> the bug clearly. :(
>>
>> I think the bug is triggered because a wrong trace_event->trace() is called.
>>
>>
>> Before the above patch:
>> id overflowed
>> =>foo_bar.id == 6 == TRACE_BPRINT
>> =>echo foo_bar > set_event
>> =>sleep N secs
>> =>rmmod sample
>> =>cat trace
>> =>trace_bprint_event->trace() is called!!
>> =>oops!!!
>>
>>
>> After the above patch:
>>
>> Suppose id 0 ~ 65534 are used, only 65535 is left, and we have 2 different
>> sample module trace_event subsystems foo and bar.
>> =>insmod foo.ko, and foo.id == 65535
>> =>echo foo > set_event
>> =>sleep N secs
>> =>rmmod foo.ko
>> =>insmod bar.ko, and bar.id == 65535
>> =>cat trace
>> =>bar->trace() will be called on events which were generated by foo!
>>
>> So is it possible/sane to discard corresponding events from ring buffer when
>> we unload a module?
>>
>> Correct me if I'm wrong..
>
> OK, I understand what you are saying.
>
> You are saying that if we reuse the event id, then there is a case that
> the event can be used by another module, and if we have old data in the
> ring buffer the other module's callback will be called with the previous
> module's data.
>
> Right?
>

You got it. :)

>
> A module should not be storing pointers into the ring buffer. We do
> that with the trace_print, but I think we were suppose to do something
> special if we detect that it is a module being called.
>
> Frederic?
>
> If the module only stores data and never a pointer, then the worst thing
> stale data would do is show junk.
>
> Scanning the ring buffer and discarding the module data can be very racy.
> We would need to disable the ring buffer while doing this. Probably the
> best thing to do is to simpy reset the ring buffers if it detects that a
> module is being unloaded and has callbacks. The data in the ring buffer is
> garbage anyway, so might as well reset it.
>

I agree, if no better way.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/