Re: [PATCH v2 linux-trace 1/8] tracing: attach eBPF programs to tracepoints and syscalls

From: Alexei Starovoitov
Date: Wed Jan 28 2015 - 23:40:59 EST


On Wed, Jan 28, 2015 at 4:46 PM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>>
>> +static int event_filter_release(struct inode *inode, struct file *filp)
>> +{
>> + struct ftrace_event_file *file;
>> + char buf[2] = "0";
>> +
>> + mutex_lock(&event_mutex);
>> + file = event_file_data(filp);
>> + if (file) {
>> + if (file->flags & TRACE_EVENT_FL_BPF) {
>> + /* auto-disable the filter */
>> + ftrace_event_enable_disable(file, 0);
>
> Hmm.. what if user already enabled an event, attached a bpf filter and
> then detached the filter - I'm not sure we can always auto-disable
> it..

why not?
I think it makes sense auto enable/disable, since that
is cleaner user experience.
Otherwise Ctrl-C of the user process will have bpf program dangling.
not good. If we auto-unload bpf program only, it's equally bad.
Since Ctrl-C of the process will auto-onload only
and will keep tracepoint enabled which will be spamming
the trace buffer.

>> +unsigned int trace_filter_call_bpf(struct event_filter *filter, void *ctx)
>> +{
>> + unsigned int ret;
>> +
>> + if (in_nmi()) /* not supported yet */
>> + return 0;
>
> But doesn't this mean to auto-disable all attached events during NMI
> as returning 0 will prevent the event going to ring buffer?

well, it means that if tracepoint fired during nmi the program
won't be called and event won't be sent to trace buffer.
The program might be broken (like divide by zero) and
it will self-terminate with 'return 0'
so zero should be the safest return value that
causes minimum disturbance to the whole system overall.

> I think it'd be better to keep an attached event in a soft-disabled
> state like event trigger and give control of enabling to users..

I think it suffers from the same Ctrl-C issue.
Say, attaching bpf program activates tracepoint and keeps
it in soft-disabled. Then user space clears soft-disabled.
Then user Ctrl-C it. Now bpf program must auto-detach
and unload, since prog_fd is closing.
If we don't completely deactivate tracepoint, then
Ctrl-C will leave the state of the system in the state
different from it was before user process started running.
I think we must avoid such situation.
'kill pid' should be completely cleaning all resources
that user process was using.
Yes. It's different from typical usage of /sys/.../tracing
that has all global knobs, but, imo, it's cleaner this way.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/