Re: [PATCH 1/2] tracing/ftrace: don't trace on early stage of secondary cpu boot

From: Steven Rostedt
Date: Tue Dec 23 2008 - 21:19:29 EST



On Wed, 24 Dec 2008, Frederic Weisbecker wrote:

> Impact: fix a crash/hard reboot when enabling a cpu at runtime
>
> On some archs, a secondary cpu can be in a fragile state early in its boot.
> On x86-64, the pda is not initialized during the first stage of a cpu boot, but
> it is needed to get the cpu number and the current task pointer. This data
> is needed during tracing. Because it was dereferenced at this stage, we got a
> crash when bringing a cpu online at runtime while tracing.
>
> Some other archs, such as ia64, can have this kind of issue too.
>
> This patch implements a function to verify that the cpu has been correctly initialized
> before tracing it. By default, the check does nothing, but archs can override it (as
> x86-64 does in this patch).
>
> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> ---
> arch/x86/kernel/ftrace.c | 13 +++++++++++++
> include/linux/ftrace.h | 8 ++++++++
> kernel/trace/ftrace.c | 6 ++++++
> kernel/trace/trace.c | 6 +++---
> 4 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 1b43086..7693c53 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -24,6 +24,19 @@
> #include <asm/nmi.h>
>
>
> +#ifdef CONFIG_X86_64
> +/*
> + * A cpu doesn't want to be traced if pda hasn't yet been initialized.
> + * Pda contains the cpu number and current task pointer. A lot
> + * of functions use them, not only ftrace.
> + * We use stack_smp_processor_id() here to avoid using the pda.
> + */
> +bool ftrace_in_early_cpuinit(void)
> +{
> + return !cpu_isset(stack_smp_processor_id(), cpu_online_map);
> +}
> +#endif

Is there something lighter weight than this? This is called from the
function tracer, which runs at every function. I would also rather
this be an inline function defined in asm/ftrace.h if possible.
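
To illustrate the inline suggestion, here is a minimal user-space sketch, not kernel code. It models an arch header supplying a static inline check, with a generic no-op fallback in place of the weak symbol. Every name in it (the model_* identifiers, MODEL_NR_CPUS, ARCH_HAS_EARLY_CPUINIT_CHECK) is hypothetical, and the per-cpu ready flag is only a stand-in for the cpu_online_map test in the patch.

```c
/* User-space model of an inline arch override; not kernel code. */
#include <stdbool.h>

/* Hypothetical stand-in for "this cpu has finished its early init". */
#define MODEL_NR_CPUS 4
static bool model_cpu_ready[MODEL_NR_CPUS];

/* An arch header would define this when it provides its own check. */
#define ARCH_HAS_EARLY_CPUINIT_CHECK

#ifdef ARCH_HAS_EARLY_CPUINIT_CHECK
/* Arch version: a cpu that is not yet marked ready must not be traced. */
static inline bool model_in_early_cpuinit(int cpu)
{
	return !model_cpu_ready[cpu];
}
#else
/* Generic version: nothing to check, the compiler can drop the branch. */
static inline bool model_in_early_cpuinit(int cpu)
{
	(void)cpu;
	return false;
}
#endif

/* Counts how many calls reached the tracing body. */
static int model_trace_hits;

/* Models the tracer callback that runs at every traced function. */
static inline void model_function_trace_call(int cpu)
{
	if (model_in_early_cpuinit(cpu))
		return;
	model_trace_hits++;
}
```

With an inline definition the check is compiled into the callback itself, so archs that do not need it pay nothing and archs that do avoid an extra function call on every traced function.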

> +
> #ifdef CONFIG_DYNAMIC_FTRACE
>
> union ftrace_code_union {
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 677432b..9a86e3d 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -78,6 +78,14 @@ void clear_ftrace_function(void);
>
> extern void ftrace_stub(unsigned long a0, unsigned long a1);
>
> +/*
> + * On some archs, the secondary cpu boot is sometimes fragile.
> + * Some important datas could be not yet initialized. In such states,
> + * tracing is too dangerous and we use this function to detect these
> + * kinds of context. Archs which have this sort of issues can override it.
> + */
> +extern bool ftrace_in_early_cpuinit(void);
> +
> #else /* !CONFIG_FUNCTION_TRACER */
> # define register_ftrace_function(ops) do { } while (0)
> # define unregister_ftrace_function(ops) do { } while (0)
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 2f32969..463e1eb 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1962,6 +1962,12 @@ ftrace_enable_sysctl(struct ctl_table *table, int write,
> return ret;
> }
>
> +/* Overriden by archs if needed */
> +bool __attribute__((weak)) ftrace_in_early_cpuinit(void)
> +{
> + return false;
> +}
> +
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>
> static atomic_t ftrace_graph_active;
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index b07e34e..1ac9b98 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1175,7 +1175,7 @@ function_trace_call_preempt_only(unsigned long ip, unsigned long parent_ip)
> int cpu, resched;
> int pc;
>
> - if (unlikely(!ftrace_function_enabled))
> + if (unlikely(!ftrace_function_enabled || ftrace_in_early_cpuinit()))
> return;
>
> pc = preempt_count();
> @@ -1202,7 +1202,7 @@ function_trace_call(unsigned long ip, unsigned long parent_ip)
> int cpu;
> int pc;
>
> - if (unlikely(!ftrace_function_enabled))
> + if (unlikely(!ftrace_function_enabled || ftrace_in_early_cpuinit()))
> return;
>
> /*
> @@ -1233,7 +1233,7 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
> int cpu;
> int pc;
>
> - if (!ftrace_trace_task(current))
> + if (!ftrace_trace_task(current) || ftrace_in_early_cpuinit())
> return 0;

Actually, didn't you say current is not available either? We are testing
current first.
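
The ordering concern can be shown with a small user-space model, not kernel code. All names below (model_early, model_get_current, model_trace_task, model_graph_entry) are hypothetical. The sketch assumes that reading current is unsafe during early cpu init and relies on || short-circuiting, so that the early check runs before current is ever read.

```c
/* User-space model of the check ordering; not kernel code. */
#include <stdbool.h>
#include <stddef.h>

struct model_task {
	bool traced;
};

static struct model_task model_task0 = { .traced = true };

/* True while the modelled cpu is still in its fragile early state. */
static bool model_early = true;

/* Counts how often "current" was read. */
static int model_current_reads;

/* Models current: returns NULL while the cpu is not set up yet. */
static struct model_task *model_get_current(void)
{
	model_current_reads++;
	return model_early ? NULL : &model_task0;
}

/* Dereferences the task, so it must never see the early NULL. */
static bool model_trace_task(struct model_task *t)
{
	return t->traced;
}

/* Early check first: || stops before current is touched. */
static int model_graph_entry(void)
{
	if (model_early || !model_trace_task(model_get_current()))
		return 0;
	return 1;
}
```

Swapping the two operands of the || would read current before the early check, which is the situation described above.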

I still wonder if there's a better way to find out if it is safe to run.
Perhaps the arch code can test the %gs register to see if it is actually
something valid?

-- Steve


>
> if (!ftrace_graph_addr(trace->func))
> --
> 1.6.0.4
>
>
>