Re: [PATCH 3/3] tracing/mm: Don't trace mm_page_pcpu_drain on offline cpus

From: Preeti Murthy
Date: Wed Apr 29 2015 - 05:14:03 EST


Cc'ing Paul.

On Tue, Apr 28, 2015 at 9:21 PM, Shreyas B. Prabhu
<shreyas@xxxxxxxxxxxxxxxxxx> wrote:
> Since tracepoints use RCU for protection, they must not be called on
> offline cpus. trace_mm_page_pcpu_drain can be called on an offline cpu
> in this scenario caught by LOCKDEP:
>
> ===============================
> [ INFO: suspicious RCU usage. ]
> 4.1.0-rc1+ #9 Not tainted
> -------------------------------
> include/trace/events/kmem.h:265 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
> RCU used illegally from offline CPU!
> rcu_scheduler_active = 1, debug_locks = 1
> 1 lock held by swapper/5/0:
> #0: (&(&zone->lock)->rlock){..-...}, at: [<c0000000002073b0>] .free_pcppages_bulk+0x70/0x920
>
> stack backtrace:
> CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1+ #9
> Call Trace:
> [c000001fed2e7720] [c0000000009dee8c] .dump_stack+0x98/0xd4 (unreliable)
> [c000001fed2e77a0] [c000000000128d88] .lockdep_rcu_suspicious+0x108/0x170
> [c000001fed2e7830] [c00000000020794c] .free_pcppages_bulk+0x60c/0x920
> [c000001fed2e7980] [c000000000208188] .free_hot_cold_page+0x208/0x280
> [c000001fed2e7a30] [c00000000004d000] .destroy_context+0x90/0xd0
> [c000001fed2e7ab0] [c0000000000bd1d8] .__mmdrop+0x58/0x160
> [c000001fed2e7b40] [c0000000001068e0] .idle_task_exit+0xf0/0x100
> [c000001fed2e7bc0] [c000000000066948] .pnv_smp_cpu_kill_self+0x58/0x2c0
> [c000001fed2e7ca0] [c00000000003ce34] .cpu_die+0x34/0x50
> [c000001fed2e7d10] [c0000000000176d0] .arch_cpu_idle_dead+0x20/0x40
> [c000001fed2e7d80] [c00000000011f9a8] .cpu_startup_entry+0x708/0x7a0
> [c000001fed2e7ec0] [c00000000003cb6c] .start_secondary+0x36c/0x3a0
> [c000001fed2e7f90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
>
> Fix this by converting mm_page_pcpu_drain trace point into TRACE_EVENT_CONDITION
> where condition is cpu_online(smp_processor_id())
>
> Signed-off-by: Shreyas B. Prabhu <shreyas@xxxxxxxxxxxxxxxxxx>
> ---
> include/trace/events/kmem.h | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index 4abda92..6cd975f 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -257,12 +257,26 @@ DEFINE_EVENT(mm_page, mm_page_alloc_zone_locked,
> TP_ARGS(page, order, migratetype)
> );
>
> -DEFINE_EVENT_PRINT(mm_page, mm_page_pcpu_drain,
> +TRACE_EVENT_CONDITION(mm_page_pcpu_drain,
>
> TP_PROTO(struct page *page, unsigned int order, int migratetype),
>
> TP_ARGS(page, order, migratetype),
>
> + TP_CONDITION(cpu_online(smp_processor_id())),
> +
> + TP_STRUCT__entry(
> + __field( unsigned long, pfn )
> + __field( unsigned int, order )
> + __field( int, migratetype )
> + ),
> +
> + TP_fast_assign(
> + __entry->pfn = page ? page_to_pfn(page) : -1UL;
> + __entry->order = order;
> + __entry->migratetype = migratetype;
> + ),
> +

Why were the changes above needed, besides adding TP_CONDITION?
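
For reference, a minimal userspace sketch of the effect TP_CONDITION is
expected to have: the condition is evaluated first, and the probe body is
skipped entirely when it is false. This is not the kernel's macro expansion.
TRACE_IF, fake_cpu_online, probe_hits and probe_mm_page_pcpu_drain are
hypothetical names made up for illustration, and the assert() in the probe
only stands in for the lockdep check that produced the splat quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for cpu_online(smp_processor_id()). */
static bool fake_cpu_online = true;

/* Counts how many times the probe body actually ran. */
static int probe_hits;

/*
 * Stand-in for the probe body. In the kernel this part relies on RCU,
 * so it must never run on an offline cpu; the assert models that rule.
 */
static void probe_mm_page_pcpu_drain(unsigned long pfn, unsigned int order,
                                     int migratetype)
{
    assert(fake_cpu_online);
    probe_hits++;
    printf("pfn=%lu order=%u migratetype=%d\n", pfn, order, migratetype);
}

/* Evaluate the condition first; call the probe only if it holds. */
#define TRACE_IF(cond, probe, ...)      \
    do {                                \
        if (cond)                       \
            probe(__VA_ARGS__);         \
    } while (0)

/* Call-site wrapper, analogous in spirit to trace_mm_page_pcpu_drain(). */
static void trace_mm_page_pcpu_drain(unsigned long pfn, unsigned int order,
                                     int migratetype)
{
    TRACE_IF(fake_cpu_online, probe_mm_page_pcpu_drain,
             pfn, order, migratetype);
}
```

Calling trace_mm_page_pcpu_drain() with fake_cpu_online set to false returns
without touching the probe, so the assert in the probe never fires.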

Regards
Preeti U Murthy