Re: [PATCH] rcu: fix crash when reading rcudata debugfs node

From: Paul E. McKenney
Date: Thu Jun 29 2017 - 13:07:34 EST


On Thu, Jun 29, 2017 at 11:25:08AM -0500, Leon Yang wrote:
> With CONFIG_RCU_TRACE=y set, reading the rcu/<rcu_flavor>/rcudata file
> in debugfs leads to a segmentation fault.
>
> dmesg reports general protection fault with the following message:
>
> task: ffffa0e715959d00 task.stack: ffffc117c7390000
> RIP: 0010:show_rcudata+0x4e/0x1e0
> RSP: 0018:ffffc117c7393d50 EFLAGS: 00010282
> RAX: ffffa0e8d7212098 RBX: ffffa0e8d721cd00 RCX: ffffa0e8d7200000
> RDX: 0000000000000000 RSI: 0000000000000e6c RDI: 0000000000000000
> RBP: ffffc117c7393d90 R08: 0000000000000000 R09: 0000000000000fff
> R10: ffffa0e8bf464000 R11: ffffa0e715959d00 R12: ffffa0e81ecf4100
> R13: ffffa0e8c51fcd00 R14: ffffa0e8d721cd00 R15: ffffa0e81ecf4100
> FS: 00007fea5a064700(0000) GS:ffffa0e8d7440000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fea5a041008 CR3: 00000006035f1000 CR4: 00000000000406e0
> Call Trace:
> seq_read+0x9e/0x440
> ? lru_cache_add_active_or_unevictable+0x36/0xb0
> full_proxy_read+0x54/0x90
> __vfs_read+0x37/0x150
> ? security_file_permission+0x9b/0xc0
> vfs_read+0x93/0x130
> SyS_read+0x55/0xc0
> entry_SYSCALL_64_fastpath+0x1e/0xa9
> RIP: 0033:0x7fea59b89230
> RSP: 002b:00007ffca86f0418 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 00007fea59e56b20 RCX: 00007fea59b89230
> RDX: 0000000000020000 RSI: 00007fea5a042000 RDI: 0000000000000003
> RBP: 0000000000021010 R08: ffffffffffffffff R09: 0000000000000000
> R10: 000000000000037b R11: 0000000000000246 R12: 0000000000022000
> R13: 00007fea59e56b78 R14: 0000000000001000 R15: 0000000000020000
> Code: f3 48 83 ec 18 48 63 8e 70 01 00 00 48 8b 86 c8 00 00 00 44 0f
> b6 46 1a 48 8b 76 10 48 83 c0 18 48 89 ca 48 8b 0c cd e0 f3 54 90 <48>
> 39 34 01 0f b6 73 18 89 d0 40 0f 94 c7 48 0f a3 05 5c 0f e8
>
> In print_one_rcu_data(), the argument rdp is already a per_cpu pointer
> (see r_start():66). Therefore, calling per_cpu() on
> rdp->dynticks->rcu_qs_ctr dereferences an invalid memory address.
> Since rdp->dynticks points to the per_cpu struct rcu_dynticks, using
> rdp->dynticks->rcu_qs_ctr directly will fix this problem.
>
> Fixes: 9577df9a3122af08fff84b8a1a60dccf524a3891 ("rcu: Pull rcu_qs_ctr
> into rcu_dynticks structure")
> Signed-off-by: Leon Yang <leon.gh.yang@xxxxxxxxx>

Good catch!

However, I have removed RCU's debugfs functionality because I have
since added event tracing. It has been some years since I have used
RCU's debugfs, which might explain its bugginess.

If you tell me what you wanted to use the debugfs for, I would be
happy to tell you how to get some reasonable facsimile of that
information from event tracing.

Thanx, Paul

> ---
> --- kernel/rcu/tree_trace.c.orig
> +++ kernel/rcu/tree_trace.c
> @@ -121,7 +121,7 @@ static void print_one_rcu_data(struct se
> cpu_is_offline(rdp->cpu) ? '!' : ' ',
> ulong2long(rdp->completed), ulong2long(rdp->gpnum),
> rdp->cpu_no_qs.b.norm,
> - rdp->rcu_qs_ctr_snap == per_cpu(rdp->dynticks->rcu_qs_ctr, rdp->cpu),
> + rdp->rcu_qs_ctr_snap == rdp->dynticks->rcu_qs_ctr,
> rdp->core_needs_qs);
> seq_printf(m, " dt=%d/%llx/%d df=%lu",
> rcu_dynticks_snap(rdp->dynticks),
>
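
The double translation described in the commit message can be
illustrated with a small userspace sketch in C. This is a toy model and
not kernel code: the names sim_arena, sim_per_cpu(), sim_fixed_slot()
and sim_buggy_slot() are hypothetical, and slot indexes into one array
stand in for the kernel's per-CPU address arithmetic. It assumes that
per_cpu() works by adding a CPU-specific offset to a base address, so
applying it to a pointer that already refers to one CPU's copy adds the
offset a second time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not a kernel symbol. */
#define SIM_NR_CPUS       4
#define SIM_WORDS_PER_CPU 8
#define SIM_ARENA_WORDS   (SIM_NR_CPUS * SIM_WORDS_PER_CPU)
#define SIM_QS_CTR_SLOT   2   /* slot of the counter inside one CPU's area */

/* One contiguous arena; CPU n owns slots [n * 8, n * 8 + 7]. */
static unsigned long sim_arena[SIM_ARENA_WORDS];

/* Offset of a CPU's area from the start of the arena, in slots. */
static size_t sim_cpu_offset(int cpu)
{
	return (size_t)cpu * SIM_WORDS_PER_CPU;
}

/* Analogue of per_cpu(): translate a slot by adding the CPU's offset. */
static size_t sim_per_cpu(size_t slot, int cpu)
{
	return slot + sim_cpu_offset(cpu);
}

/* True if the slot lies inside the simulated per-CPU arena. */
static bool sim_in_arena(size_t slot)
{
	return slot < SIM_ARENA_WORDS;
}

/*
 * Patched behavior: the pointer was translated once when it was set up,
 * so it is used as is.
 */
static size_t sim_fixed_slot(int cpu)
{
	return sim_per_cpu(SIM_QS_CTR_SLOT, cpu);
}

/*
 * Pre-patch behavior: the already-translated location is passed through
 * the per-CPU translation again, so the offset is added twice.
 */
static size_t sim_buggy_slot(int cpu)
{
	return sim_per_cpu(sim_fixed_slot(cpu), cpu);
}
```

With four simulated CPUs, sim_fixed_slot(3) is slot 26, which is CPU 3's
own copy, while sim_buggy_slot(3) is slot 50, which is past the end of
the 32-slot arena. For CPU 1 the doubled offset lands on slot 18, which
is inside the arena but belongs to CPU 2, so in this model the mistake
can read the wrong CPU's data as well as fault.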