Re: tracing ring_buffer_resize oops.

From: Andi Kleen
Date: Thu May 24 2012 - 20:14:36 EST


On Thu, May 24, 2012 at 07:40:16PM -0400, Steven Rostedt wrote:
> On Thu, 2012-05-24 at 13:22 -0400, Dave Jones wrote:
>
> I found a clue!
>
>
> > [ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
> > [ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff
> > [ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0
> > [ 1013.298832] Oops: 0010 [#1] PREEMPT SMP
> > [ 1013.310600] CPU 2
> > [ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
> > [ 1013.401848]
> > [ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30
> > [ 1013.437943] RIP: 8eb8:[<ffff88014630a000>] [<ffff88014630a000>] 0xffff880146309fff
>
> RIP is always near the GS segment. As GS points to the per_cpu area, we
> may somehow be getting our GS screwed up. I'm not sure why that would
> affect the RIP. Maybe stacks are not being processed properly somewhere?
>
> It's strange because I can either trigger it on the first try, or it
> never triggers at all??

I think this could happen if your SWAPGS state gets screwed up
(so you do a mismatched swapgs). In the early days of the port I fought
with this a lot.

One easy way to debug it is to read the GS base MSR early and
double-check that it's as expected.
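
Roughly what I mean, as an untested userspace sketch rather than a real
patch. The MSR numbers are the architectural ones; the helper names, the
48-bit address split and the "kernel half" test are assumptions made up for
illustration. In the kernel you would get the value with something like
rdmsrl(MSR_GS_BASE, val) at the suspect entry point and use BUG_ON() or a
printk instead of assert().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86-64 MSR numbers. */
#define MSR_GS_BASE        0xc0000101u /* GS base currently in use */
#define MSR_KERNEL_GS_BASE 0xc0000102u /* value that swapgs exchanges it with */

/* Assumption: 48-bit virtual addresses, kernel in the upper canonical half. */
#define KERNEL_HALF_START 0xffff800000000000ull

/* Hypothetical helper: true if addr lies in the kernel half. */
static bool is_kernel_address(uint64_t addr)
{
	return addr >= KERNEL_HALF_START;
}

/*
 * Hypothetical check: while running in the kernel with correctly paired
 * swapgs, the active GS base should point at the per-CPU area, which is
 * a kernel address. A user value here (often 0 or a low address)
 * suggests a missing or extra swapgs.
 */
static bool gs_base_looks_sane(uint64_t active_gs_base)
{
	return is_kernel_address(active_gs_base);
}

/* Userspace stand-in for a BUG_ON() placed early in the suspect path. */
static void assert_gs_base_sane(uint64_t active_gs_base)
{
	assert(gs_base_looks_sane(active_gs_base));
}
```

If the check fires on the first entry after boot, the swapgs mismatch is
probably in an entry or exit path rather than in the ring buffer code.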

-Andi