Re: perf: aux area related crash and warnings

From: Vince Weaver
Date: Fri Jun 12 2015 - 14:36:55 EST


On Thu, 11 Jun 2015, Vince Weaver wrote:

> and while I was trying to cut and paste that, the locked haswell just took
> down the network switch so I can't get the rest until tomorrow.

Here are the full dumps, if anyone is interested.

The warning is reproducible; the spinlock recursion crash is not.

[36298.986117] BUG: spinlock recursion on CPU#4, perf_fuzzer/3410
[36298.992915] lock: 0xffff88011edf7cd0, .magic: dead4ead, .owner: perf_fuzzer/3410, .owner_cpu: 4
[36299.002919] CPU: 4 PID: 3410 Comm: perf_fuzzer Tainted: G W 4.1.0-rc7+ #155
[36299.012152] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[36299.020606] ffff88011edf7cd0 ffff88011eb059a0 ffffffff816d7229 0000000000000054
[36299.029199] ffff8800c2f4ac50 ffff88011eb059c0 ffffffff810c2895 ffff88011edf7cd0
[36299.037796] ffffffff81a1e481 ffff88011eb059e0 ffffffff810c2916 ffff88011edf7cd0
[36299.046338] Call Trace:
[36299.049501] <NMI> [<ffffffff816d7229>] dump_stack+0x45/0x57
[36299.056284] [<ffffffff810c2895>] spin_dump+0x85/0xe0
[36299.062282] [<ffffffff810c2916>] spin_bug+0x26/0x30
[36299.068111] [<ffffffff810c2acf>] do_raw_spin_lock+0x13f/0x180
[36299.074897] [<ffffffff816de6e9>] _raw_spin_lock+0x39/0x40
[36299.081276] [<ffffffff8117a039>] ? free_pcppages_bulk+0x39/0x620
[36299.088340] [<ffffffff8117a039>] free_pcppages_bulk+0x39/0x620
[36299.095182] [<ffffffff81177e14>] ? free_pages_prepare+0x3a4/0x550
[36299.102291] [<ffffffff811c9936>] ? kfree_debugcheck+0x16/0x40
[36299.108987] [<ffffffff8117a938>] free_hot_cold_page+0x178/0x1a0
[36299.115850] [<ffffffff8117aa47>] __free_pages+0x37/0x50
[36299.121991] [<ffffffff8116ae0a>] rb_free_aux+0xba/0xf0
[36299.128034] [<ffffffff8116b0e7>] perf_aux_output_end+0xb7/0xf0
[36299.134793] [<ffffffff81037b0e>] intel_bts_interrupt+0x8e/0xd0
[36299.141543] [<ffffffff810338bf>] intel_pmu_handle_irq+0x4f/0x450
[36299.148482] [<ffffffff810bc288>] ? check_chain_key+0x128/0x1e0
[36299.155249] [<ffffffff8102a4ab>] perf_event_nmi_handler+0x2b/0x50
[36299.162273] [<ffffffff810185d0>] nmi_handle+0xa0/0x150
[36299.168278] [<ffffffff81018535>] ? nmi_handle+0x5/0x150
[36299.174377] [<ffffffff8101887a>] default_do_nmi+0x4a/0x140
[36299.180735] [<ffffffff81018a08>] do_nmi+0x98/0xe0
[36299.186219] [<ffffffff816e13ef>] end_repeat_nmi+0x1e/0x2e
[36299.192501] [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.199951] [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.207410] [<ffffffff810bdc4e>] ? __lock_acquire.isra.31+0x27e/0x1000
[36299.214898] <<EOE>> [<ffffffff810bdd89>] ? __lock_acquire.isra.31+0x3b9/0x1000
[36299.223222] [<ffffffff810bf365>] lock_acquire+0xa5/0x130
[36299.229409] [<ffffffff8117b379>] ? __rmqueue+0x109/0x510
[36299.235551] [<ffffffff8117b379>] ? __rmqueue+0x109/0x510
[36299.241720] [<ffffffff8117810b>] bad_range+0x6b/0x150
[36299.247612] [<ffffffff8117b379>] ? __rmqueue+0x109/0x510
[36299.253757] [<ffffffff8116a8c5>] ? perf_output_begin+0x5/0x270
[36299.260489] [<ffffffff8117b379>] __rmqueue+0x109/0x510
[36299.266501] [<ffffffff8117bf63>] get_page_from_freelist+0x7e3/0xa50
[36299.273680] [<ffffffff8117c4cb>] __alloc_pages_nodemask+0x1fb/0xa90
[36299.280872] [<ffffffff811c416b>] ? alloc_pages_vma+0x10b/0x290
[36299.287586] [<ffffffff810bdd89>] ? __lock_acquire.isra.31+0x3b9/0x1000
[36299.295017] [<ffffffff811c416b>] alloc_pages_vma+0x10b/0x290
[36299.301544] [<ffffffff8119f1a4>] ? wp_page_copy.isra.51+0x84/0x4f0
[36299.308613] [<ffffffff8119f1a4>] wp_page_copy.isra.51+0x84/0x4f0
[36299.315506] [<ffffffff811a0d94>] ? do_wp_page+0x84/0x6a0
[36299.321633] [<ffffffff811a0dad>] do_wp_page+0x9d/0x6a0
[36299.327533] [<ffffffff811a2b69>] ? handle_mm_fault+0x309/0x1770
[36299.334245] [<ffffffff811a364c>] handle_mm_fault+0xdec/0x1770
[36299.340854] [<ffffffff81063034>] __do_page_fault+0x1b4/0x460
[36299.347337] [<ffffffff8106334e>] trace_do_page_fault+0x3e/0xc0
[36299.354021] [<ffffffff816e1068>] trace_page_fault+0x28/0x30
[36315.086708] ------------[ cut here ]------------


and the warning:

[27716.785131] WARNING: CPU: 2 PID: 17655 at kernel/events/ring_buffer.c:282 perf_aux_output_begin+0x1ce/0x1f0()
[27716.796157] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel snd_hda_controller crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec aesni_intel snd_hda_core aes_x86_64 iTCO_wdt lrw snd_hwdep gf128mul snd_pcm iTCO_vendor_support ppdev i915 snd_timer glue_helper evdev ablk_helper snd cryptd mei_me mei psmouse pcspkr soundcore serio_raw lpc_ich drm_kms_helper parport_pc battery tpm_tis mfd_core parport drm video tpm i2c_i801 i2c_algo_bit wmi processor button sg sr_mod cdrom sd_mod ehci_pci ehci_hcd ahci libahci libata xhci_pci xhci_hcd e1000e ptp scsi_mod crc32c_intel usbcore pps_core usb_common thermal fan thermal_sys
[27716.873350] CPU: 2 PID: 17655 Comm: perf_fuzzer Tainted: G W 4.1.0-rc7+ #155
[27716.882671] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[27716.891168] ffffffff81a230f2 ffff8800cf9afc68 ffffffff816d7229 0000000000000000
[27716.899782] 0000000000000000 ffff8800cf9afca8 ffffffff81072eba 000000000000cf80
[27716.908372] ffff880036d85a00 ffff88011ea8cf80 ffff880119106000 000000000000cf80
[27716.916981] Call Trace:
[27716.920180] [<ffffffff816d7229>] dump_stack+0x45/0x57
[27716.926250] [<ffffffff81072eba>] warn_slowpath_common+0x8a/0xc0
[27716.933244] [<ffffffff81072faa>] warn_slowpath_null+0x1a/0x20
[27716.939999] [<ffffffff8116b00e>] perf_aux_output_begin+0x1ce/0x1f0
[27716.947225] [<ffffffff81164785>] ? perf_event_update_userpage+0x5/0x170
[27716.954875] [<ffffffff8103767d>] bts_event_add+0x6d/0x150
[27716.961259] [<ffffffff81162ddb>] event_sched_in.isra.85+0xfb/0x330
[27716.968495] [<ffffffff811630ca>] group_sched_in+0xba/0x1e0
[27716.974957] [<ffffffff8101df2a>] ? native_sched_clock+0x2a/0x90
[27716.981888] [<ffffffff811633c0>] ctx_sched_in+0x1d0/0x3f0
[27716.988274] [<ffffffff81163649>] perf_event_sched_in+0x69/0xa0
[27716.995082] [<ffffffff81163bc0>] perf_event_context_sched_in+0x90/0x160
[27717.002714] [<ffffffff81164148>] __perf_event_task_sched_in+0xa8/0xc0
[27717.010160] [<ffffffff8109897d>] finish_task_switch+0xed/0x140
[27717.016988] [<ffffffff81012723>] ? __switch_to+0x263/0x5d0
[27717.023400] [<ffffffff8109e5a1>] schedule_tail+0x11/0xa0
[27717.029652] [<ffffffff816df57f>] ret_from_fork+0xf/0x70
[27717.035806] ---[ end trace bdbfeef5d81e5fa4 ]---

which corresponds to
	/*
	 * Nesting is not supported for AUX area, make sure nested
	 * writers are caught early
	 */
	if (WARN_ON_ONCE(local_xchg(&rb->aux_nest, 1)))
		goto err_put;
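
For anyone not familiar with that check, here is a rough userspace sketch of the same exchange-based guard. It is only an illustration, not the kernel code: struct toy_rb, toy_aux_begin() and toy_aux_end() are made-up names, C11 atomic_exchange() stands in for local_xchg(), and clearing the flag in the end function is my assumption about how the guard gets released.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring buffer; only the nesting flag is modelled. */
struct toy_rb {
	atomic_long aux_nest;
};

/*
 * Try to become the single AUX writer.
 * atomic_exchange() stores 1 and returns the previous value, so a
 * nonzero result means another begin is still outstanding (nesting).
 */
static bool toy_aux_begin(struct toy_rb *rb)
{
	if (atomic_exchange(&rb->aux_nest, 1) != 0)
		return false;	/* nested writer caught, caller must bail out */

	return true;
}

/* Release the guard (assumed counterpart: reset the flag to 0). */
static void toy_aux_end(struct toy_rb *rb)
{
	atomic_store(&rb->aux_nest, 0);
}
```

With this sketch, a second toy_aux_begin() without an intervening toy_aux_end() returns false, which is the situation the WARN_ON_ONCE above flags.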
