perf: fuzzer triggered trouble on AMD, maybe ibs related

From: Vince Weaver
Date: Thu Oct 22 2015 - 12:45:37 EST


Hello

I've been busy but finally had a chance to run perf_fuzzer on current git.
I am running on an AMD A10 system (my traditional Haswell system is
otherwise occupied).

I got the following WARNING, followed by an NMI storm. The storm
eventually caused a write to sda to time out, at which point ext4 aborted
its journal and my / partition was remounted read-only. Very alarming.

The WARNING comes from this check in perf_ibs_start()
(arch/x86/kernel/cpu/perf_event_amd_ibs.c:372):

	static void perf_ibs_start(struct perf_event *event, int flags)
	...
		if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
			return;
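
For anyone not familiar with that guard, here is a minimal userspace sketch
of what it tests. It is an illustration only, not the kernel code: struct
fake_hwc, fake_ibs_start() and fake_ibs_stop() are hypothetical names, the
flag values are assumed to match include/linux/perf_event.h, and the stop
path is assumed to set both flags. The point is that a start is only
expected on an event whose state has PERF_HES_STOPPED set, so a second
start without an intervening stop trips the warning.

```c
#include <stdbool.h>
#include <stdio.h>

/* Flag values assumed to match include/linux/perf_event.h. */
#define PERF_HES_STOPPED  0x01
#define PERF_HES_UPTODATE 0x02

/* Hypothetical stand-in for struct hw_perf_event; only the state word is modeled. */
struct fake_hwc {
	int state;
};

/*
 * Models the guard quoted above: refuse to start an event that is not
 * marked stopped. Returns true if the start went ahead, false if the
 * guard rejected it (where the kernel would emit the WARNING).
 */
static bool fake_ibs_start(struct fake_hwc *hwc)
{
	if (!(hwc->state & PERF_HES_STOPPED)) {
		fprintf(stderr, "WARN: start on an event that is not stopped\n");
		return false;
	}
	hwc->state = 0;	/* the event is now considered running */
	return true;
}

/* Assumed stop path: mark the event stopped and its count up to date. */
static void fake_ibs_stop(struct fake_hwc *hwc)
{
	hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
}
```

In this model, calling fake_ibs_start() twice in a row on the same event
makes the second call fail, while a fake_ibs_stop() in between lets it
succeed again. The trace below suggests the real start was reached from
perf_event_task_tick() on an event that was not in the stopped state.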

[ 359.629045] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_amd_ibs.c:372 perf_ibs_start+0x43/0x131()
[ 359.639091] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc nls_utf8 nls_cp437 vfat fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm_amd kvm sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 snd_hda_intel ablk_helper cryptd snd_hda_codec lrw snd_hda_core gf128mul glue_helper ppdev snd_hwdep hp_wmi snd_pcm evdev sparse_keymap snd_timer pl2303 radeon ttm drm_kms_helper tpm_infineon pcspkr drm efivars psmouse serio_raw i2c_piix4 i2c_algo_bit usbserial fb_sys_fops shpchp k10temp parport_pc snd syscopyarea i2c_core parport soundcore tpm_tis wmi sysfillrect button tpm sysimgblt acpi_cpufreq processor sg sr_mod cdrom sd_mod ohci_pci ahci libahci tg3 xhci_pci ptp pps_core libata xhci_hcd ohci_hcd ehci_pci libphy ehci_hcd crc32c_intel
[ 359.711502] scsi_mod usbcore usb_common
[ 359.714203] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.3.0-rc6+ #12
[ 359.721804] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
[ 359.730808] 0000000000000006 ffffffff8123e6b7 0000000000000000 ffffffff8104519a
[ 359.738322] ffffffff8102a003 ffff880224098c00 ffffe8ffffc036d0 ffffffff81824ec0
[ 359.745832] ffff88022ec0f8e0 ffffffff8102a003 ffff880224098c00 ffffe8ffffc06a70
[ 359.753328] Call Trace:
[ 359.755793] <IRQ> [<ffffffff8123e6b7>] ? dump_stack+0x40/0x50
[ 359.761762] [<ffffffff8104519a>] ? warn_slowpath_common+0x94/0xa9
[ 359.767963] [<ffffffff8102a003>] ? perf_ibs_start+0x43/0x131
[ 359.773730] [<ffffffff8102a003>] ? perf_ibs_start+0x43/0x131
[ 359.779495] [<ffffffff810d8842>] ? perf_event_task_tick+0x101/0x1b5
[ 359.785874] [<ffffffff8109476c>] ? tick_sched_do_timer+0x24/0x24
[ 359.791990] [<ffffffff81063628>] ? scheduler_tick+0x64/0x7d
[ 359.797673] [<ffffffff810896fd>] ? update_process_times+0x3b/0x45
[ 359.803876] [<ffffffff810942d3>] ? tick_sched_handle+0x3e/0x4a
[ 359.809820] [<ffffffff8109479b>] ? tick_sched_timer+0x2f/0x53
[ 359.815676] [<ffffffff81089f55>] ? __hrtimer_run_queues+0xb9/0x18b
[ 359.821967] [<ffffffff8108a1e8>] ? hrtimer_interrupt+0x61/0x101
[ 359.827995] [<ffffffff8102d417>] ? smp_apic_timer_interrupt+0x20/0x2f
[ 359.834549] [<ffffffff8141e58f>] ? apic_timer_interrupt+0x7f/0x90
[ 359.840745] <EOI> [<ffffffff8133f769>] ? cpuidle_enter_state+0xf3/0x145
[ 359.847579] [<ffffffff8106ebab>] ? cpu_startup_entry+0x170/0x1db
[ 359.853694] [<ffffffff818eddfd>] ? start_kernel+0x40b/0x413
[ 359.859371] ---[ end trace 93964ed985254224 ]---
[ 360.468852] Uhhuh. NMI received for unknown reason 2d on CPU 2.
[ 360.474790] Do you have a strange power saving mode enabled?
[ 360.480454] Dazed and confused, but trying to continue
[ 360.695032] Uhhuh. NMI received for unknown reason 2d on CPU 1.
[ 360.700985] Do you have a strange power saving mode enabled?
[ 360.706666] Dazed and confused, but trying to continue
[ 361.739498] Uhhuh. NMI received for unknown reason 3d on CPU 0.
[ 361.745438] Do you have a strange power saving mode enabled?
[ 361.751104] Dazed and confused, but trying to continue
[ 361.828053] Uhhuh. NMI received for unknown reason 3d on CPU 0.
[ 361.833989] Do you have a strange power saving mode enabled?
[ 361.839677] Dazed and confused, but trying to continue

.....

[ 468.763231] Dazed and confused, but trying to continue
[ 468.794184] Uhhuh. NMI received for unknown reason 2d on CPU 2.
[ 468.794184] Do you have a strange power saving mode enabled?
[ 468.794184] Dazed and confused, but trying to continue
[ 473.190535] sd 0:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[ 473.199631] sd 0:0:0:0: [sda] tag#2 CDB: Write(10) 2a 00 39 93 49 d0 00 00 18 00
[ 473.207789] blk_update_request: I/O error, dev sda, sector 965954000
[ 473.214857] Aborting journal on device sda2-8.
[ 473.214868] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7158 pages, ino 27394094; err -30
[ 473.214880] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
[ 473.215802] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27394094; err -30
[ 473.215806] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
[ 473.215811] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27394094; err -30
[ 473.215814] EXT4-fs (sda2): ext4_writepages: jbd2_start: 7168 pages, ino 27395265; err -30
[ 473.215849] EXT4-fs (sda2): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 27394094; err -30
[ 473.215859] EXT4-fs (sda2): ext4_writepages: jbd2_start: 9223372036854775807 pages, ino 27395265; err -30
[ 473.409076] EXT4-fs error (device sda2): ext4_journal_check_start:56: Detected aborted journal
[ 473.419003] EXT4-fs (sda2): Remounting filesystem read-only
