2.6.38.8 kernel bug in XFS or megaraid driver with heavy I/O load

From: Anders Ossowicki
Date: Tue Oct 11 2011 - 05:18:13 EST


We seem to have hit a bug on our brand-new disk array with an XFS filesystem
on the 2.6.38.8 kernel. The array consists of two Dell MD1220 enclosures with
Intel SSDs, daisy-chained behind an LSI MegaRAID SAS 9285-8e RAID controller.
It had been under heavy I/O load, 1-200 MB/s r/w from PostgreSQL, for about a
week before the bug showed up. The system itself is a Dell PowerEdge R815 with
32 CPU cores and 256 GB of memory.

Support for the 9285-8e controller was introduced as part of a series of
patches for drivers/scsi/megaraid in 2.6.38 (0d49016b..cd50ba8e). Since the
megaraid driver support for this controller is so new, it might be the real
source of the issue, but this is pure speculation on my part. Any suggestions
would be most welcome.

The full dmesg is available at
http://dev.exherbo.org/~arkanoid/kat-dmesg-2011-10.txt

BUG: unable to handle kernel paging request at 000000000040403c
IP: [<ffffffff810f8d71>] find_get_pages+0x61/0x110
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu31/cache/index2/shared_cpu_map
CPU 11
Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs
minix ntfs vfat msdos fat jfs xfs reiserfs nfsd exportfs nfs lockd nfs_acl
auth_rpcgss sunrpc autofs4 psmouse serio_raw joydev ixgbe lp amd64_edac_mod
i2c_piix4 dca parport edac_core bnx2 power_meter dcdbas mdio edac_mce_amd ses
enclosure usbhid hid ahci mpt2sas libahci scsi_transport_sas megaraid_sas
raid_class

Pid: 27512, comm: flush-8:32 Tainted: G W 2.6.38.8 #1 Dell Inc. PowerEdge R815/04Y8PT
RIP: 0010:[<ffffffff810f8d71>] [<ffffffff810f8d71>] find_get_pages+0x61/0x110
RSP: 0018:ffff881fdee55800 EFLAGS: 00010246
RAX: ffff8814a66d7000 RBX: ffff881fdee558c0 RCX: 000000000000000e
RDX: 0000000000000005 RSI: 0000000000000001 RDI: 0000000000404034
RBP: ffff881fdee55850 R08: 0000000000000001 R09: 0000000000000002
R10: ffffea00a0ff7788 R11: ffff88129306ac88 R12: 0000000000031535
R13: 000000000000000e R14: ffff881fdee558e8 R15: 0000000000000005
FS: 00007fec9ce13720(0000) GS:ffff88181fc80000(0000) knlGS:00000000f744d6d0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000040403c CR3: 0000000001a03000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:32 (pid: 27512, threadinfo ffff881fdee54000, task ffff881fdf4adb80)
Stack:
0000000000000000 0000000000000000 0000000000000000 ffff8832e7edf6e0
0000000000000000 ffff881fdee558b0 ffffea008b443c18 0000000000031535
ffff8832e7edf590 ffff881fdee55d20 ffff881fdee55870 ffffffff81101f92
Call Trace:
[<ffffffff81101f92>] pagevec_lookup+0x22/0x30
[<ffffffffa033e00d>] xfs_cluster_write+0xad/0x180 [xfs]
[<ffffffffa033e4f4>] xfs_vm_writepage+0x414/0x4f0 [xfs]
[<ffffffff810ffb77>] __writepage+0x17/0x40
[<ffffffff81100d95>] write_cache_pages+0x1c5/0x4a0
[<ffffffff810ffb60>] ? __writepage+0x0/0x40
[<ffffffff81101094>] generic_writepages+0x24/0x30
[<ffffffffa033d5dd>] xfs_vm_writepages+0x5d/0x80 [xfs]
[<ffffffff811010c1>] do_writepages+0x21/0x40
[<ffffffff811730bf>] writeback_single_inode+0x9f/0x250
[<ffffffff8117370b>] writeback_sb_inodes+0xcb/0x170
[<ffffffff81174174>] writeback_inodes_wb+0xa4/0x170
[<ffffffff8117450b>] wb_writeback+0x2cb/0x440
[<ffffffff81035bb9>] ? default_spin_lock_flags+0x9/0x10
[<ffffffff8158b3af>] ? _raw_spin_lock_irqsave+0x2f/0x40
[<ffffffff811748ac>] wb_do_writeback+0x22c/0x280
[<ffffffff811749aa>] bdi_writeback_thread+0xaa/0x260
[<ffffffff81174900>] ? bdi_writeback_thread+0x0/0x260
[<ffffffff81081b76>] kthread+0x96/0xa0
[<ffffffff8100cda4>] kernel_thread_helper+0x4/0x10
[<ffffffff81081ae0>] ? kthread+0x0/0xa0
[<ffffffff8100cda0>] ? kernel_thread_helper+0x0/0x10
Code: 4e 1c 00 85 c0 89 c1 0f 84 a7 00 00 00 49 89 de 45 31 ff 31 d2 0f 1f 44
00 00 49 8b 06 48 8b 38 48 85 ff 74 3d 40 f6 c7 01 75 54 <44> 8b 47 08 4c 8d 57
08 45 85 c0 74 e5 45 8d 48 01 44 89 c0 f0
RIP [<ffffffff810f8d71>] find_get_pages+0x61/0x110
RSP <ffff881fdee55800>
CR2: 000000000040403c
---[ end trace 84193c2a431ae14b ]---
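
In case it helps whoever looks at this, here is a rough sanity check of the
register dump. It is only a sketch of my own reading of the oops, not a
diagnosis. It decodes the four bytes at the faulting RIP (the ones starting at
<44> in the Code: line) and checks that CR2 equals RDI plus the displacement.
The function names and the register table are made up for illustration; the
numeric values are copied from the trace above.

```python
# Values copied from the oops above.
CR2 = 0x000000000040403C   # faulting address
RDI = 0x0000000000404034   # RDI at the time of the fault
ERROR_CODE = 0x0000        # from the "Oops: 0000" line

# The four bytes at the faulting RIP, starting at "<44>" in the Code: line.
FAULTING_INSN = bytes.fromhex("448b4708")

# x86-64 general-purpose register numbering (illustrative lookup table).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_disp8(insn):
    """Decode 'REX 8b /r disp8' (MOV r32, [base + disp8]).

    Returns (dest_reg, base_reg, displacement) as register numbers and a
    signed displacement. Handles only this one encoding.
    """
    rex, opcode, modrm, disp = insn
    assert rex & 0xF0 == 0x40 and opcode == 0x8B, "not a REX-prefixed MOV load"
    assert modrm >> 6 == 0b01, "expected mod=01 (8-bit displacement)"
    assert modrm & 0b111 != 0b100, "SIB addressing is not handled here"
    dest = ((modrm >> 3) & 0b111) | ((rex & 0b100) << 1)  # REX.R extends reg
    base = (modrm & 0b111) | ((rex & 0b001) << 3)         # REX.B extends rm
    if disp >= 0x80:                                      # sign-extend disp8
        disp -= 0x100
    return dest, base, disp


def decode_fault_error(code):
    """Split the low bits of an x86 page-fault error code."""
    return {
        "protection_violation": bool(code & 0x1),  # 0 means page not present
        "write": bool(code & 0x2),                 # 0 means read access
        "user_mode": bool(code & 0x4),             # 0 means kernel mode
    }


if __name__ == "__main__":
    dest, base, disp = decode_mov_disp8(FAULTING_INSN)
    print("mov %#x(%%%s), %%%sd" % (disp, REGS[base], REGS[dest]))
    print("RDI + disp == CR2:", RDI + disp == CR2)
    print(decode_fault_error(ERROR_CODE))
```

If I read this right, the fault is a kernel-mode read of a non-present page,
8 bytes past whatever pointer was in RDI. 0x404034 does not look like a kernel
address at all, so find_get_pages seems to have picked up a bogus pointer
rather than a valid page. Whether that points at XFS, the driver, or something
else entirely, I cannot say.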
--
Anders Ossowicki
