Re: [LTP] [BUG] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c53df

From: Eryu Guan
Date: Tue Jun 06 2017 - 23:27:45 EST


On Tue, Jun 06, 2017 at 06:00:34PM +0800, Li Wang wrote:
> Hi,
>
> ltp/access04 always panics the latest mainline kernel (4.12-rc4) on
> ppc64le. From the call trace, I guess the reason is probably that the
> test mounts an ext2 file system using the ext4 driver.
>
> A simple way to reproduce:
>
> # dd of=wangli if=/dev/zero count=1024 bs=1024
> # mkfs -t ext2 wangli
> # mount -t ext4 wangli /mnt/

I can't reproduce this crash on a ppc64le host, either with your
reproducer or with the ltp access04 test.

>
>
> Are there any new changes in ext4 (on kernel-4.12-rc4) recently?

I don't think it's an ext4 bug. I've seen similar crashes twice while
testing the 4.12-rc4 kernel: once running fstests on XFS, and once
running ltp on ext3. It doesn't seem to be related to filesystem code.

[ 828.119270] run fstests generic/034 at 2017-06-06 19:16:10
[ 828.720341] XFS (sda5): Unmounting Filesystem
[ 828.814003] device-mapper: uevent: version 1.0.3
[ 828.814096] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c5e7f
[ 828.814103] Faulting instruction address: 0xc0000000004d214c
[ 828.814109] Oops: Kernel access of bad area, sig: 7 [#1]
[ 828.814113] SMP NR_CPUS=2048
[ 828.814114] NUMA
[ 828.814117] pSeries
[ 828.814122] Modules linked in: dm_mod(+) sg pseries_rng ghash_generic gf128mul xts vmx_crypto nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp
[ 828.814150] CPU: 10 PID: 137772 Comm: modprobe Not tainted 4.12.0-rc4 #1
[ 828.814155] task: c0000003fe13c800 task.stack: c00000046ec68000
[ 828.814163] NIP: c0000000004d214c LR: c00000000011c884 CTR: c000000000130900
[ 828.814168] REGS: c00000046ec6b3d0 TRAP: 0600 Not tainted (4.12.0-rc4)
[ 828.814173] MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>
[ 828.814184] CR: 28228244 XER: 00000005
[ 828.814191] CFAR: c00000000011c880 DAR: c0000001c52c5e7f DSISR: 00000000 SOFTE: 0
[ 828.814191] GPR00: c00000000011c848 c00000046ec6b650 c000000001049100 c0000003f3b77020
[ 828.814191] GPR04: c0000003f3b77020 c0000001c52c5e7f 0000000000000000 0000000000000001
[ 828.814191] GPR08: 0008f92d89943c42 00000024000048b7 0000000000000008 0000000000000000
[ 828.814191] GPR12: c000000000130900 c00000000fac6900 d000000007dd3908 d000000007dd3908
[ 828.814191] GPR16: c00000046ec6bdec c00000046ec6bda0 000000000000ff20 0000000000000000
[ 828.814191] GPR20: 00000000000052f8 0000000000000000 0000000000004000 c000000000cc5780
[ 828.814191] GPR24: 00000001c45ffc5f 0000000000000000 00000001c45ffc5f c00000000107dd00
[ 828.814191] GPR28: c0000003f3b77834 0000000000000004 0000000000000800 c0000003f3b77000
[ 828.814257] NIP [c0000000004d214c] llist_add_batch+0xc/0x40
[ 828.814263] LR [c00000000011c884] try_to_wake_up+0x4a4/0x5b0
[ 828.814268] Call Trace:
[ 828.814273] [c00000046ec6b650] [c00000000011c848] try_to_wake_up+0x468/0x5b0 (unreliable)
[ 828.814282] [c00000046ec6b6d0] [c000000000102828] create_worker+0x148/0x250
[ 828.814290] [c00000046ec6b770] [c0000000001059dc] alloc_unbound_pwq+0x3bc/0x4c0
[ 828.814296] [c00000046ec6b7d0] [c00000000010601c] apply_wqattrs_prepare+0x2ac/0x320
[ 828.814304] [c00000046ec6b840] [c0000000001060cc] apply_workqueue_attrs_locked+0x3c/0xa0
[ 828.814313] [c00000046ec6b870] [c00000000010662c] apply_workqueue_attrs+0x4c/0x80
[ 828.814322] [c00000046ec6b8b0] [c0000000001081cc] __alloc_workqueue_key+0x16c/0x4e0
[ 828.814343] [c00000046ec6b970] [d000000007e04748] local_init+0xdc/0x1a4 [dm_mod]
[ 828.814362] [c00000046ec6b9f0] [d000000007e04854] dm_init+0x44/0xc4 [dm_mod]
[ 828.814375] [c00000046ec6ba30] [c00000000000ccf0] do_one_initcall+0x60/0x1c0
[ 828.814390] [c00000046ec6baf0] [c00000000091e748] do_init_module+0x8c/0x244
[ 828.814405] [c00000046ec6bb80] [c000000000197e08] load_module+0x12f8/0x1600
[ 828.814414] [c00000046ec6bd30] [c000000000198388] SyS_finit_module+0xa8/0x110
[ 828.814424] [c00000046ec6be30] [c00000000000af84] system_call+0x38/0xe0
[ 828.814429] Instruction dump:
[ 828.814436] 60420000 38600000 4e800020 60000000 60420000 7c832378 4e800020 60000000
[ 828.814448] 60000000 e9250000 f9240000 7c0004ac <7d4028a8> 7c2a4800 40c20010 7c6029ad
[ 828.814466] ---[ end trace 87ec4ff1fa8e1a3d ]---
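
A quick sanity check on the oops above, as a rough sketch rather than a
diagnosis: TRAP 0600 is the alignment interrupt, DAR matches GPR05, and
the marked instruction <7d4028a8> appears to decode as ldarx r10,0,r5,
which expects a doubleword-aligned (8-byte) address. The helper below
(is_aligned is a hypothetical name, not an existing tool) checks the
low bits of the faulting address. It assumes the size is a power of two
no larger than 16, so only the last hex digit matters.

```shell
#!/bin/sh
# is_aligned ADDR SIZE: succeed if hex address ADDR is a multiple of SIZE.
# Assumption: SIZE is a power of two <= 16, so only the final hex digit
# of ADDR needs to be inspected (avoids 64-bit overflow in shell math).
is_aligned() {
    addr=${1#0x}                               # drop optional 0x prefix
    size=$2
    last=$(printf '%s' "$addr" | tail -c 1)    # final hex digit
    [ $(( 0x$last & (size - 1) )) -eq 0 ]
}

# DAR / GPR05 from the oops above
if is_aligned 0xc0000001c52c5e7f 8; then
    echo "aligned"
else
    echo "unaligned"
fi
```

The address in the subject line of this thread (0xc0000001c52c53df)
also ends in 0xf, so the same check reports it as unaligned.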

I suspect it's a regression introduced in the 4.12-rc4 kernel, as I
didn't see such crashes when testing 4.12-rc3. I'll bisect once I've
worked out a reliable reproducer (unless you can reliably reproduce it
with your reproducer :).
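
For the bisect step, a minimal sketch of how each tested kernel could
be classified from its log. The classify_log name and the choice of
signature string are assumptions made for illustration; the signature
is taken from the oops message quoted above, and a real bisect would
still need a reliable reproducer to trigger it.

```shell
#!/bin/sh
# Hypothetical helper for a bisect between 4.12-rc3 (good) and
# 4.12-rc4 (bad). Reads a kernel log on stdin and prints "bad" if it
# contains the unaligned-access oops signature, "good" otherwise.
SIGNATURE='Unable to handle kernel paging request for unaligned access'

classify_log() {
    if grep -qF "$SIGNATURE"; then
        echo bad
    else
        echo good
    fi
}

# Example: classify a one-line log fragment from the oops above.
printf '%s\n' '[ 828.814096] Unable to handle kernel paging request for unaligned access at address 0xc0000001c52c5e7f' | classify_log
```

After "git bisect start v4.12-rc4 v4.12-rc3", each candidate kernel
would be built, booted and exercised with the reproducer, and the
result of "dmesg | classify_log" fed to "git bisect good" or
"git bisect bad" by hand.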

Thanks,
Eryu