Re: Fwd: kernel bug when performing heavy IO operations

From: dianlujitao
Date: Wed Sep 27 2023 - 01:47:29 EST


Hello, I got some logs with 6.5.4 kernel from the official linux package of Arch, no zen patches this time. Full dmesg is uploaded to https://fars.ee/F1yM and below is a small snippet for your convenience, from which PG_offline is no longer set:

[177850.039441] BUG: Bad page map in process ld.lld pte:8000000edacc4025 pmd:147f96067
[177850.039454] page:000000007415dd6c refcount:22 mapcount:-237 mapping:00000000b0c37ca6 index:0x1075 pfn:0xedacc4
[177850.039460] memcg:ffff9289345d4000
[177850.039463] aops:btrfs_aops [btrfs] ino:fb2b838 dentry name:"lld"
[177850.039592] flags: 0xaffff9800002056(referenced|uptodate|lru|workingset|private|node=1|zone=2|lastcpupid=0xffff)
[177850.039597] page_type: 0xffffff12(buddy|0x6d)
[177850.039602] raw: 0affff9800002056 ffffe7623b6b3148 ffffe7623b6b3088 ffff928202a9fc10
[177850.039605] raw: 0000000000001075 0000000000000001 00000016ffffff12 ffff9289345d4000
[177850.039607] page dumped because: bad pte
[177850.039608] addr:0000000001275000 vm_flags:08000071 anon_vma:0000000000000000 mapping:ffff928202a9fc10 index:1075
[177850.039612] file:lld fault:filemap_fault mmap:btrfs_file_mmap [btrfs] read_folio:btrfs_read_folio [btrfs]
[177850.039846] CPU: 40 PID: 2060138 Comm: ld.lld Tainted: G           OE      6.5.4-arch2-1 #1 a30a3b4701899b64bf6025fd97642e50bf2dcad4
[177850.039851] Hardware name: JGINYUE X99-8D3/2.5G Server/X99-8D3/2.5G Server, BIOS 5.11 06/30/2022
[177850.039853] Call Trace:
[177850.039857]  <TASK>
[177850.039864]  dump_stack_lvl+0x47/0x60
[177850.039871]  print_bad_pte+0x1bc/0x280
[177850.039879]  ? page_remove_rmap+0x8d/0x260
[177850.039885]  unmap_page_range+0xa96/0x1150
[177850.039894]  unmap_vmas+0xf8/0x190
[177850.039900]  exit_mmap+0xe4/0x310
[177850.039909]  __mmput+0x3e/0x130
[177850.039916]  do_exit+0x31c/0xb20
[177850.039920]  ? futex_wait_queue+0x63/0x90
[177850.039927]  do_group_exit+0x31/0x80
[177850.039932]  get_signal+0x9a5/0x9e0
[177850.039941]  arch_do_signal_or_restart+0x3e/0x270
[177850.039947]  exit_to_user_mode_prepare+0x185/0x1e0
[177850.039955]  syscall_exit_to_user_mode+0x1b/0x40
[177850.039962]  do_syscall_64+0x6c/0x90
[177850.039969]  ? do_futex+0x128/0x190
[177850.039973]  ? __x64_sys_futex+0x129/0x1e0
[177850.039977]  ? switch_fpu_return+0x50/0xe0
[177850.039986]  ? syscall_exit_to_user_mode+0x2b/0x40
[177850.039991]  ? do_syscall_64+0x6c/0x90
[177850.039996]  ? exc_page_fault+0x7f/0x180
[177850.040002]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[177850.040009] RIP: 0033:0x7f1851e164ae
[177850.040056] Code: Unable to access opcode bytes at 0x7f1851e16484.
[177850.040058] RSP: 002b:00007f18227fbd30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[177850.040063] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f1851e164ae
[177850.040066] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 0000000005818134
[177850.040068] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
[177850.040070] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[177850.040072] R13: 00000000058180e0 R14: 0000000000000000 R15: 0000000005818134
[177850.040078]  </TASK>
[177850.040112] Disabling lock debugging due to kernel taint

在 2023/8/27 19:54, Matthew Wilcox 写道:
On Sun, Aug 27, 2023 at 12:34:54PM +0800, dianlujitao wrote:
在 2023/8/27 11:45, Matthew Wilcox 写道:
On Sun, Aug 27, 2023 at 10:20:51AM +0700, Bagas Sanjaya wrote:
When the IO load is heavy (compiling AOSP in my case), there's a chance to crash the kernel, the only way to recover is to perform a hard reset. Logs look like follows:

8月 25 13:52:23 arch-pc kernel: BUG: Bad page map in process tmux: client pte:8000000462500025 pmd:b99c98067
8月 25 13:52:23 arch-pc kernel: page:00000000460fa108 refcount:4 mapcount:-256 mapping:00000000612a1864 index:0x16 pfn:0x462500
8月 25 13:52:23 arch-pc kernel: memcg:ffff8a1056ed0000
8月 25 13:52:23 arch-pc kernel: aops:btrfs_aops [btrfs] ino:9c4635 dentry name:"locale-archive"
8月 25 13:52:23 arch-pc kernel: flags: 0x2ffff5800002056(referenced|uptodate|lru|workingset|private|node=0|zone=2|lastcpupid=0xffff)
8月 25 13:52:23 arch-pc kernel: page_type: 0xfffffeff(offline)
This is interesting. PG_offline is set.

$ git grep SetPageOffline
arch/powerpc/platforms/powernv/memtrace.c: __SetPageOffline(pfn_to_page(pfn));
drivers/hv/hv_balloon.c: __SetPageOffline(pg);
drivers/hv/hv_balloon.c: __SetPageOffline(pg + j);
drivers/misc/vmw_balloon.c: __SetPageOffline(page + i);
drivers/virtio/virtio_mem.c: __SetPageOffline(page);
drivers/xen/balloon.c: __SetPageOffline(page);
include/linux/balloon_compaction.h: __SetPageOffline(page);
include/linux/balloon_compaction.h: __SetPageOffline(page);

But there's no indication that this kernel is running under a
hypervisor:

8月 25 13:52:23 arch-pc kernel: Hardware name: JGINYUE X99-8D3/2.5G Server/X99-8D3/2.5G Server, BIOS 5.11 06/30/2022
Yes, I'm running on bare metal hardware.
So I'd agree with Artem, this looks like bad RAM.

I ran memtest86+ 6.20 for a cycle and it passed. However, could an OOM
trigger the bug? e.g., kernel bug fired before the OOM killer has a
chance to start? Just a guess because the last log entry in journalctl
before "BUG" is an hour earlier.
The problem is that OOM doesn't SetPageOffline. The only things that
do are hypervisor guest drivers. So we've got a random bit being
cleared, and either that's a stray write which happens to land in
the struct page in question, or it's bad hardware. Since it's a
single bit that's being cleared, bad hardware is the most likely
explanation, but it's not impossible for there to be a bug that's
doing this. The problem is that it could be almost anything ...