On Wed, 2019-07-10 at 17:43 -0400, Qian Cai wrote:
> Running LTP oom01 test case with swap triggers a crash below. Reverting the
> series "Make deferred split shrinker memcg aware" [1] seems to fix the issue.

You might want to look harder at this commit, as reverting it alone on top of
5.2.0-next-20190711 fixed the issue.

aefde94195ca mm: thp: make deferred split shrinker memcg aware [1]

[1] https://lore.kernel.org/linux-mm/1561507361-59349-5-git-send-email-yang.shi@linux.alibaba.com/

list_del corruption. prev->next should be ffffea0022b10098, but was 0000000000000000
[  685.284254][ T3456] ------------[ cut here ]------------
[  685.289616][ T3456] kernel BUG at lib/list_debug.c:53!
[  685.294808][ T3456] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[  685.301998][ T3456] CPU: 5 PID: 3456 Comm: oom01 Tainted: G        W         5.2.0-next-20190711+ #3
[  685.311193][ T3456] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 06/24/2019
[  685.320485][ T3456] RIP: 0010:__list_del_entry_valid+0x8b/0xb6
[  685.326364][ T3456] Code: f1 e0 ff 49 8b 55 08 4c 39 e2 75 2c 5b b8 01 00 00 00 41 5c 41 5d 5d c3 4c 89 e2 48 89 de 48 c7 c7 c0 5a 73 a3 e8 d9 fa bc ff <0f> 0b 48 c7 c7 60 a0 e1 a3 e8 13 52 01 00 4c 89 e6 48 c7 c7 20 5b
[  685.345956][ T3456] RSP: 0018:ffff888e0c8a73c0 EFLAGS: 00010082
[  685.351920][ T3456] RAX: 0000000000000054 RBX: ffffea0022b10098 RCX: ffffffffa2d5d708
[  685.359807][ T3456] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff8888442bd380
[  685.367693][ T3456] RBP: ffff888e0c8a73d8 R08: ffffed1108857a71 R09: ffffed1108857a70
[  685.375577][ T3456] R10: ffffed1108857a70 R11: ffff8888442bd387 R12: 0000000000000000
[  685.383462][ T3456] R13: 0000000000000000 R14: ffffea0022b10034 R15: ffffea0022b10098
[  685.391348][ T3456] FS:  00007fbe26db4700(0000) GS:ffff888844280000(0000) knlGS:0000000000000000
[  685.400194][ T3456] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  685.406681][ T3456] CR2: 00007fbcabb3f000 CR3: 0000001012e44000 CR4: 00000000001406a0
[  685.414563][ T3456] Call Trace:
[  685.417736][ T3456]  deferred_split_scan+0x337/0x740
[  685.422741][ T3456]  ? split_huge_page_to_list+0xe10/0xe10
[  685.428272][ T3456]  ? __radix_tree_lookup+0x12d/0x1e0
[  685.433453][ T3456]  ? node_tag_get.part.0.constprop.6+0x40/0x40
[  685.439505][ T3456]  do_shrink_slab+0x244/0x5a0
[  685.444071][ T3456]  shrink_slab+0x253/0x440
[  685.448375][ T3456]  ? unregister_shrinker+0x110/0x110
[  685.453551][ T3456]  ? kasan_check_read+0x11/0x20
[  685.458291][ T3456]  ? mem_cgroup_protected+0x20f/0x260
[  685.463555][ T3456]  shrink_node+0x31e/0xa30
[  685.467858][ T3456]  ? shrink_node_memcg+0x1560/0x1560
[  685.473036][ T3456]  ? ktime_get+0x93/0x110
[  685.477250][ T3456]  do_try_to_free_pages+0x22f/0x820
[  685.482338][ T3456]  ? shrink_node+0xa30/0xa30
[  685.486815][ T3456]  ? kasan_check_read+0x11/0x20
[  685.491556][ T3456]  ? check_chain_key+0x1df/0x2e0
[  685.496383][ T3456]  try_to_free_pages+0x242/0x4d0
[  685.501209][ T3456]  ? do_try_to_free_pages+0x820/0x820
[  685.506476][ T3456]  __alloc_pages_nodemask+0x9ce/0x1bc0
[  685.511826][ T3456]  ? gfp_pfmemalloc_allowed+0xc0/0xc0
[  685.517089][ T3456]  ? kasan_check_read+0x11/0x20
[  685.521826][ T3456]  ? check_chain_key+0x1df/0x2e0
[  685.526657][ T3456]  ? do_anonymous_page+0x343/0xe30
[  685.531658][ T3456]  ? lock_downgrade+0x390/0x390
[  685.536399][ T3456]  ? get_kernel_page+0xa0/0xa0
[  685.541050][ T3456]  ? __lru_cache_add+0x108/0x160
[  685.545879][ T3456]  alloc_pages_vma+0x89/0x2c0
[  685.550444][ T3456]  do_anonymous_page+0x3e1/0xe30
[  685.555271][ T3456]  ? __update_load_avg_cfs_rq+0x2c/0x490
[  685.560796][ T3456]  ? finish_fault+0x120/0x120
[  685.565361][ T3456]  ? alloc_pages_vma+0x21e/0x2c0
[  685.570187][ T3456]  handle_pte_fault+0x457/0x12c0
[  685.575014][ T3456]  __handle_mm_fault+0x79a/0xa50
[  685.579841][ T3456]  ? vmf_insert_mixed_mkwrite+0x20/0x20
[  685.585280][ T3456]  ? kasan_check_read+0x11/0x20
[  685.590021][ T3456]  ? __count_memcg_events+0x8b/0x1c0
[  685.595196][ T3456]  handle_mm_fault+0x17f/0x370
[  685.599850][ T3456]  __do_page_fault+0x25b/0x5d0
[  685.604501][ T3456]  do_page_fault+0x4c/0x2cf
[  685.608892][ T3456]  ? page_fault+0x5/0x20
[  685.613019][ T3456]  page_fault+0x1b/0x20
[  685.617058][ T3456] RIP: 0033:0x410be0
[  685.620840][ T3456] Code: 89 de e8 e3 23 ff ff 48 83 f8 ff 0f 84 86 00 00 00 48 89 c5 41 83 fc 02 74 28 41 83 fc 03 74 62 e8 95 29 ff ff 31 d2 48 98 90 <c6> 44 15 00 07 48 01 c2 48 39 d3 7f f3 31 c0 5b 5d 41 5c c3 0f 1f
[  687.120156][ T3456] Shutting down cpus with NMI
[  687.124731][ T3456] Kernel Offset: 0x21800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  687.136389][ T3456] ---[ end Kernel panic - not syncing: Fatal exception ]---
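
For anyone reading the "list_del corruption" message above without the list
debugging code at hand, here is a rough userspace sketch of the kind of
consistency check that produces it. This is a simplified illustration, not the
kernel source: the function name list_del_entry_valid_sketch and the use of
fprintf are assumptions made for the example, and only the two neighbour
checks are modelled. In the log above the first check is the one that fails,
because prev->next was NULL instead of pointing back at the entry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal doubly linked list node, shaped like the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical userspace stand-in for the debug check run before unlinking
 * an entry: both neighbours must still point back at the entry.
 * Returns false (instead of triggering a BUG) when the list is inconsistent.
 */
static bool list_del_entry_valid_sketch(struct list_head *entry)
{
	struct list_head *prev = entry->prev;
	struct list_head *next = entry->next;

	if (prev->next != entry) {
		fprintf(stderr,
			"list_del corruption. prev->next should be %p, but was %p\n",
			(void *)entry, (void *)prev->next);
		return false;
	}
	if (next->prev != entry) {
		fprintf(stderr,
			"list_del corruption. next->prev should be %p, but was %p\n",
			(void *)entry, (void *)next->prev);
		return false;
	}
	return true;
}
```

Clearing a neighbour's next pointer while the entry is still linked, which is
roughly what a racing or double removal would leave behind, makes the sketch
report the same kind of message as the log.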