Re: [PATCH v6 00/39] kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS

From: Qian Cai
Date: Thu Apr 28 2022 - 10:14:07 EST


On Mon, Jan 24, 2022 at 07:02:08PM +0100, andrey.konovalov@xxxxxxxxx wrote:
> From: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
>
> Hi,
>
> This patchset adds vmalloc tagging support for SW_TAGS and HW_TAGS
> KASAN modes.
>
> The tree with patches is available here:
>
> https://github.com/xairy/linux/tree/up-kasan-vmalloc-tags-v6
>
> About half of the patches are cleanups I went for along the way. None of
> them seem to be important enough to go through stable, so I decided
> not to split them out into separate patches/series.
>
> The patchset is partially based on an early version of the HW_TAGS
> patchset by Vincenzo that had vmalloc support. Thus, I added a
> Co-developed-by tag into a few patches.
>
> SW_TAGS vmalloc tagging support is straightforward. It reuses all of
> the generic KASAN machinery, but uses shadow memory to store tags
> instead of magic values. Naturally, vmalloc tagging requires adding
> a few kasan_reset_tag() annotations to the vmalloc code.
>
> HW_TAGS vmalloc tagging support stands out. HW_TAGS KASAN is based on
> Arm MTE, which can only assign tags to physical memory. As a result,
> HW_TAGS KASAN only tags vmalloc() allocations, which are backed by
> page_alloc memory. It ignores vmap() and others.

I could use some help here. Ever since this series, our system has been
triggering bad page state bugs from time to time. Any thoughts?

BUG: Bad page state in process systemd-udevd pfn:83ffffcd
page:fffffc20fdfff340 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcd
flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
raw: 0bfffc0000001000 fffffc20fdfff348 fffffc20fdfff348 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
CPU: 76 PID: 1873 Comm: systemd-udevd Not tainted 5.18.0-rc4-next-20220428-dirty #67
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
(inlined by) kasan_depopulate_vmalloc_pte at mm/kasan/shadow.c:361
apply_to_pte_range
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
(inlined by) kasan_release_vmalloc at mm/kasan/shadow.c:469
__purge_vmap_area_lazy
purge_vmap_area_lazy
alloc_vmap_area
__get_vm_area_node.constprop.0
__vmalloc_node_range
module_alloc
move_module
layout_and_allocate
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
Disabling lock debugging due to kernel taint
BUG: Bad page state in process systemd-udevd pfn:83ffffcc
page:fffffc20fdfff300 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x83ffffcc
flags: 0xbfffc0000001000(reserved|node=0|zone=2|lastcpupid=0xffff)
raw: 0bfffc0000001000 fffffc20fdfff308 fffffc20fdfff308 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
page_owner info is not present (never set?)
CPU: 76 PID: 1873 Comm: systemd-udevd Tainted: G B 5.18.0-rc4-next-20220428-dirty #67
Call trace:
dump_backtrace
show_stack
dump_stack_lvl
dump_stack
bad_page
free_pcp_prepare
free_unref_page
__free_pages
free_pages.part.0
free_pages
kasan_depopulate_vmalloc_pte
apply_to_pte_range
apply_to_pmd_range
apply_to_pud_range
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
__purge_vmap_area_lazy
purge_vmap_area_lazy
alloc_vmap_area
__get_vm_area_node.constprop.0
__vmalloc_node_range
module_alloc
move_module
layout_and_allocate
load_module
__do_sys_finit_module
__arm64_sys_finit_module
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
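
For anyone reading the dumps above by hand: both pages fail the
PAGE_FLAGS_CHECK_AT_FREE test with the same flags word, whose low bits
are 0x1000, printed as "reserved". A minimal sketch of that check follows.
It assumes bit 12 is the reserved bit, which is only inferred from how the
log decodes 0x1000; the macro and function names are hypothetical and are
not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: bit 12 of the flags word is the "reserved" page flag,
 * inferred from the dump printing 0x...1000 as "reserved".
 * The actual bit position depends on the kernel's enum pageflags.
 */
#define ASSUMED_RESERVED_BIT 12

/* Hypothetical helper: true if the assumed reserved bit is set. */
static bool flags_word_has_reserved(uint64_t flags)
{
	return ((flags >> ASSUMED_RESERVED_BIT) & 1) != 0;
}
```

Feeding it the value from either dump, 0x0bfffc0000001000, reports the
bit as set, which matches the "reserved" decoding in the log.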