Re: [PATCH v9 02/10] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()

From: Jiri Olsa
Date: Sat Jan 13 2024 - 17:42:44 EST


On Thu, Dec 07, 2023 at 04:12:03PM +0000, Ryan Roberts wrote:
> In preparation for supporting anonymous multi-size THP, improve
> folio_add_new_anon_rmap() to allow a non-pmd-mappable, large folio to be
> passed to it. In this case, all contained pages are accounted using the
> order-0 folio (or base page) scheme.
>
> Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> Reviewed-by: Yin Fengwei <fengwei.yin@xxxxxxxxx>
> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
> Reviewed-by: Barry Song <v-songbaohua@xxxxxxxx>
> Tested-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> Tested-by: John Hubbard <jhubbard@xxxxxxxxxx>
> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> ---
> mm/rmap.c | 28 ++++++++++++++++++++--------
> 1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 2a1e45e6419f..846fc79f3ca9 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1335,32 +1335,44 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
> * This means the inc-and-test can be bypassed.
> * The folio does not have to be locked.
> *
> - * If the folio is large, it is accounted as a THP. As the folio
> + * If the folio is pmd-mappable, it is accounted as a THP. As the folio
> * is new, it's assumed to be mapped exclusively by a single process.
> */
> void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
> unsigned long address)
> {
> - int nr;
> + int nr = folio_nr_pages(folio);
>
> - VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> + VM_BUG_ON_VMA(address < vma->vm_start ||
> + address + (nr << PAGE_SHIFT) > vma->vm_end, vma);

hi,
I'm hitting this bug (console output below) with adding uprobe
on simple program like:

$ cat up.c
int main(void)
{
return 0;
}

# bpftrace -e 'uprobe:/home/jolsa/up:_start {}'

$ ./up

it's on top of current linus tree master:
052d534373b7 Merge tag 'exfat-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

before this patch it seems to work, I can send my .config if needed

thanks,
jirka


---
[ 147.562264][ T719] vma ffff888166134e68 start 0000000000401000 end 0000000000402000 mm ffff88817cf2a840
[ 147.562264][ T719] prot 25 anon_vma ffff88817b6818e0 vm_ops ffffffff83475ec0
[ 147.562264][ T719] pgoff 1 file ffff888168d01240 private_data 0000000000000000
[ 147.562264][ T719] flags: 0x75(read|exec|mayread|maywrite|mayexec)
[ 147.571660][ T719] ------------[ cut here ]------------
[ 147.572319][ T719] kernel BUG at mm/rmap.c:1412!
[ 147.572825][ T719] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 147.573792][ T719] CPU: 3 PID: 719 Comm: up Not tainted 6.7.0+ #273 faf755a6fc44b54f4ff1c207411fbd9df5a3968d
[ 147.574831][ T719] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
[ 147.575652][ T719] RIP: 0010:folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.576164][ T719] Code: c7 c6 20 d2 38 83 48 89 df e8 c0 4a fb ff 0f 0b 48 89 ef e8 16 ab 08 00 4c 3b 65 00 0f 83 cd fd ff ff 48 89 ef e8 34 44 fb ff f7 c3 ff 0f 00 00 0f 85 de fe ff ff be 08 00 00 00 48 89 df
[ 147.577609][ T719] RSP: 0018:ffff88815759f568 EFLAGS: 00010286
[ 147.578140][ T719] RAX: 00000000000000fa RBX: ffffea00053eef40 RCX: 0000000000000000
[ 147.578825][ T719] RDX: 0000000000000000 RSI: ffffffff81289b44 RDI: ffffffff872ff1a0
[ 147.579513][ T719] RBP: ffff888166134e68 R08: 0000000000000001 R09: ffffed102aeb3e5f
[ 147.580198][ T719] R10: ffff88815759f2ff R11: 0000000000000000 R12: 0000000000401020
[ 147.580886][ T719] R13: 0000000000000001 R14: ffffea00053eef40 R15: ffffea00053eef40
[ 147.581566][ T719] FS: 0000000000000000(0000) GS:ffff88842ce00000(0000) knlGS:0000000000000000
[ 147.582263][ T719] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 147.582724][ T719] CR2: 00005634f0ffe880 CR3: 000000010c0f8002 CR4: 0000000000770ef0
[ 147.583304][ T719] PKRU: 55555554
[ 147.583586][ T719] Call Trace:
[ 147.583869][ T719] <TASK>
[ 147.584122][ T719] ? die+0x32/0x80
[ 147.584422][ T719] ? do_trap+0x12f/0x220
[ 147.584800][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.585411][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.585891][ T719] ? do_error_trap+0xa7/0x160
[ 147.586349][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.586879][ T719] ? handle_invalid_op+0x2c/0x40
[ 147.587354][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.587892][ T719] ? exc_invalid_op+0x29/0x40
[ 147.588352][ T719] ? asm_exc_invalid_op+0x16/0x20
[ 147.588847][ T719] ? preempt_count_sub+0x14/0xc0
[ 147.589335][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.589899][ T719] ? folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.590437][ T719] __replace_page+0x364/0xb40
[ 147.590918][ T719] ? __pfx___replace_page+0x10/0x10
[ 147.591412][ T719] ? __pfx_lock_release+0x10/0x10
[ 147.591910][ T719] ? do_raw_spin_trylock+0xcd/0x120
[ 147.592555][ T719] ? __pfx_vma_alloc_folio+0x10/0x10
[ 147.593095][ T719] ? preempt_count_add+0x6e/0xc0
[ 147.593612][ T719] ? preempt_count_sub+0x14/0xc0
[ 147.594143][ T719] uprobe_write_opcode+0x3f6/0x820
[ 147.594616][ T719] ? __pfx_uprobe_write_opcode+0x10/0x10
[ 147.595125][ T719] ? preempt_count_sub+0x14/0xc0
[ 147.595551][ T719] ? up_write+0x125/0x2f0
[ 147.596014][ T719] install_breakpoint.isra.0+0xe5/0x470
[ 147.596635][ T719] uprobe_mmap+0x37b/0x8d0
[ 147.598111][ T719] ? __pfx_uprobe_mmap+0x10/0x10
[ 147.598561][ T719] mmap_region+0xa02/0x1220
[ 147.599013][ T719] ? rcu_is_watching+0x34/0x60
[ 147.599602][ T719] ? lock_acquired+0xbf/0x670
[ 147.600024][ T719] ? __pfx_mmap_region+0x10/0x10
[ 147.600458][ T719] ? security_mmap_addr+0x20/0x60
[ 147.600909][ T719] ? get_unmapped_area+0x169/0x1f0
[ 147.601353][ T719] do_mmap+0x425/0x660
[ 147.601739][ T719] vm_mmap_pgoff+0x15e/0x2b0
[ 147.602156][ T719] ? __pfx_vm_mmap_pgoff+0x10/0x10
[ 147.602597][ T719] ? __pfx_get_random_u64+0x10/0x10
[ 147.603059][ T719] elf_load+0xdc/0x3a0
[ 147.603433][ T719] load_elf_binary+0x6f6/0x22b0
[ 147.603889][ T719] ? __pfx_load_elf_binary+0x10/0x10
[ 147.604385][ T719] ? __pfx_lock_acquired+0x10/0x10
[ 147.604952][ T719] bprm_execve+0x494/0xc80
[ 147.605379][ T719] ? __pfx_bprm_execve+0x10/0x10
[ 147.605843][ T719] do_execveat_common.isra.0+0x24f/0x330
[ 147.606358][ T719] __x64_sys_execve+0x52/0x60
[ 147.606797][ T719] do_syscall_64+0x87/0x1b0
[ 147.607148][ T719] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 147.607630][ T719] RIP: 0033:0x7faa9b0bdb4b
[ 147.608732][ T719] Code: Unable to access opcode bytes at 0x7faa9b0bdb21.
[ 147.609318][ T719] RSP: 002b:00007ffff9921708 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 147.610046][ T719] RAX: ffffffffffffffda RBX: 00005634f1964990 RCX: 00007faa9b0bdb4b
[ 147.610727][ T719] RDX: 00005634f1966d20 RSI: 00005634f19612c0 RDI: 00005634f1964990
[ 147.611528][ T719] RBP: 00007ffff9921800 R08: 0000000000000001 R09: 0000000000000001
[ 147.612192][ T719] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
[ 147.612829][ T719] R13: 00005634f1964990 R14: 00005634f19612c0 R15: 00005634f1966d20
[ 147.613479][ T719] </TASK>
[ 147.613787][ T719] Modules linked in: intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel kvm_intel rapl iTiTCO_vendor_support i2c_i801 i2c_smbus lpc_ich drm loop drm_panel_orientation_quirks zram
[ 147.615630][ T719] ---[ end trace 0000000000000000 ]---
[ 147.616253][ T719] RIP: 0010:folio_add_new_anon_rmap+0x2cc/0x8f0
[ 147.616714][ T719] Code: c7 c6 20 d2 38 83 48 89 df e8 c0 4a fb ff 0f 0b 48 89 ef e8 16 ab 08 00 4c 3b 65 00 0f 83 cd fd ff ff 48 89 ef e8 34 44 fb ff f7 c3 ff 0f 00 00 0f 85 de fe ff ff be 08 00 00 00 48 89 df
[ 147.618160][ T719] RSP: 0018:ffff88815759f568 EFLAGS: 00010286
[ 147.618594][ T719] RAX: 00000000000000fa RBX: ffffea00053eef40 RCX: 0000000000000000
[ 147.619318][ T719] RDX: 0000000000000000 RSI: ffffffff81289b44 RDI: ffffffff872ff1a0
[ 147.619930][ T719] RBP: ffff888166134e68 R08: 0000000000000001 R09: ffffed102aeb3e5f
[ 147.620577][ T719] R10: ffff88815759f2ff R11: 0000000000000000 R12: 0000000000401020
[ 147.621236][ T719] R13: 0000000000000001 R14: ffffea00053eef40 R15: ffffea00053eef40
[ 147.621894][ T719] FS: 0000000000000000(0000) GS:ffff88842ce00000(0000) knlGS:0000000000000000
[ 147.622596][ T719] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 147.623186][ T719] CR2: 00007faa9b0bdb21 CR3: 000000010c0f8002 CR4: 0000000000770ef0
[ 147.623960][ T719] PKRU: 55555554
[ 147.624331][ T719] note: up[719] exited with preempt_count 1
[ 147.624953][ T719] BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49
[ 147.625898][ T719] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 719, name: up
[ 147.626672][ T719] preempt_count: 0, expected: 0
[ 147.627945][ T719] RCU nest depth: 1, expected: 0
[ 147.628410][ T719] INFO: lockdep is turned off.
[ 147.628898][ T719] CPU: 3 PID: 719 Comm: up Tainted: G D 6.7.0+ #273 faf755a6fc44b54f4ff1c207411fbd9df5a3968d
[ 147.629954][ T719] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
[ 147.630838][ T719] Call Trace:
[ 147.631185][ T719] <TASK>
[ 147.631514][ T719] dump_stack_lvl+0x15d/0x180
[ 147.631973][ T719] __might_resched+0x270/0x3b0
[ 147.636533][ T719] exit_signals+0x1d/0x460
[ 147.636947][ T719] do_exit+0x27f/0x13b0
[ 147.637368][ T719] ? __pfx__printk+0x10/0x10
[ 147.637827][ T719] ? __pfx_do_exit+0x10/0x10
[ 147.638238][ T719] make_task_dead+0xd9/0x240
[ 147.638610][ T719] rewind_stack_and_make_dead+0x17/0x20
[ 147.639064][ T719] RIP: 0033:0x7faa9b0bdb4b
[ 147.639445][ T719] Code: Unable to access opcode bytes at 0x7faa9b0bdb21.
[ 147.640015][ T719] RSP: 002b:00007ffff9921708 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 147.640694][ T719] RAX: ffffffffffffffda RBX: 00005634f1964990 RCX: 00007faa9b0bdb4b
[ 147.641407][ T719] RDX: 00005634f1966d20 RSI: 00005634f19612c0 RDI: 00005634f1964990
[ 147.642133][ T719] RBP: 00007ffff9921800 R08: 0000000000000001 R09: 0000000000000001
[ 147.642911][ T719] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
[ 147.643685][ T719] R13: 00005634f1964990 R14: 00005634f19612c0 R15: 00005634f1966d20
[ 147.644454][ T719] </TASK>
[ 147.644819][ T719] ------------[ cut here ]------------