Re: qemu-x86_64 booting with 8.0.0 stil see int3: when running LTP tracing testing.

From: Peter Zijlstra
Date: Wed Jun 21 2023 - 12:07:09 EST


On Wed, Jun 21, 2023 at 05:31:15PM +0200, Arnd Bergmann wrote:

> I don't know much about x86 exception handling, but my guess is
> that this is where the stack overflows, so this backtrace
> is not all that useful. Looking at the full log from your link,
> I see that recursion through asm_exc_int3:
>
> <4>[ 49.886694] ? exc_int3+0x62/0x80
> <4>[ 49.886714] ? asm_exc_int3+0x3e/0x50
> <4>[ 49.886759] ? preempt_count_sub+0x5/0x80
> <4>[ 49.886783] ? preempt_count_sub+0x5/0x80
> <4>[ 49.886805] ? irq_work_queue+0x40/0x80
> <4>[ 49.886826] ? defer_console_output+0x49/0x80
> <4>[ 49.886840] ? vprintk+0x42/0x60
> <4>[ 49.886857] ? _printk+0x5d/0x80
> <4>[ 49.886891] ? die+0x9c/0xe0
> <4>[ 49.886922] ? exc_int3+0x62/0x80
> <4>[ 49.886942] ? asm_exc_int3+0x3e/0x50

So this is exc_int3(), but it is past poke_int3_handler() which is the
counter-part of text_poke_bp_batch() -- die() seems to suggest we're
even past do_int3().

The fact that it continued past poke_int3_handler() suggests that it is
a 'stray' int3, not covered by an active text modification site.

> <4>[ 49.886966] ? __pfx_tick_sched_timer+0x10/0x10
> <4>[ 49.886980] ? __pfx_read_tsc+0x10/0x10
> <4>[ 49.887010] ? preempt_count_sub+0x5/0x80
> <4>[ 49.887035] ? preempt_count_sub+0x5/0x80
> <4>[ 49.887057] ? __hrtimer_run_queues+0xee/0x330
> <4>[ 49.887070] ? _raw_spin_unlock_irqrestore+0x28/0x50
> <4>[ 49.887085] ? __hrtimer_run_queues+0xee/0x330
> <4>[ 49.887136] ? hrtimer_interrupt+0xf6/0x390
> <4>[ 49.887167] ? __sysvec_apic_timer_interrupt+0x64/0x1a0
> <4>[ 49.887189] ? sysvec_apic_timer_interrupt+0x7a/0x90
> <4>[ 49.887206] </IRQ>
> <4>[ 49.887219] <TASK>
> <4>[ 49.887231] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
> <4>[ 49.887252] ? sched_rt_period_timer+0x4/0x390
> <4>[ 49.887272] ? __pfx_do_sync_core+0x10/0x10
> <4>[ 49.887305] ? insn_get_displacement+0x9/0x160
> <4>[ 49.887329] ? insn_get_displacement+0x9/0x160
> <4>[ 49.887344] ? insn_get_immediate+0xd2/0x270
> <4>[ 49.887365] ? insn_decode+0x113/0x150
> <4>[ 49.887384] ? text_poke_loc_init+0xea/0x220
> <4>[ 49.887429] ? sched_rt_period_timer+0x4/0x390
> <4>[ 49.887451] ? text_poke_queue+0x89/0xa0

This is the text_poke_queue() stage, it is still collecting sites to
poke but hasn't actually started yet. The actual poking happens in
text_poke_bp_batch(), which isn't in the call-chain afaict.

> <4>[ 49.887477] ? ftrace_replace_code+0x149/0x1f0
> <4>[ 49.887508] ? ftrace_modify_all_code+0x71/0x140
> <4>[ 49.887533] ? arch_ftrace_update_code+0xd/0x20
> <4>[ 49.887550] ? ftrace_shutdown+0xf5/0x220
> <4>[ 49.887577] ? unregister_ftrace_function+0x2e/0x150
> <4>[ 49.887608] ? stack_trace_sysctl+0x82/0xb0
> <4>[ 49.887636] ? proc_sys_call_handler+0x18b/0x280
> <4>[ 49.887675] ? proc_sys_write+0x17/0x20
> <4>[ 49.887692] ? vfs_write+0x324/0x3f0
> <4>[ 49.887742] ? ksys_write+0x75/0xe0
> <4>[ 49.887771] ? __x64_sys_write+0x1f/0x30
> <4>[ 49.887787] ? do_syscall_64+0x48/0xa0
> <4>[ 49.887802] ? sysvec_apic_timer_interrupt+0x4d/0x90
> <4>[ 49.887822] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> <4>[ 49.887878] </TASK>