Re: [PATCH 5.10] kprobes/x86: Fix kprobe debug exception handling logic

From: Greg KH
Date: Mon Jul 03 2023 - 14:34:56 EST


On Sat, Jul 01, 2023 at 04:43:46PM +0800, Li Huafei wrote:
>
>
> On 2023/6/30 13:21, Greg KH wrote:
> > On Fri, Jun 30, 2023 at 10:08:45AM +0800, Li Huafei wrote:
> >> We get the following crash caused by a null pointer access:
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000000
> >> ...
> >> RIP: 0010:resume_execution+0x35/0x190
> >> ...
> >> Call Trace:
> >> <#DB>
> >> kprobe_debug_handler+0x41/0xd0
> >> exc_debug+0xe5/0x1b0
> >> asm_exc_debug+0x19/0x30
> >> RIP: 0010:copy_from_kernel_nofault.part.0+0x55/0xc0
> >> ...
> >> </#DB>
> >> process_fetch_insn+0xfb/0x720
> >> kprobe_trace_func+0x199/0x2c0
> >> ? kernel_clone+0x5/0x2f0
> >> kprobe_dispatcher+0x3d/0x60
> >> aggr_pre_handler+0x40/0x80
> >> ? kernel_clone+0x1/0x2f0
> >> kprobe_ftrace_handler+0x82/0xf0
> >> ? __se_sys_clone+0x65/0x90
> >> ftrace_ops_assist_func+0x86/0x110
> >> ? rcu_nocb_try_bypass+0x1f3/0x370
> >> 0xffffffffc07e60c8
> >> ? kernel_clone+0x1/0x2f0
> >> kernel_clone+0x5/0x2f0
> >>
> >> The analysis reveals that kprobe and hardware breakpoints conflict in
> >> the use of debug exceptions.
> >>
> >> If we set a hardware breakpoint on a memory address and also have a
> >> kprobe event to fetch the memory at this address. Then when kprobe
> >> triggers, it goes to read the memory and triggers hardware breakpoint
> >> monitoring. This time, since kprobe handles debug exceptions earlier
> >> than hardware breakpoints, it will cause kprobe to incorrectly assume
> >> that the exception is a kprobe trigger.
> >>
> >> Notice that after the mainline commit 6256e668b7af ("x86/kprobes: Use
> >> int3 instead of debug trap for single-step"), kprobe no longer uses
> >> debug trap, avoiding the conflict with hardware breakpoints here. This
> >> commit is to remove the IRET that returns to kernel, not to fix the
> >> problem we have here. Also there are a bunch of merge conflicts when
> >> trying to apply this commit to older kernels, so fixing it directly in
> >> older kernels is probably a better option.
> >
> > What is the list of commits that it would take to resolve this in these
> > kernels? We would almost always prefer to do that instead of taking
> > changes that are not upstream.
>
> I have sorted out that for 5.10 there are 9 patches that need to be
> backported:
>
> #9 8924779df820 ("x86/kprobes: Fix JNG/JNLE emulation")
> #8 dec8784c9088 ("x86/kprobes: Update kcb status flag after singlestepping")
> #7 2304d14db659 ("x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration")
> #6 2f706e0e5e26 ("x86/kprobes: Fix to identify indirect jmp and others using range case")
> #5 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
> #4 a194acd316f9 ("x86/kprobes: Identify far indirect JMP correctly")
> #3 d60ad3d46f1d ("x86/kprobes: Retrieve correct opcode for group instruction")
> #2 abd82e533d88 ("x86/kprobes: Do not decode opcode in resume_execution()")
> #1 e689b300c99c ("kprobes/x86: Fix fall-through warnings for Clang e689b300c99c")
>
> The main one we need to backport is patch 5, patche 1-6 are pre-patches,
> and patche 6-9 are fix patches for patch 5. The major modifications are
> patch 2 and patch 4. Patch 2 optimizes resume_execution() to avoid
> repeated instruction decoding, and patch 5 uses int3 instead of debug
> trap, and as Masami said in the commit message this patch will change
> some behavior of kprobe, but it has almost no effect on the actual
> usage.
>
> I'm not sure backport these patches are acceptable, do I need to send
> them out for review?

Yes, please make up the patch series for these, that's not all that bad,
and looks like it is more "correct" than just your one-off patch.

thanks,

greg k-h