Re: Did we really need to clear the IF flag at prepare_singlestep() of x86 kprobes?

From: Dongdong Deng
Date: Thu Jan 14 2010 - 01:45:54 EST


On Wed, Jan 13, 2010 at 2:18 PM, Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> Dongdong Deng wrote:
>> Hi Kprobe experts,
>>
>> I have a doubt about the handling "X86_EFLAGS_IF" at prepare_singlestep(),
>> Could you give me some suggestions?
>>
>>
>> arch/x86/kernel/kprobes.c:
>> 406 static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
>> 407 {
>> 408     clear_btf();
>> 409     regs->flags |= X86_EFLAGS_TF;
>> 410     regs->flags &= ~X86_EFLAGS_IF;
>>     ...
>> }
>>
>>
>> Line 410: kprobes intends to disable interrupts during the single step.
>>
>> I think that just setting X86_EFLAGS_TF is enough, for the following reasons.
>>
>>
>> ******************
>> Reason 1: "debug trap" was initialized as an interrupt gate
>>
>> arch/x86/kernel/traps.c:892: set_intr_gate_ist(1, &debug, DEBUG_STACK);
>>
>> The "debug trap" was initialized as an interrupt gate, so X86_EFLAGS_IF
>> is cleared automatically while the debug exception handler runs.
>>
>>
>> ******************
>> Reason 2: the priority among debug exceptions and interrupts
>>
>> Intel 64 and IA-32 Architectures Software Developer's Manual Volume
>> 3A, page 5-11:
>>
>> If more than one exception or interrupt is pending at an instruction
>> boundary, the processor services them in a predictable order. Table 5-2
>> shows the priority among classes of exception and interrupt sources.
>>
>> Table 5-2. Priority Among Simultaneous Exceptions and Interrupts
>>
>> Priority      Description
>> 1 (Highest)   Hardware Reset and Machine Checks
>>                 - RESET
>>                 - Machine Check
>> 2             Trap on Task Switch
>>                 - T flag in TSS is set
>> 3             External Hardware Interventions
>>                 - FLUSH
>>                 - STOPCLK
>>                 - SMI
>>                 - INIT
>> 4             Traps on the Previous Instruction
>>                 - Breakpoints
>>                 - Debug Trap Exceptions (TF flag set or data/I-O breakpoint)
>> 5             Nonmaskable Interrupts (NMI)
>> 6             Maskable Hardware Interrupts
>>
>>
>> From the table we can see that debug exceptions are at priority 4 and
>> external interrupts are at priority 6.
>>
>> Therefore the processor will handle Debug Trap Exceptions first, and then
>> handle the external interrupt.
>
> Hi Dongdong,
>
> Hmm, can that be applied to other x86-compatible CPUs too?
> And when does the debug trap exception actually happen?

> 1: int3 ->
> 2:  -> pre_kprobe_handler
> 3:  -> prepare_singlestep
> 4: <- iret
> 5: execute instruction
> 6: debug trap ->
> 7: -> post_kprobe_handler
> ...
>
> If we have an interrupt before step 4, is that interrupt
> really serviced *after* step 5, or after step 4?


Hi Masami,

Thanks for your detailed explanation; it is the key to my question. :)

I wrote a test case to check it.

The test case has to run on a uniprocessor system.

My machine is a dual Intel Xeon, so I disabled SMP support when
building the kernel.


The test case works as follows:

1: Delay for a long time in the INT3 handler of kprobes.

2: Add a printk to the network driver's interrupt handler (I am using
an e1000e network card).

3: Boot the system.

4: Ping this machine continuously from another PC, so that a network
card interrupt is raised during the INT3 handler.

5: insmod samples/kprobes/kprobe_example.ko.

6: Use the following script to trigger the kprobe:

#!/bin/bash
a=0 ; while [ $a != 8000 ]; do(ls ./); a=$(( $a + 1 )); done


Test output result:

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5138 @ 2.13GHz
stepping : 11
cpu MHz : 2133.324
cache size : 4096 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni
monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips : 4266.64
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

# insmod kprobe_example.ko
Planted kprobe at ffffffff8022df60

# /bin/bash 1.sh
pre_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df61, flags = 0x246
prepare_singlestep didn't clear X86_EFLAGS_IF
Got a e1000 intrrupt during kprobe single step!!!!
post_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df62, flags = 0x246
pre_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df61, flags = 0x246
prepare_singlestep didn't clear X86_EFLAGS_IF
Got a e1000 intrrupt during kprobe single step!!!!
post_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df62, flags = 0x246
1.sh kprobe_example.ko
pre_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df61, flags = 0x246
prepare_singlestep didn't clear X86_EFLAGS_IF
Got a e1000 intrrupt during kprobe single step!!!!
post_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df62, flags = 0x246
1.sh kprobe_example.ko
pre_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df61, flags = 0x246
prepare_singlestep didn't clear X86_EFLAGS_IF
Got a e1000 intrrupt during kprobe single step!!!!
post_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df62, flags = 0x246
1.sh kprobe_example.ko
pre_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df61, flags = 0x246
prepare_singlestep didn't clear X86_EFLAGS_IF
Got a e1000 intrrupt during kprobe single step!!!!
post_handler: p->addr = 0xffffffff8022df60, ip = ffffffff8022df62, flags = 0x246
1.sh kprobe_example.ko
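
For reference, here is a small user-space sketch of what the flags values in the log mean. It is not kernel code: the two constants are redefined locally with the architectural bit values of TF (bit 8) and IF (bit 9), and the helper names are made up for illustration. It shows that 0x246 has IF set, and compares what the original line 410 and the test patch would each do to that value.

```c
#include <stdbool.h>

/* Architectural EFLAGS bits, redefined here so the sketch stands alone
 * (same values as the kernel's X86_EFLAGS_TF and X86_EFLAGS_IF). */
#define EFLAGS_TF 0x00000100UL /* trap flag: single-step */
#define EFLAGS_IF 0x00000200UL /* interrupt enable flag */

/* Hypothetical helper: true if maskable interrupts are enabled in flags. */
static bool irqs_enabled(unsigned long flags)
{
    return (flags & EFLAGS_IF) != 0;
}

/* What the unmodified prepare_singlestep() does to regs->flags. */
static unsigned long singlestep_flags_original(unsigned long flags)
{
    flags |= EFLAGS_TF;
    flags &= ~EFLAGS_IF;
    return flags;
}

/* What the test patch does: set TF only and leave IF untouched. */
static unsigned long singlestep_flags_patched(unsigned long flags)
{
    return flags | EFLAGS_TF;
}
```

With flags = 0x246 as printed by pre_handler, the original code would return to the single-stepped instruction with 0x146 (IF clear), while the patched code returns with 0x346 (IF still set).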


From the test result, the processor really does try to service the
interrupt right after step 4.
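
To make the two possible orderings concrete, here is a toy user-space model of what happens after the iret at step 4. It is only a sketch of my understanding, not a description of the hardware: the function name and the event strings are invented, and it assumes that a pending maskable interrupt is taken at the first instruction boundary where IF is set, and that IF is restored once the single step has finished.

```c
#include <stdbool.h>
#include <string.h>

#define EFLAGS_IF 0x200UL /* interrupt enable flag, bit 9 */

/*
 * Toy model only: writes the order of events following the iret at
 * step 4 into out (at least 32 bytes), given the flags restored by
 * that iret and whether a maskable interrupt is already pending.
 */
static void order_after_iret(unsigned long iret_flags, bool irq_pending,
                             char *out)
{
    out[0] = '\0';
    if (irq_pending && (iret_flags & EFLAGS_IF)) {
        /* IF is set again by iret, so the pending interrupt goes first. */
        strcat(out, "irq,");
        irq_pending = false;
    }
    /* The probed instruction runs, then TF raises the debug trap. */
    strcat(out, "insn,debug");
    if (irq_pending) {
        /* Assumption: still masked during the step, taken afterwards. */
        strcat(out, ",irq");
    }
}
```

In this model, flags of 0x346 with an interrupt pending give "irq,insn,debug", which is the order seen in the log, while 0x146 gives "insn,debug,irq".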


>
> If the processor really tries to execute interrupt
> right after step5, your logic seems correct, but if it
> is done right after step4, clearing IF seems correct.

But I am not sure whether this test case is suitable.
If the test case is OK, my logic seems wrong.


Thank you very much,
Dongdong


>
> Thank you,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
>
> e-mail: mhiramat@xxxxxxxxxx
>
>
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index a9541cb..d81f549 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -55,6 +55,7 @@
#include <asm/uaccess.h>
#include <asm/alternative.h>

+int dbug_kprob_pk;
void jprobe_return_end(void);

DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
@@ -421,7 +422,9 @@ static void __kprobes prepare_singlestep(struct kprobe *p, struct pt_regs *regs)
{
clear_btf();
regs->flags |= X86_EFLAGS_TF;
- regs->flags &= ~X86_EFLAGS_IF;
+
+ printk(KERN_ERR "prepare_singlestep didn't clear X86_EFLAGS_IF\n");
+ /* regs->flags &= ~X86_EFLAGS_IF; */
/* single step inline if the instruction is an int3 */
if (p->opcode == BREAKPOINT_INSTRUCTION)
regs->ip = (unsigned long)p->addr;
@@ -449,6 +452,7 @@ static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
reset_current_kprobe();
regs->ip = (unsigned long)p->ainsn.insn;
preempt_enable_no_resched();
+ dbug_kprob_pk = 0;
return;
}
#endif
@@ -475,6 +479,7 @@ static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
regs->ip = (unsigned long)p->addr;
reset_current_kprobe();
preempt_enable_no_resched();
+ dbug_kprob_pk = 0;
break;
#endif
case KPROBE_HIT_ACTIVE:
@@ -531,6 +536,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
return 1;
}

+ dbug_kprob_pk = 1;
/*
* We don't want to be preempted for the entire
* duration of kprobe processing. We conditionally
@@ -539,6 +545,10 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
*/
preempt_disable();

+ int i;
+ for (i = 0; i< 100; i++)
+ udelay(8000);
+
kcb = get_kprobe_ctlblk();
p = get_kprobe(addr);

@@ -571,6 +581,7 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
} /* else: not a kprobe fault; let the kernel handle it */

preempt_enable_no_resched();
+ dbug_kprob_pk = 0;
return 0;
}

@@ -870,6 +881,7 @@ static int __kprobes post_kprobe_handler(struct pt_regs *regs)
reset_current_kprobe();
out:
preempt_enable_no_resched();
+ dbug_kprob_pk = 0;

/*
* if somebody else is singlestepping across a probe point, flags
@@ -904,6 +916,7 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
else
reset_current_kprobe();
preempt_enable_no_resched();
+ dbug_kprob_pk = 0;
break;
case KPROBE_HIT_ACTIVE:
case KPROBE_HIT_SSDONE:
@@ -942,6 +955,7 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
return 0;
}

+
/*
* Wrapper routine for handling exceptions.
*/
@@ -960,8 +974,10 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
ret = NOTIFY_STOP;
break;
case DIE_DEBUG:
- if (post_kprobe_handler(args->regs))
+ if (post_kprobe_handler(args->regs)) {
+ dbug_kprob_pk = 0;
ret = NOTIFY_STOP;
+ }
break;
case DIE_GPF:
/*
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 18a12c4..e67104c 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -1204,6 +1204,10 @@ static irqreturn_t e1000_intr(int irq, void *data)
struct e1000_hw *hw = &adapter->hw;
u32 rctl, icr = er32(ICR);

+ extern int dbug_kprob_pk;
+ if (dbug_kprob_pk)
+ printk(KERN_ERR "Got a e1000 intrrupt during kprobe single step!!!!\n");
+
if (!icr)
return IRQ_NONE; /* Not our interrupt */