Re: Dealing with the NMI mess

From: Linus Torvalds
Date: Fri Jul 24 2015 - 14:29:34 EST


On Fri, Jul 24, 2015 at 8:30 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Jul 24, 2015 at 05:26:37PM +0200, Willy Tarreau wrote:
>> >
>> > The point is, if we trigger a #DB on an instruction breakpoint
>> > while !IF, then we simply disable that breakpoint and do the RET.
>>
>> Yes but the breakpoint remains disabled then. Or I'm missing
>> something.
>
> http://marc.info/?l=linux-kernel&m=143773601130974
>
> We re-enable before going back to userspace.

Actually, Andy had a good argument that we don't even need this.

We just don't ever need to disable data breakpoints. Even if we end up doing

cli();
copy_from_user_inatomic();

that actually works fine. If there are data breakpoints, we will have

(a) things will run slow as hell anyway. Intel CPU's slow down to a
relative crawl.

(b) let's say we have a data breakpoint on the data we're reading above

(c) we take a #DB fault after the instruction has completed, so we
make forward progress even if we return with RF clear

(d) even if the data breakpoint is unaligned and triggers multiple
times, it's going to be a "small number" of multiple times. And see
(a). This never happens in practice, and the much bigger slowdown is
how data breakpoints tend to slow things down in general.

(e) yes, the string instructions may hit the data breakpoint multilpe
times for the "same" instruction, but the forward progress part is
still true even for the string instructions. In fact, it's actually
likely <i>more</i> true for string instructions, because they are
documented to be less exact, and may trigger the data watchpoint only
after a "group of iterations".

so I think we just leave data breakpoint alone. The only debug
conditions that are *faults* rather than traps are the instruction
breakpoints, and we can detect and disable those by just saying "oh,
we're in kernel mode".

So in the #DB handler, we would basically only clear instruction
breakpoints, and only when they trigger. If we have a data breakpoint
that triggers (even in kernel mode, and with interrupts disabled), let
it trigger and return with "ret" anyway. No biggie.

(Ok, so the "General detect fault" is also a fault rather than a trap,
but that's the "write to debug registers when it's disabled" thing,
very different)

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/