Re: [PATCHv11 2.6.36-rc2-tip 3/15] 3: uprobes: Slot allocation for Execution out of line (XOL)

From: Peter Zijlstra
Date: Fri Sep 03 2010 - 03:27:58 EST


On Thu, 2010-09-02 at 23:17 +0530, Srikar Dronamraju wrote:
> > > Current slot allocation mechanism:
> > > 1. Allocate one dedicated slot per user breakpoint. Each slot is big
> > > enough to accommodate the biggest instruction for that architecture. (16
> > > bytes for x86).
> > > 2. We currently allocate only one page for slots. Hence the number of
> > > slots is limited to active breakpoint hits on that process.
> > > 3. Bitmap to track used slots.
> >
> > An alternative method would be to have 1 slot per cpu, and manage the
> > slot content using preemption notifiers. That gives you a fixed number
> > of slots and an unlimited number of probe points.
> >
> > If the preemption happens to be a migration you need to rewrite the
> > userspace IP to point to the new slot -- if indeed the task was inside
> > one when it got preempted -- but that all should be doable.
> >
>
> Certainly doable but it has its share of drawbacks.
> 1. On every probe hit we have to copy the instruction into the
> slot, so there is a performance penalty.

Yeah, although I imagine it's nearly free since you need to pay the
cache-miss anyway.

> 2. This might complicate the booster probe, because the jump
> instruction that follows the original instruction now actually has to
> be coded every time.

Why can't you keep the whole replacement sequence intact? Simply copy
it out into the slot each time.
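A minimal userspace sketch of that idea, assuming the out-of-line
sequence (original instruction plus the trailing jump) is built once per
probe and then only copied on each hit. The names probe_seq,
xol_fill_slot and the XOL_SEQ_MAX size are hypothetical and not taken
from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed upper bound for instruction + jump; not a value from the patch. */
#define XOL_SEQ_MAX 32

/* Hypothetical per-probe record, filled in once when the probe is registered. */
struct probe_seq {
	unsigned char bytes[XOL_SEQ_MAX];	/* original insn followed by the jump */
	size_t len;				/* number of bytes actually used */
};

/*
 * On each probe hit, copy the prebuilt sequence into the slot.
 * Nothing is re-encoded here; the cost is a single memcpy.
 * Returns the number of bytes copied.
 */
static size_t xol_fill_slot(unsigned char *slot, const struct probe_seq *seq)
{
	assert(seq->len <= XOL_SEQ_MAX);
	memcpy(slot, seq->bytes, seq->len);
	return seq->len;
}
```

In a real implementation the slot lives in the probed task's address
space, so the copy would go through whatever mechanism writes to that
page rather than a plain memcpy.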

> 3. Yes, migration is an issue, esp.
> - if a thread of the same process that hit a breakpoint is scheduled
> into the same cpu and that newly scheduled thread hits a breakpoint.
> - Something similar can happen if a multithreaded process runs on a
> uniprocessor machine.

-ENOPARSE ?!

> 4. I don't see a need for clearing slots after post processing, but if
> we need to clear we then are adding more penalties because not only are
> we clearing the slots but the post processing then can't happen in
> interrupt context.

post-processing? you mean the probe handler? Why couldn't that be done
from interrupt context?

> 5. I think we are covered on the cpu hotplug too (i.e. not sure if we have
> to make uprobes cpu hotplug aware).

Not if you use a slot per cpu and use preemption notifiers, the
preemption notifiers will migrate the slots around.
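For the migration case mentioned earlier in the thread, the IP fixup
could look roughly like the sketch below. It assumes each slot is
XOL_SLOT_BYTES long and that the notifier knows the old and new slot
addresses; xol_fixup_ip is a hypothetical helper, not an existing kernel
function:

```c
#include <stdint.h>

#define XOL_SLOT_BYTES 16	/* slot size quoted above for x86 */

/*
 * If ip points inside the old cpu's slot, return the same offset inside
 * the new cpu's slot. Otherwise the task was not executing out of line
 * when it was preempted and ip is returned unchanged.
 */
static uintptr_t xol_fixup_ip(uintptr_t ip, uintptr_t old_slot,
			      uintptr_t new_slot)
{
	if (ip >= old_slot && ip < old_slot + XOL_SLOT_BYTES)
		return new_slot + (ip - old_slot);
	return ip;
}
```

The slot contents would have to be copied to the new slot before the
task resumes, so that the rewritten IP lands on the same instruction
bytes.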

> 6. We would still be allocating a page for the slots. Unless we want
> to expand to more slots than available in one page, I don't see the
> disadvantages with the current approach.

The current approach limits the number of probes to what fits in a page.
The slot per cpu approach will have no such limit.
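To put a number on that limit: assuming a 4096-byte page and the 16-byte
x86 slots quoted above, one page holds 4096 / 16 = 256 slots. A minimal,
non-thread-safe sketch of a bitmap allocator with that bound follows;
xol_area, xol_take_slot and xol_free_slot are hypothetical names, and
real code would need locking or atomic bit operations:

```c
#include <limits.h>
#include <stddef.h>

#define XOL_PAGE_BYTES	4096	/* assumed page size */
#define XOL_SLOT_BYTES	16	/* slot size quoted above for x86 */
#define XOL_NSLOTS	(XOL_PAGE_BYTES / XOL_SLOT_BYTES)	/* 256 */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define XOL_NWORDS	((XOL_NSLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per slot; a set bit means the slot is in use. */
struct xol_area {
	unsigned long bitmap[XOL_NWORDS];
};

/* Claim the first free slot. Returns its index, or -1 if the page is full. */
static int xol_take_slot(struct xol_area *area)
{
	for (size_t i = 0; i < XOL_NSLOTS; i++) {
		unsigned long *word = &area->bitmap[i / BITS_PER_WORD];
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (!(*word & mask)) {
			*word |= mask;
			return (int)i;
		}
	}
	return -1;
}

/* Mark a previously claimed slot as free again. */
static void xol_free_slot(struct xol_area *area, int slot)
{
	area->bitmap[slot / BITS_PER_WORD] &= ~(1UL << (slot % BITS_PER_WORD));
}
```

Once all 256 bits are set, xol_take_slot() fails until a slot is freed,
which is the per-page limit referred to above.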