Re: [PATCH] perf/core: introduce context per CPU event list

From: Peter Zijlstra
Date: Thu Nov 10 2016 - 07:58:19 EST


On Thu, Nov 10, 2016 at 12:26:18PM +0000, Mark Rutland wrote:
> On Thu, Nov 10, 2016 at 01:12:53PM +0100, Peter Zijlstra wrote:

> > Ah, so the tree would in fact only contain 'INACTIVE' events :-)
>
> Ah. :)
>
> That explains some of the magic, but...
>
> > That is, when no events are on the hardware, all events (if there are
> > any) are INACTIVE.
> >
> > Then on sched-in, we find the relevant subtree, and linearly try and
> > program all events from that subtree onto the PMU. Once adding an event
> > fails programming, we stop (like we do now).
> >
> > These programmed events transition from INACTIVE to ACTIVE, and we take
> > them out of the tree.
> >
> > Then on sched-out, we remove all events from the hardware, increase
> > each event's runtime value by however long it was ACTIVE, flip them
> > to INACTIVE and stuff them back in the tree.
>
> ... per the above, won't the tree also contain 'OFF' events (and
> 'ERROR', etc)?
>
> ... or do we keep them somewhere else (another list or sub-tree)?

I don't think those need be tracked at all, they're immaterial for
actual scheduling. Once we ioctl() them back to life we can insert them
into the tree.
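
To make the state flow concrete, here is a rough userspace sketch of
the scheme described above. It is a toy model under stated assumptions,
not kernel code: every name in it (struct event, struct ctx, pmu_add(),
event_enable(), the EV_* states) is made up for illustration, a plain
unordered list stands in for the tree, and a simple counter stands in
for the PMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states, loosely mirroring OFF / INACTIVE / ACTIVE. */
enum ev_state { EV_OFF, EV_INACTIVE, EV_ACTIVE };

struct event {
	enum ev_state state;
	unsigned long runtime;	/* total time spent ACTIVE */
	unsigned long start;	/* timestamp of the last sched-in */
	struct event *next;	/* link in whichever list holds the event */
};

struct ctx {
	struct event *inactive;	/* stand-in for the tree: INACTIVE only */
	struct event *active;	/* separate list of ACTIVE events */
	int free_counters;	/* toy PMU capacity (assumption) */
};

/* Toy stand-in for programming an event; fails when counters run out. */
static int pmu_add(struct ctx *c)
{
	if (c->free_counters == 0)
		return -1;
	c->free_counters--;
	return 0;
}

/* Enabling an OFF event is the only way it enters the "tree". */
static void event_enable(struct ctx *c, struct event *e)
{
	if (e->state != EV_OFF)
		return;
	e->state = EV_INACTIVE;
	e->next = c->inactive;
	c->inactive = e;
}

/* Program events linearly; stop at the first one that does not fit. */
static void sched_in(struct ctx *c, unsigned long now)
{
	while (c->inactive) {
		struct event *e = c->inactive;

		assert(e->state == EV_INACTIVE);
		if (pmu_add(c) < 0)
			break;
		c->inactive = e->next;	/* take it out of the "tree" */
		e->state = EV_ACTIVE;
		e->start = now;
		e->next = c->active;
		c->active = e;
	}
}

/* Pull everything off the PMU, account runtime, put it back. */
static void sched_out(struct ctx *c, unsigned long now)
{
	while (c->active) {
		struct event *e = c->active;

		assert(e->state == EV_ACTIVE);
		c->active = e->next;
		c->free_counters++;
		e->runtime += now - e->start;
		e->state = EV_INACTIVE;
		e->next = c->inactive;
		c->inactive = e;
	}
}
```

OFF events never appear on either list here, so nothing has to walk
past them. How the real tree would be keyed and ordered is deliberately
left out of this sketch.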

> If not, we still have to walk all of those in perf_iterate_ctx().
>
> > (I can't quite recall if we can easily find ACTIVE events from a PMU,
> > but if not, we can easily track those on a separate list).
>
> I think we just iterate the perf_event_context::event list and look at
> the state. Regardless, adding lists is fairly simple.
>
> Thanks,
> Mark.