Re: [PATCH RFC] hist lookups

From: David Miller
Date: Wed Nov 07 2018 - 15:01:57 EST


From: Jiri Olsa <jolsa@xxxxxxxxxx>
Date: Wed, 7 Nov 2018 20:43:44 +0100

> I pushed new version in my perf/fixes branch

Thanks, I'll check it out later today for sure! This is pretty exciting
work.

Just some random thoughts as I've been thinking about this whole
situation a lot lately:

Something to consider might be consolidating all of the event rings
into one. This would force perf to process all events in "system
order", ie. what order they actually occurred in the machine.

Yes, this means more contention for the entities inside the kernel
queueing up the events, however the benefits are enormous.

Right now we go forwards and backwards in time as we move from one
event ring to another, as you know.

However, we have to reconcile with the need we have to separate "high
priority" (ie. cannot really lose) events like fork, mmap2, etc. with
"low priority" ones such as IP samples.

Perhaps another way to think about this is to go to the one huge mmap
ring model, and do the prioritization internally in perf.

Actually, this opens up tons of possibilities in my mind.

Perf can queue to an internal high priority queue for fork and mmap2
events, and never drop them. Whilst at the same time queueing low
priority events like IP samples into a low priority queue and dropping
with whatever policy it wants when overloaded (f.e. drop older events
before newer ones).

I do not like the overwrite ring buffer mode that was implemented
because it enforeces an entire set of policy decisions upon the user.
Either the model works for you (which it currently does not for perf)
or you can't use it at all.

If the issue is that newer events are more interesting than old ones,
that is entirely perf's businness. And it can implement this policy
%100 internally to itself. No kernel changes were ever needed to do
this, as explained above.

Please, let's abandon this whole overwrite mode of the ring buffer.
The old one works perfectly fine, we just have to use it properly.
We should never have to shut off kernel side event queueing just
because we are processing the event ring on the user side.

Thanks.