Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)

From: Mark Rutland
Date: Wed Jun 28 2017 - 09:08:53 EST


On Wed, Jun 28, 2017 at 08:40:30AM -0400, Vince Weaver wrote:
> On Wed, 28 Jun 2017, Mark Rutland wrote:
>
> > On Wed, Jun 28, 2017 at 11:12:48AM +0100, Mark Rutland wrote:
>
> > Instead of bailing out early in perf_event_overflow, we can bail prior
> > to performing the actual sampling in __perf_event_output(). This avoids
> > the information leak, but preserves the generation of the signal.
> >
> > Since we don't place any sample data into the ring buffer, the signal is
> > arguably spurious. However, a userspace ringbuffer consumer can already
> > consume data prior to taking the associated signals, and therefore must
> > handle spurious signals to operate correctly. Thus, this signal
> > shouldn't be harmful.
>
> this could still break some of my perf_event validation tests.
>
> Ones that set up a sampling event for every 1M instructions, run for 100M
> instructions, and expect there to be 100 samples received.

Is that test reliable today?

I'd expect that at least on ARM it's not, given that events can be
counted imprecisely, and mode filters can be applied imprecisely. So you
might get fewer (or more) samples. I'd imagine similar is true on other
architectures.

If sampling took long enough, the existing ratelimiting could come into
effect, too.

Surely that test already has some error margin?
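
To illustrate the kind of margin I mean, here is a rough sketch of a
tolerance check on the sample count. The helper name and the percentage
parameter are made up for illustration; I haven't checked what the
existing tests actually do.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept an observed sample count if it lies within
 * tolerance_pct percent of the expected count, in either direction.
 * Integer arithmetic only, so there is no rounding to reason about.
 */
static bool sample_count_ok(uint64_t observed, uint64_t expected,
			    unsigned int tolerance_pct)
{
	uint64_t diff = observed > expected ? observed - expected
					    : expected - observed;

	/* diff / expected <= tolerance_pct / 100, cross-multiplied */
	return diff * 100 <= expected * (uint64_t)tolerance_pct;
}
```

With expected = 100 and a 5% tolerance, counts from 95 to 105 would pass
and anything outside that range would fail.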

> If we're so worried about info leakage, can't we just zero-out the problem
> address (or randomize the kernel address) rather than just pretending the
> interrupt didn't happen?

Making up zeroed or randomized data is going to confuse users. I can't
imagine that real users are going to want bogus samples that they have
to identify (somehow) in order to skip when processing the data.

I can see merit in signalling "lost" samples to userspace, so long as
they're easily distinguished from real samples.
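
As a sketch of what "easily distinguished" could look like on the
consumer side, the code below walks a linear copy of ring-buffer data
and counts records by header type. It assumes (and this is only an
assumption, not something this patch does) that dropped samples would be
reported with a PERF_RECORD_LOST-style record. The struct and enum names
are local stand-ins that mirror struct perf_event_header and the
PERF_RECORD_LOST / PERF_RECORD_SAMPLE values from the perf ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in mirroring the layout of struct perf_event_header. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size in bytes, header included */
};

/* Same values as PERF_RECORD_LOST and PERF_RECORD_SAMPLE. */
enum { REC_LOST = 2, REC_SAMPLE = 9 };

struct rec_counts {
	size_t samples;
	size_t lost;
	size_t other;
};

/*
 * Count records in a linear buffer (i.e. already copied out of the
 * mmap'd ring, so no wrap-around handling here). Stops at the first
 * truncated or malformed record.
 */
static struct rec_counts count_records(const unsigned char *buf, size_t len)
{
	struct rec_counts c = { 0, 0, 0 };
	size_t off = 0;

	while (len - off >= sizeof(struct rec_header)) {
		struct rec_header h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.size < sizeof(h) || h.size > len - off)
			break;

		if (h.type == REC_SAMPLE)
			c.samples++;
		else if (h.type == REC_LOST)
			c.lost++;
		else
			c.other++;

		off += h.size;
	}

	return c;
}
```

A consumer written this way never has to inspect sample payloads to
decide whether a record is real, which is the property I'd want from
any "lost" notification.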

One option is to fake up a sample using the user regs regardless, but
that's both fragile and surprising in other cases.

Thanks,
Mark.