Re: [Syzkaller & bisect] There is "__perf_event_overflow" WARNING in v6.1-rc5 kernel in guest

From: Marco Elver
Date: Thu Nov 24 2022 - 03:31:57 EST


On Wed, 23 Nov 2022 at 16:05, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sat, Nov 19, 2022 at 10:45:54AM +0800, Pengfei Xu wrote:
>
> > The result shows that your additional patch fixed this issue!
> > If possible, could you add a Reported-and-tested-by tag from me?
>
> After talking with Marco for a bit the patch now looks like the below.
> I've tentatively retained your tested-by, except of course, you haven't
> tested this version yet.
>
> If I could bother you once more to test the branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/urgent
>
> ---
> Subject: perf: Consider OS filter fail
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Sat, 19 Nov 2022 10:45:54 +0800
>
> Some PMUs (notably the traditional hardware kind) have boundary issues
> with the OS filter. Specifically, it is possible for
> perf_event_attr::exclude_kernel=1 events to trigger in-kernel due to
> SKID or errata.
>
> This can upset the sigtrap logic some and trigger the WARN.
>
> However, if this invalid sample is the first we must not lose the
> SIGTRAP, OTOH if it is the second, it must not override the
> pending_addr with an invalid one.
>
> Fixes: ca6c21327c6a ("perf: Fix missing SIGTRAPs")
> Reported-by: Pengfei Xu <pengfei.xu@xxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Tested-by: Pengfei Xu <pengfei.xu@xxxxxxxxx>
> Link: https://lkml.kernel.org/r/Y3hDYiXwRnJr8RYG@xxxxxxxxxxxxxxxx

Thanks, FWIW

Reviewed-by: Marco Elver <elver@xxxxxxxxxx>

One thing I wondered was, if the event fired in the kernel due to
skid, is the addr always some kernel address, or does this also depend
on the type of PMU? In any case, we don't even want to risk leaking
kernel addresses this way, so this looks sane.

> ---
> kernel/events/core.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9273,6 +9273,19 @@ int perf_event_account_interrupt(struct
>  	return __perf_event_account_interrupt(event, 1);
>  }
>
> +static inline bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs)
> +{
> +	/*
> +	 * Due to interrupt latency (AKA "skid"), we may enter the
> +	 * kernel before taking an overflow, even if the PMU is only
> +	 * counting user events.
> +	 */
> +	if (event->attr.exclude_kernel && !user_mode(regs))
> +		return false;
> +
> +	return true;
> +}
> +
>  /*
>   * Generic event overflow handling, sampling.
>   */
> @@ -9306,6 +9319,13 @@ static int __perf_event_overflow(struct
>  	}
>
>  	if (event->attr.sigtrap) {
> +		/*
> +		 * The desired behaviour of sigtrap vs invalid samples is a bit
> +		 * tricky; on the one hand, one should not lose the SIGTRAP if
> +		 * it is the first event, on the other hand, we should also not
> +		 * trigger the WARN or override the data address.
> +		 */
> +		bool valid_sample = sample_is_allowed(event, regs);
>  		unsigned int pending_id = 1;
>
>  		if (regs)
> @@ -9313,7 +9333,7 @@ static int __perf_event_overflow(struct
>  		if (!event->pending_sigtrap) {
>  			event->pending_sigtrap = pending_id;
>  			local_inc(&event->ctx->nr_pending);
> -		} else if (event->attr.exclude_kernel) {
> +		} else if (event->attr.exclude_kernel && valid_sample) {
>  			/*
>  			 * Should not be able to return to user space without
>  			 * consuming pending_sigtrap; with exceptions:
> @@ -9330,7 +9350,7 @@ static int __perf_event_overflow(struct
>  		}
>
>  		event->pending_addr = 0;
> -		if (data->sample_flags & PERF_SAMPLE_ADDR)
> +		if (valid_sample && (data->sample_flags & PERF_SAMPLE_ADDR))
>  			event->pending_addr = data->addr;
>  		irq_work_queue(&event->pending_irq);
>  	}
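
For anyone following along, the sigtrap branch as patched above can be
modelled in user space roughly as below. This is only a sketch under
stated assumptions, not kernel code: struct model_event,
model_sample_is_allowed() and model_overflow() are hypothetical stand-ins
for perf_event, sample_is_allowed() and the sigtrap branch of
__perf_event_overflow(); in_user stands in for user_mode(regs), and
pending_id is passed in by the caller instead of being derived from regs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few perf_event fields the patch touches. */
struct model_event {
	bool exclude_kernel;		/* models attr.exclude_kernel */
	unsigned int pending_sigtrap;	/* 0 means no SIGTRAP is pending */
	uint64_t pending_addr;		/* address reported with the SIGTRAP */
	bool warned;			/* models WARN_ON_ONCE() having fired */
};

/* Models sample_is_allowed(); in_user stands in for user_mode(regs). */
static bool model_sample_is_allowed(const struct model_event *ev, bool in_user)
{
	if (ev->exclude_kernel && !in_user)
		return false;

	return true;
}

/* Models the sigtrap branch of __perf_event_overflow() with the patch applied. */
static void model_overflow(struct model_event *ev, bool in_user,
			   unsigned int pending_id, bool has_addr, uint64_t addr)
{
	bool valid_sample;

	assert(ev && pending_id != 0);
	valid_sample = model_sample_is_allowed(ev, in_user);

	if (!ev->pending_sigtrap) {
		/* First overflow: always recorded, so the SIGTRAP is not lost. */
		ev->pending_sigtrap = pending_id;
	} else if (ev->exclude_kernel && valid_sample) {
		/* Only valid samples are compared against the pending id. */
		if (ev->pending_sigtrap != pending_id)
			ev->warned = true;
	}

	/* As in the patch: reset, then refill only from a valid sample. */
	ev->pending_addr = 0;
	if (valid_sample && has_addr)
		ev->pending_addr = addr;
}
```

With exclude_kernel set, a first overflow taken with in_user == false
still sets pending_sigtrap but leaves pending_addr at 0, and a later
in-kernel overflow with a different pending_id no longer sets warned.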