Re: [PATCH] perf: Add check_period pmu callback

From: Peter Zijlstra
Date: Mon Feb 11 2019 - 05:11:48 EST


On Mon, Feb 04, 2019 at 01:35:32PM +0100, Jiri Olsa wrote:
> Vince (and later on Ravi) reported crash in BTS code during
> fuzzing with following backtrace:
>
> general protection fault: 0000 [#1] SMP PTI
> ...
> RIP: 0010:perf_prepare_sample+0x8f/0x510
> ...
> Call Trace:
> <IRQ>
> ? intel_pmu_drain_bts_buffer+0x194/0x230
> intel_pmu_drain_bts_buffer+0x160/0x230
> ? tick_nohz_irq_exit+0x31/0x40
> ? smp_call_function_single_interrupt+0x48/0xe0
> ? call_function_single_interrupt+0xf/0x20
> ? call_function_single_interrupt+0xa/0x20
> ? x86_schedule_events+0x1a0/0x2f0
> ? x86_pmu_commit_txn+0xb4/0x100
> ? find_busiest_group+0x47/0x5d0
> ? perf_event_set_state.part.42+0x12/0x50
> ? perf_mux_hrtimer_restart+0x40/0xb0
> intel_pmu_disable_event+0xae/0x100
> ? intel_pmu_disable_event+0xae/0x100
> x86_pmu_stop+0x7a/0xb0
> x86_pmu_del+0x57/0x120
> event_sched_out.isra.101+0x83/0x180
> group_sched_out.part.103+0x57/0xe0
> ctx_sched_out+0x188/0x240
> ctx_resched+0xa8/0xd0
> __perf_event_enable+0x193/0x1e0
> event_function+0x8e/0xc0
> remote_function+0x41/0x50
> flush_smp_call_function_queue+0x68/0x100
> generic_smp_call_function_single_interrupt+0x13/0x30
> smp_call_function_single_interrupt+0x3e/0xe0
> call_function_single_interrupt+0xf/0x20
> </IRQ>
>
> The reason is that while event init code does several checks
> for BTS events and prevents several unwanted config bits for
> BTS event (like precise_ip), the PERF_EVENT_IOC_PERIOD allows
> to create BTS event without those checks being done.
>
> Following sequence will cause the crash:
> - create 'almost' BTS event with precise_ip and callchains,
> (perf command line -e option equiv.):
>
> -e cpu/branch-instructions/up -c 2 -g
>
> - change the period of that event to '1', which will turn
> it to BTS event, with precise_ip and callchains
>
> That will immediately cause crash in perf_prepare_sample
> function because precise_ip events are expected to come
> in with callchain data initialized, but that's not the
> case for intel_pmu_drain_bts_buffer caller.
>
> Adding a check_period callback to be called before the period
> is changed via PERF_EVENT_IOC_PERIOD. It will deny the change
> if the event would become BTS. Plus adding also the limit_period
> check as well.
>
> Cc: Vince Weaver <vincent.weaver@xxxxxxxxx>
> Cc: Ravi Bangoria <ravi.bangoria@xxxxxxxxxxxxx>
> Reported-by: Vince Weaver <vincent.weaver@xxxxxxxxx>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>

Thanks Jiri!