Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()

From: Thomas Gleixner
Date: Tue Jan 24 2017 - 14:54:31 EST


On Tue, 24 Jan 2017, Yasuaki Ishimatsu wrote:
> rapl_cpu_prepare() must be called after logical package id of CPU
> is set by topology_update_package_map().
>
> But when onlining hot-added CPU, rapl_cpu_prepare() is called before
> setting logical package id of the hot-added CPU. So cpu_to_rapl_pmu()
> in rapl_cpu_prepare() finds a rapl_pmu of wrong logical package id and
> rapl_cpu_prepare() initializes the wrong rapl_pmu.
>
> After that logical package id of the hot-added CPU is set by
> topology_update_package_map(). But rapl_cpu_prepare() does
> not initialize pmu of the logical package id of the hot-added CPU.
> So when calling rapl_cpu_online(), cpu_to_rapl_pmu() returns NULL and
> the following NULL pointer dereference occurs.
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: rapl_cpu_online+0x8d/0xb0
> <snip>
> Call Trace:
> ? rapl_cpu_offline+0xc0/0xc0
> cpuhp_invoke_callback+0x8d/0x3f0
> cpuhp_up_callbacks+0x37/0xb0
> cpuhp_thread_fun+0xc9/0xe0
> smpboot_thread_fn+0x110/0x160
> kthread+0x101/0x140
> ? sort_range+0x30/0x30
> ? kthread_park+0x90/0x90
> ret_from_fork+0x25/0x30
>
> The patch renames rapl_cpu_prepare() to rapl_cpu_starting() and changes
> the position of cpuhp_state so that rapl_cpu_starting() is called
> after topology_update_package_map().

Does not work. You cannot call that callback in the starting context. It
does allocations. This needs to be fixed in a different way. I'll have a look
tomorrow.
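
To make the two failure modes concrete, here is a rough userspace sketch. It
is not kernel code, and every model_* name in it is hypothetical. It assumes
only what is stated above: the setup callback allocates, the package map is
updated after the prepare stage, and the starting stage runs in a context
where a sleeping allocation is not allowed. The model makes that last point
visible by having the allocator return NULL; in a real kernel the problem is
that the allocation may sleep, not that it returns NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_PKGS 4

/* Hypothetical stand-in for the per-package RAPL PMU state. */
struct model_pmu {
	int owner_cpu;
};

static struct model_pmu *model_pmus[MODEL_MAX_PKGS];
static int model_pkg_id;          /* logical package id as currently known */
static bool model_atomic_context; /* true while a "starting" callback runs */

/* Models a sleeping allocation: the model refuses it in atomic context. */
static void *model_alloc(size_t size)
{
	if (model_atomic_context)
		return NULL;
	return calloc(1, size);
}

/* Models the allocating callback: set up the PMU for the current package id. */
static int model_cpu_setup(int cpu)
{
	struct model_pmu *pmu = model_pmus[model_pkg_id];

	if (pmu)
		return 0;
	pmu = model_alloc(sizeof(*pmu));
	if (!pmu)
		return -1;
	pmu->owner_cpu = cpu;
	model_pmus[model_pkg_id] = pmu;
	return 0;
}

/* Models the package map update; real_pkg must be below MODEL_MAX_PKGS. */
static void model_update_package_map(int real_pkg)
{
	model_pkg_id = real_pkg;
}

/* Models the lookup done by the online callback. */
static struct model_pmu *model_online_lookup(void)
{
	return model_pmus[model_pkg_id];
}

/* Return the model to its initial state (stale package id 0, no PMUs). */
static void model_reset(void)
{
	for (int i = 0; i < MODEL_MAX_PKGS; i++) {
		free(model_pmus[i]);
		model_pmus[i] = NULL;
	}
	model_pkg_id = 0;
	model_atomic_context = false;
}

/* Current ordering: setup in the prepare stage, before the map update. */
static bool model_bringup_prepare_order(int cpu, int real_pkg)
{
	model_reset();
	if (model_cpu_setup(cpu))
		return false;
	model_update_package_map(real_pkg);
	return model_online_lookup() != NULL;
}

/* Patched ordering: setup after the map update, but in the starting stage. */
static bool model_bringup_starting_order(int cpu, int real_pkg)
{
	int ret;

	model_reset();
	model_update_package_map(real_pkg);
	model_atomic_context = true;
	ret = model_cpu_setup(cpu);
	model_atomic_context = false;
	return ret == 0 && model_online_lookup() != NULL;
}
```

In this model, model_bringup_prepare_order(4, 1) returns false because the
PMU was set up under the stale package id 0, so the online lookup for
package 1 finds nothing. model_bringup_starting_order(4, 1) also returns
false, because the setup runs where the model forbids the allocation. The
prepare ordering only succeeds when the stale id happens to match, as in
model_bringup_prepare_order(4, 0).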

Thanks,

tglx