Re: [PATCH] powerpc: Only obtain cpu_hotplug_lock if called by rtasd

From: Thiago Jung Bauermann
Date: Wed Jun 21 2017 - 21:15:10 EST



Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
> Thiago Jung Bauermann <bauerman@xxxxxxxxxxxxxxxxxx> writes:
>
>> Calling arch_update_cpu_topology from a CPU hotplug state machine callback
>> hits a deadlock because the function tries to get a read lock on
>> cpu_hotplug_lock while the state machine still holds a write lock on it.
>>
>> Since all callers of arch_update_cpu_topology except rtasd already hold
>> cpu_hotplug_lock, this patch changes the function to use
>> stop_machine_cpuslocked and creates a separate function for rtasd which
>> still tries to obtain the lock.
>>
>> Michael Bringmann investigated the bug and provided a detailed analysis
>> of the deadlock on this previous RFC for an alternate solution:
>>
>> https://patchwork.ozlabs.org/patch/771293/
>
> Do we know when this broke? Or has it never worked?

It's been broken since at least v4.4, I think. I don't know about
earlier versions.

> Should it go to stable? (can't in its current form AFAICS)

It's not hard to backport both this patch and commit fe5595c07400
("stop_machine: Provide stop_machine_cpuslocked()") from branch
smp/hotplug in tip.git for stable.

Since rtasd only started calling arch_update_cpu_topology since v4.11,
for earlier versions this patch can be simplified to making that
function call stop_machine_cpuslocked unconditionally instead of
defining a separate function.

>> Signed-off-by: Thiago Jung Bauermann <bauerman@xxxxxxxxxxxxxxxxxx>
>> ---
>>
>> Notes:
>> This patch applies on tip/smp/hotplug, it should probably be carried there.
>
> stop_machine_cpuslocked() doesn't exist in mainline so I think it has to
> be carried there right?

Yes. I said "probably" because I don't know if you want to wait
until that branch is merged so that you can carry this patch in your
tree.

--
Thiago Jung Bauermann
IBM Linux Technology Center