Re: [PATCH 3/3 -mm] generic-ipi: fix the race between generic_smp_call_function_*()and hotplug_cfd()

From: Xiao Guangrong
Date: Wed Jul 29 2009 - 23:31:56 EST




Andrew Morton wrote:
> On Wed, 29 Jul 2009 15:57:51 +0800
> Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxx> wrote:
>
>> It have race between generic_smp_call_function_*() and hotplug_cfd()
>> in many cases, see below examples:
>>
>> 1: hotplug_cfd() can free cfd->cpumask, the system will crash if the
>> cpu's cfd still in the call_function list:
>>
>>
>> CPU A: CPU B
>>
>> smp_call_function_many() ......
>> cpu_down() ......
>> hotplug_cfd() -> ......
>> free_cpumask_var(cfd->cpumask) (receive function IPI interrupte)
>> /* read cfd->cpumask */
>> generic_smp_call_function_interrupt() ->
>> cpumask_test_and_clear_cpu(cpu, data->cpumask)
>>
>> CRASH!!!
>>
>> 2: It's not handle call_function list when cpu down, It's will lead to
>> dead-wait if other path is waiting this cpu to execute function
>>
>> CPU A: CPU B
>>
>> smp_call_function_many(wait=0)
>> ...... CPU B down
>> smp_call_function_many() --> (cpu down before recevie function
>> csd_lock(&data->csd); IPI interrupte)
>>
>> DEAD-WAIT!!!!
>>
>> So, CPU A will dead-wait in csd_lock(), the same as
>> smp_call_function_single()
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxx>
>> ---
>> kernel/smp.c | 140 ++++++++++++++++++++++++++++++++-------------------------
>> 1 files changed, 79 insertions(+), 61 deletions(-)
>>
>
> It was unfortunate that this patch moved a screenful of code around and
> changed that code at the same time - it makes it hard to see what the
> functional change was.
>
> So I split this patch into two. The first patch simply moves
> hotplug_cfd() to the end of the file and the second makes the
> functional changes. The second patch is below, for easier review.
>
> Do we think that this patch should be merged into 2.6.31? 2.6.30.x?
>

This bug is conceal from v2.6.26 when kernel/smp.c created and be
found by my review, no one is bothered by it and sends us a bug
report, besides, this patch can't be applied to <= 2.6.30 cleanly,
so I think we can just fix it for .31

Thanks,
Xiao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/