Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

From: Xiao Guangrong
Date: Tue Sep 22 2009 - 02:57:53 EST

Suresh Siddha wrote:
> On Sun, 2009-09-20 at 21:04 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>> I am referring to the missing csd_lock_wait() here that you had in the
>>> first version of your patch. Let's say, if cpu X is going offline, we
>>> need to ensure that the smp_call_function() initiated by cpu X (i.e.,
>>> smp_call_function IPI sent to some other cpu's from cpu X) got serviced
>>> before cpu X goes offline. We can't do csd_lock_wait() here, as that
>>> might deadlock (as all the other cpu's are already in stop machine with
>>> interrupts disabled).
>>>
>> That cannot happen: preemption is disabled while the IPI request is sent,
>> so the cpu can't schedule into the stop machine path, and this also blocks
>> the cpu down.
>
> Xiao, I am getting confused. I am referring to case '1' mentioned by you
> here http://marc.info/?l=linux-kernel&m=125265516529139&w=2
>

Ah, do you mean that we can't do csd_lock_wait() in the CPU_DEAD
notification path of my first patch version? Like below:

+static int
+hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
...
+
+#ifdef CONFIG_HOTPLUG_CPU
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		local_irq_save(flags);
+		__generic_smp_call_function_interrupt(cpu, 0);
+		__generic_smp_call_function_single_interrupt(cpu, 0);
+		local_irq_restore(flags);
+
		/* Do you mean we can't do csd_lock_wait() here??? */
+		csd_lock_wait(&cfd->csd);
+		free_cpumask_var(cfd->cpumask);
+		break;
+#endif
+	};
+
+	return NOTIFY_OK;
+}

The CPU_DEAD notification is not sent in the stop machine path; see the
_cpu_down() function in kernel/cpu.c.
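
To make the ordering concrete, here is a rough userspace model of the
steps in _cpu_down(). It is only a sketch, not kernel code: every model_*
name, the EV_* events and the log structure are made up for illustration,
and the real function does much more (error handling, waiting for the
cpu to go idle, __cpu_die(), and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of this is kernel code. */
enum model_event {
	EV_DOWN_PREPARE,	/* CPU_DOWN_PREPARE notifiers */
	EV_STOP_MACHINE_ENTER,	/* other cpus spin with irqs disabled */
	EV_TAKE_CPU_DOWN,	/* runs on the dying cpu, inside stop machine */
	EV_STOP_MACHINE_EXIT,	/* other cpus run normally again */
	EV_CPU_DEAD,		/* CPU_DEAD notifiers, e.g. hotplug_cfd() */
};

#define MODEL_MAX_EVENTS 8

struct model_log {
	enum model_event ev[MODEL_MAX_EVENTS];
	bool in_stop_machine[MODEL_MAX_EVENTS];
	size_t n;
};

/* Tracks whether the model is currently "inside" stop machine. */
static bool model_in_stop_machine;

static void model_record(struct model_log *log, enum model_event ev)
{
	assert(log->n < MODEL_MAX_EVENTS);
	log->ev[log->n] = ev;
	log->in_stop_machine[log->n] = model_in_stop_machine;
	log->n++;
}

/* Mirrors the order of the steps in _cpu_down(), greatly simplified. */
static void model_cpu_down(struct model_log *log)
{
	log->n = 0;
	model_in_stop_machine = false;

	model_record(log, EV_DOWN_PREPARE);

	model_in_stop_machine = true;
	model_record(log, EV_STOP_MACHINE_ENTER);
	model_record(log, EV_TAKE_CPU_DOWN);
	model_in_stop_machine = false;
	model_record(log, EV_STOP_MACHINE_EXIT);

	/* Only after stop machine has finished is CPU_DEAD sent. */
	model_record(log, EV_CPU_DEAD);
}

/* True if the CPU_DEAD step was logged while outside stop machine. */
static bool model_cpu_dead_outside_stop_machine(const struct model_log *log)
{
	for (size_t i = 0; i < log->n; i++)
		if (log->ev[i] == EV_CPU_DEAD)
			return !log->in_stop_machine[i];
	return false;
}
```

In this model the CPU_DEAD step is always recorded after the stop machine
section has ended, so a wait done from that notifier runs while the other
cpus can take interrupts again.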

Suresh, if I have misunderstood you again, could you elaborate?

My first patch version is not clean and not complete, as you pointed out in
your previous mail:
" I am referring to this latest patch only. We are calling the interrupt
handler manually and not doing the callbacks in that context. In future,
we might see other side affects if we miss some of these smp ipi's."

How about the second patch?

Thanks,
Xiao