CPU Hotplug broken -mm5 onwards

From: Srivatsa Vaddagiri
Date: Sun Apr 18 2004 - 12:07:17 EST


Hi,
I found that I can't boot with CONFIG_HOTPLUG_CPU defined in both
mm5 and mm6. Debugging this revealed it to be because exec path can now require
cpu hotplug sem (sched_migrate_task) and this has lead to a deadlock between
flush_workqueue and __call_usermodehelper.

flush_workqueue takes cpu hotplug sem and blocks until workqueue is flushed.
__call_usermodehelper, one of the queued work function, blocks because it
also needs cpu hotplug sem during exec. As of result of this, exec does not
progress and system does not boot.

I feel we can fix this by converting cpucontrol to a reader-writer semaphore or
big-reader-lock(?). One problem with reader-writer semaphore is there does not
seem to be any down_write_interruptible, which is needed by cpu_down/up.

Comments?

BTW, I think a cpu_is_offline check is needed in sched_migrate_task, since
dest_cpu could have been downed by the time it has acquired the semaphore.
In which case, we could end up adding the task to dead cpu's runqueue?
An alternate solution would be to put the same check in __migrate_task.



--


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/