Re: [PATCH/RFC] timer: fix deadlock on cpu hotplug

From: Heiko Carstens
Date: Wed Sep 22 2010 - 04:37:23 EST


On Tue, Sep 21, 2010 at 05:40:18PM +0200, Peter Zijlstra wrote:
> On Tue, 2010-09-21 at 17:36 +0200, Tejun Heo wrote:
> > I think this is the second time we're seeing deadlock during cpu down
> > due to RT throttling and timer problem. The rather delicate
> > dependency there makes me somewhat nervous. If possible, I think it
> > would be better if we can simply turn the RT throttling off when
> > cpu_stop kicks in. It's intended to be a mechanism to monopolize all
> > CPU cycles to begin with. Would that be difficult?
>
> I've wanted to pull the whole migration thread out from SCHED_FIFO for a
> while. Doing that is probably the easiest thing.

Something like this?

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 4372ccb..854fd57 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -291,7 +291,6 @@ repeat:
 static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 					   unsigned long action, void *hcpu)
 {
-	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
 	unsigned int cpu = (unsigned long)hcpu;
 	struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu);
 	struct task_struct *p;
@@ -304,7 +303,6 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 				   cpu);
 		if (IS_ERR(p))
 			return NOTIFY_BAD;
-		sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
 		get_task_struct(p);
 		stopper->thread = p;
 		break;

...gets stuck nearly immediately on cpu hotplug stress if the machine is
doing anything but idling around.
I was too lazy to figure out why it got stuck. I'm afraid that with such
a change a new class of bugs will appear.
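
For context, a rough sketch of the budget behind the RT throttling
mentioned above. It assumes the usual defaults of the sched_rt_period_us
and sched_rt_runtime_us sysctls (1000000 and 950000). The helper
rt_throttled_us() is made up for illustration and is not a kernel
function; it only does the arithmetic.

```c
/* Assumed defaults of /proc/sys/kernel/sched_rt_period_us and
 * /proc/sys/kernel/sched_rt_runtime_us. */
#define RT_PERIOD_US_DEFAULT	1000000L
#define RT_RUNTIME_US_DEFAULT	 950000L

/*
 * Hypothetical helper, not part of the kernel: how many microseconds
 * of each period realtime tasks are kept off the CPU once they have
 * used up their runtime budget. A negative runtime (-1 in the sysctl)
 * means throttling is disabled, so nothing is held back.
 */
long rt_throttled_us(long period_us, long runtime_us)
{
	if (runtime_us < 0 || runtime_us >= period_us)
		return 0;
	return period_us - runtime_us;
}
```

With the assumed defaults, rt_throttled_us(RT_PERIOD_US_DEFAULT,
RT_RUNTIME_US_DEFAULT) comes out at 50000, i.e. a SCHED_FIFO thread that
has eaten its budget can be held off for up to 50ms of every second
until the budget is refilled.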