Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()

From: Oleg Nesterov
Date: Sat Sep 28 2013 - 12:38:21 EST


On 09/28, Peter Zijlstra wrote:
>
> On Sat, Sep 28, 2013 at 02:48:59PM +0200, Oleg Nesterov wrote:
>
> > Please note that this wait_event() adds a problem... it doesn't allow
> > to "offload" the final synchronize_sched(). Suppose a 4k cpu machine
> > does disable_nonboot_cpus(), we do not want 2 * 4k * synchronize_sched's
> > in this case. We can solve this, but this wait_event() complicates
> > the problem.
>
> That seems like a particularly easy fix; something like so?

Yes, but...

> @@ -586,6 +603,11 @@ int disable_nonboot_cpus(void)
>
> + cpu_hotplug_done();
> +
> + for_each_cpu(cpu, frozen_cpus)
> + cpu_notify_nofail(CPU_POST_DEAD_FROZEN, (void*)(long)cpu);

This changes the protocol. I simply do not know whether it is fine in
general to do __cpu_down(another_cpu) before CPU_POST_DEAD(previous_cpu)
has been delivered. Say, currently it is possible that CPU_DOWN_PREPARE
takes some global lock which is released by CPU_DOWN_FAILED or
CPU_POST_DEAD.
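
To make the worry concrete, here is a userspace sketch (not kernel code)
of such a notifier. Everything in it is hypothetical: the hp_* names, the
flag that stands in for a non-recursive global lock, and the counter that
records each acquire attempt made while the lock is already held. It only
models the ordering of the notifications, assuming a callback that pairs
DOWN_PREPARE with POST_DEAD.

```c
#include <assert.h>
#include <stdbool.h>

/* Names mirror the hotplug notifications; the values are arbitrary. */
enum hp_event { HP_DOWN_PREPARE, HP_DOWN_FAILED, HP_POST_DEAD };

/* Hypothetical subsystem: a non-recursive global lock modelled as a flag. */
struct hp_subsys {
	bool lock_held;
	int double_acquires;	/* each one would deadlock with a real lock */
};

/*
 * Hypothetical notifier: takes the lock in DOWN_PREPARE and drops it in
 * DOWN_FAILED or POST_DEAD.
 */
void hp_notify(struct hp_subsys *s, enum hp_event ev)
{
	switch (ev) {
	case HP_DOWN_PREPARE:
		if (s->lock_held)
			s->double_acquires++;
		s->lock_held = true;
		break;
	case HP_DOWN_FAILED:
	case HP_POST_DEAD:
		s->lock_held = false;
		break;
	}
}

/* Current ordering: POST_DEAD(cpu) is delivered before the next CPU goes down. */
int hp_down_all_inline(int ncpus)
{
	struct hp_subsys s = { false, 0 };

	assert(ncpus >= 0);
	for (int cpu = 0; cpu < ncpus; cpu++) {
		hp_notify(&s, HP_DOWN_PREPARE);
		hp_notify(&s, HP_POST_DEAD);
	}
	return s.double_acquires;
}

/* Proposed ordering: every POST_DEAD is deferred until all CPUs are down. */
int hp_down_all_deferred(int ncpus)
{
	struct hp_subsys s = { false, 0 };

	assert(ncpus >= 0);
	for (int cpu = 0; cpu < ncpus; cpu++)
		hp_notify(&s, HP_DOWN_PREPARE);
	for (int cpu = 0; cpu < ncpus; cpu++)
		hp_notify(&s, HP_POST_DEAD);
	return s.double_acquires;
}
```

With the inline ordering the counter stays at zero. With the deferred
ordering every DOWN_PREPARE after the first finds the lock still held,
which is exactly the case a real callback of this shape could not survive.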

Hmm. Now that workqueues do not use CPU_POST_DEAD, it has only 2 users,
mce_cpu_callback() and cpufreq_cpu_callback(), and the 1st one even
ignores this notification if FROZEN. So yes, probably this is fine, but it
needs an ack from the cpufreq maintainers (cc'ed), for example to ensure
that it is fine to call __cpufreq_remove_dev_prepare() twice without
_finish() in between.

Oleg.
