RE: [PATCH v9 12/12] x86, topology: Debug CPU00 hotplug

From: Yu, Fenghua
Date: Thu Oct 18 2012 - 21:07:56 EST


> From: Srivatsa S. Bhat [mailto:srivatsa.bhat@xxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, October 17, 2012 11:43 PM
> On 10/17/2012 05:36 AM, Yu, Fenghua wrote:
> >>> +#ifdef CONFIG_DEBUG_HOTPLUG_CPU0
> > > + case PM_RESTORE_PREPARE:
> >>> + /*
> >>> + * When system resumes from hibernation, online CPU0
> >> because
> >>> + * 1. it's required for resume and
> >>> + * 2. the CPU was online before hibernation
> >>> + */
> >>> + if (!cpu_online(0))
> >>> + _debug_hotplug_cpu(0, 1);
> >>
> >> This won't work. CPU hotplug is disabled by cpu_hotplug_pm_callback()
> >> during
> >> PM_HIBERNATION_PREPARE and is enabled back only during
> >> PM_POST_HIBERNATION.
> >>
> >> So calls to cpu_up() or cpu_down() will fail in the meantime.
> >
> > The code works just fine.
> >
> > You are talking about hibernation/suspend phase. This piece of
> > code is all about restore phase. Of course you can call cpu_up()
> > and cpu_down() during restore phase.
> >
> > So there is no problem here.
> >
>
> I looked into it again, and actually you are right, it'll work fine;
> but the reason *why* it will work fine is quite subtle.
>
> Note that there is a difference between calling cpu_up() or cpu_down()
> vs calling _cpu_up() or _cpu_down(). The former set of calls will
> return
> -EBUSY if the 'cpu_hotplug_disabled' flag is set. The latter set of
> calls
> don't care about that flag and hence proceed irrespective of the state
> of
> that flag. The former set of calls are used when you echo 0/1 to the
> online
> file from userspace. Whereas the latter set of calls are used by the
> disable/enable_nonboot_cpus() functions. This is how you can disable
> cpu hotplug by userspace, but continue to do cpu hotplug yourself from
> within the kernel in the suspend/hibernation/restore phases.
>
> Now, coming to the notifications part, earlier, I was referring to the
> entire hibernate->restore phase, not just the hibernate phase. The
> relevant
> notifications that are sent out during this entire (successful)
> sequence are:
>
> -> PM_HIBERNATION_PREPARE
> -> PM_RESTORE_PREPARE
> -> PM_POST_HIBERNATION
>
> cpu_hotplug_pm_callback() will set the cpu_hotplug_disabled flag during
> PM_HIBERNATION_PREPARE and clears it only during PM_POST_HIBERNATION.
> So in-between, if you invoke cpu_up() or cpu_down(), it should return -
> EBUSY.
>
> Then how is it that this patch can work fine? The answer is, during the
> restore phase, you are not yet in the kernel-to-be-restored. IOW, that
> cpu_hotplug_disabled flag is clear in that kernel. That's why cpu_up()
> and
> cpu_down() will continue to work at that point. But once you resume
> execution
> from the saved hibernation image, you won't be able to do cpu_up() and
> cpu_down() until that flag is cleared by cpu_hotplug_pm_callback()
> during
> PM_POST_HIBERNATION.
>
> All in all, its tricky, and luckily this patch uses cpu_up()/cpu_down()
> at the right moment...
>
> I don't actually see why cpu_hotplug_pm_callback() should not clear the
> cpu_hotplug_disabled flag during PM_RESTORE_PREPARE itself.. if we do
> that,
> we don't have to be *this* careful about -when- we can invoke -which-
> function..

Yes, this is tricky because we have to offline CPU0 in debug mode and online
CPU0 in resume. So we need to do online and offline in this short period
of boot and resume.

I explained this tick in the comment:
+ * When a resume really happens, this code won't be called.
+ *
+ * This code is called only when user space hibernation software
+ * prepares for snapshot device during boot time. So we just
+ * call _debug_hotplug_cpu() to restore to CPU0's state prior to
+ * preparing the snapshot device.
......

Thanks.

-Fenghua
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/