Re: [PATCH v2 0/3] Freezer, CPU hotplug, x86 Microcode: Fix taskfreezing failures

From: tj@xxxxxxxxxx
Date: Mon Oct 10 2011 - 14:08:59 EST


Hello,

On Mon, Oct 10, 2011 at 07:53:36PM +0200, Borislav Petkov wrote:
> On Mon, Oct 10, 2011 at 01:30:34PM -0400, Srivatsa S. Bhat wrote:
> > But I do agree that offlining and onlining CPUs while suspending might
> > not seem all that useful or even wise, but like I said, it was designed to
> > bring out such problematic race conditions.
> >
> > So, in the interest of making the important components involved in
> > suspend/resume call path (namely cpu hotplug) more robust and stable,
> > I think it makes sense to fix any issue we hit (at least when we
> > practically hit it and it is proved that such a scenario is no longer
> > hypothetical).
> >
> > For that, we can either go with the simple one-line fix that I posted
> > earlier (which has got another motivation now, thanks to Borislav) or
> > with this elaborate solution, whichever seems better/worthwhile.
> >
> > If it is still strongly felt that this "bug" is not worth fixing with such
> > mutual exclusion schemes, it will still get solved anyway by applying that
> > one-line patch.
>
> Well, this is easy: the oneliner is needed anyway for removing
> unnecessary ucode reloading and since it fixes your test cases _and_ is
> _simpler_, the whole deal is a no brainer.

Maybe I'm confused, but is that patch correct for the actual CPU hotplug
case? If not, what's the point in doing that? What are we gonna do
when, six months from now, people come up with "CPU hotplug fails to
load new microcode for the new CPU"? The invalidation code is there
for a reason. The CPU is going away and the microcode tied to the CPU
should go away too. If somebody is sure that microcode doesn't need to
be changed once loaded, then all's good and dandy, but that's not the
case here, right?

If you want to optimize away microcode unloading during
suspend/resume, the RTTD (right thing to do) is to revalidate / reload
during CPU_ONLINE as necessary.
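
To make the idea concrete, here is a rough userspace toy model, not the
actual microcode driver. All names (hp_event, ucode_cache, cpu_state,
ucode_fetch, ucode_hotplug_event) are made up for illustration, and the
"signature" check is only an assumed stand-in for whatever validation
the real driver would need. The cached image survives the offline
event, but it is only reused at online time if it still matches the
CPU that actually shows up; otherwise it is reloaded.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical event codes, loosely modeled on hotplug notifications. */
enum hp_event { HP_CPU_ONLINE, HP_CPU_DEAD };

/* Hypothetical cached microcode image for one CPU slot. */
struct ucode_cache {
	bool valid;
	unsigned int sig;	/* CPU signature the image was loaded for */
};

struct cpu_state {
	unsigned int sig;	/* signature of the CPU currently in this slot */
	struct ucode_cache cache;
	unsigned int loads;	/* how often the expensive fetch path ran */
};

static struct cpu_state cpus[NR_CPUS];

/* Stand-in for the expensive path (e.g. requesting the image again). */
static void ucode_fetch(unsigned int cpu)
{
	cpus[cpu].cache.valid = true;
	cpus[cpu].cache.sig = cpus[cpu].sig;
	cpus[cpu].loads++;
}

/*
 * Returns true if the cached image was revalidated and reused,
 * false if nothing was reused (offline event or fresh fetch).
 */
static bool ucode_hotplug_event(unsigned int cpu, enum hp_event ev)
{
	assert(cpu < NR_CPUS);
	struct cpu_state *c = &cpus[cpu];

	switch (ev) {
	case HP_CPU_DEAD:
		/* Keep the cache; the decision is deferred to online time. */
		return false;
	case HP_CPU_ONLINE:
		/* Revalidate: reuse only if the image matches this CPU. */
		if (c->cache.valid && c->cache.sig == c->sig)
			return true;
		ucode_fetch(cpu);
		return false;
	}
	return false;
}
```

With this shape, a suspend/resume cycle (same CPU comes back) skips
the fetch, while a real hotplug that puts a different CPU in the slot
fails revalidation and triggers a reload.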

If this use case doesn't really matter too much to anyone, just
leaving it alone would be better than adding a band-aid which can lead
to very obscure issues down the road (oooh, that microcode shouldn't
have been loaded onto that CPU).

Thanks.

--
tejun