Re: [Patch v3 Part2 3/9] x86/microcode/intel: Fix collect_cpu_info() to reflect current microcode

From: Ashok Raj
Date: Wed Feb 01 2023 - 10:15:21 EST


On Wed, Feb 01, 2023 at 01:53:32PM +0100, Borislav Petkov wrote:
> On Tue, Jan 31, 2023 at 10:43:23PM +0000, Luck, Tony wrote:
> > In an ideal world yes. But what if T1 arrives here and tries to do the
> > update while T0, which has returned out of the microcode update
> > code and could be doing anything, happen to be doing WRMSR(some MSR
> > that the ucode update is tinkering with).
> >
> > Now T0 explodes (not literally, I hope!) but does something crazy because
> > it was in the middle of some microcode flow that got updated between two
> > operations.
>
> So first of all, I'm wondering whether the scenario you're chasing is
> something completely hypothetical or you're actually thinking of
> something concrete which has actually happened or there's high potential
> for it.
>
> In that case, that late patching sync algorithm would need to be made
> more robust to handle cases like that.

That's correct. But fundamentally we sent the sibling down the
apply_microcode() path just to make sure the per-thread info is updated.

It appears the code relies on a side effect of that path, namely that the
revision gets updated, even though we don't actually intend to perform a
wrmsr on the sibling in the normal case where the primary completes the
update.

If the purpose is only to update the revision, collect_cpu_info() seems
more appropriate, and it doesn't have any of the implied issues of going
through a wrmsr flow. It's not broken today, but the code isn't
future-proof. Calling only the revision update keeps those questions at
bay.
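
To illustrate the distinction, here is a rough userspace sketch, not the
actual kernel code. All names in it (core_model, thread_model,
primary_apply(), sibling_refresh()) are made up for illustration, and the
"hardware" revision is just a field shared by both threads of a core. The
point is only that the sibling can bring its per-thread info up to date
with a read, without ever entering the write path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the loaded microcode revision is per core. */
struct core_model {
	uint32_t hw_rev;          /* revision the "hardware" is running */
	unsigned int wrmsr_count; /* number of update writes issued */
};

/* Hypothetical per-thread bookkeeping, shared core underneath. */
struct thread_model {
	struct core_model *core;
	uint32_t cached_rev;      /* what this thread last recorded */
};

/* Primary thread: the only one that takes the write (wrmsr-like) path. */
static void primary_apply(struct thread_model *t, uint32_t new_rev)
{
	assert(t && t->core);
	if (new_rev > t->core->hw_rev) {
		t->core->hw_rev = new_rev;
		t->core->wrmsr_count++;
	}
	t->cached_rev = t->core->hw_rev;
}

/* Sibling thread: read-only refresh of the cached revision. */
static void sibling_refresh(struct thread_model *t)
{
	assert(t && t->core);
	t->cached_rev = t->core->hw_rev;
}
```

With two thread_model instances pointing at the same core_model, calling
primary_apply() on T0 and sibling_refresh() on T1 leaves both cached
revisions equal while wrmsr_count stays at 1. If the primary's update does
not take effect, the sibling just records the old revision and never
attempts a write of its own.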

I think this is what Thomas was suggesting we clean up in his comments.

>
> Because from what I'm reading above, this doesn't sound like the
> reporting is wrong only but more like, if T0 fails the update and T1
> gets to do that update for a change, then crap can happen.
>
> Which means, our update dance cannot handle that case properly.
>

It doesn't need to if we don't do an apply_microcode() for the sibling.

Cheers,
Ashok