Re: [RFC][PATCH 2/2] x86: add extra serialization for non-serializing MSRs

From: Peter Zijlstra
Date: Fri Feb 05 2021 - 19:28:02 EST


On Fri, Feb 05, 2021 at 11:02:10AM +0100, Peter Zijlstra wrote:

> And presumably it is still allowed to do that when we write it like:
>
> mov $1, ([x])
> mfence
> wrmsr
>
> because, mfence only has dependencies to memops and (fast) wrmsr is not
> a memop.
>
> Which then brings us to:
>
> mov $1, ([x])
> mfence
> lfence
> wrmsr
>
> In this case, the lfence acts like the newly minted ifence (see
> spectre), and will block execution of (any) later instructions until
> completion of all prior instructions. This, and only this ensures the
> wrmsr happens after the mfence, which in turn ensures the store to x is
> globally visible.

Note that I have a few questions of my own, too.

Supposedly MFENCE is our LOAD/STORE completion fence of choice, and this
obviously works with MMIO, since those are memops. The MMIO write of the
buffer address to the DMA device must happen after completion of the
previous data writes, etc.

But what about the legacy IN/OUT ports? Are those memops? If not, we
might need additional LFENCEs there too.

Also, would SFENCE+LFENCE be sufficient for the WRMSR case? AFAIU SFENCE
is the store completion barrier and should be strong enough to flush all
store buffers. If not, why not?
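
For reference, the two candidate sequences can be sketched in C with
inline asm as below. This is only an illustration under assumptions:
fake_wrmsr() and FAKE_MSR are hypothetical stand-ins (the real WRMSR is
privileged), and because the stub is an ordinary C function it cannot
show the reordering in question. It only pins down which fences go
between the store and the MSR write in each variant. On non-x86 builds
the fences fall back to a generic full barrier.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_MSR 0u               /* hypothetical MSR index, illustration only */

static volatile uint64_t x;       /* the store that must be globally visible */
static uint64_t seen_by_msr;      /* value of x observed at the fake MSR write */

/* Hypothetical stand-in for a non-serializing WRMSR; it just samples x. */
static inline void fake_wrmsr(uint32_t msr, uint64_t val)
{
	(void)msr;
	(void)val;
	seen_by_msr = x;
}

/* Variant A: MFENCE completes prior memops, LFENCE holds back later insns. */
static inline void fence_mfence_lfence(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mfence; lfence" ::: "memory");
#else
	__sync_synchronize();     /* assumption: generic full barrier elsewhere */
#endif
}

/* Variant B: the SFENCE+LFENCE pair asked about above. */
static inline void fence_sfence_lfence(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("sfence; lfence" ::: "memory");
#else
	__sync_synchronize();     /* assumption: generic full barrier elsewhere */
#endif
}

/* mov $1, ([x]); mfence; lfence; wrmsr */
static uint64_t send_mfence_lfence(void)
{
	x = 0;
	seen_by_msr = 0;
	x = 1;
	fence_mfence_lfence();
	fake_wrmsr(FAKE_MSR, 0);
	return seen_by_msr;
}

/* mov $1, ([x]); sfence; lfence; wrmsr */
static uint64_t send_sfence_lfence(void)
{
	x = 0;
	seen_by_msr = 0;
	x = 1;
	fence_sfence_lfence();
	fake_wrmsr(FAKE_MSR, 0);
	return seen_by_msr;
}
```

Both helpers return the value of x sampled at the point of the fake MSR
write. Whether variant B is actually sufficient on real hardware is
exactly the open question; the sketch does not answer it.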