Re: [tip:x86/pti] x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC

From: Andrew Cooper
Date: Mon Jan 08 2018 - 11:33:36 EST


On 08/01/18 14:47, Tom Lendacky wrote:
> On 1/8/2018 5:10 AM, Thomas Gleixner wrote:
>> On Mon, 8 Jan 2018, Andrew Cooper wrote:
>>
>>> On 08/01/18 10:08, Thomas Gleixner wrote:
>>>> On Sat, 6 Jan 2018, tip-bot for Tom Lendacky wrote:
>>>>
>>>>> Commit-ID: 0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>>> Gitweb: https://git.kernel.org/tip/0bf17c102177d5da9363bf8b1e4704b9996d5079
>>>>> Author: Tom Lendacky <thomas.lendacky@xxxxxxx>
>>>>> AuthorDate: Fri, 5 Jan 2018 10:07:56 -0600
>>>>> Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>>>>> CommitDate: Sat, 6 Jan 2018 21:57:40 +0100
>>>>>
>>>>> x86/cpu/AMD: Use LFENCE_RDTSC instead of MFENCE_RDTSC
>>>>>
>>>>> With LFENCE now a serializing instruction, set the LFENCE_RDTSC
>>>>> feature since the LFENCE instruction has less overhead than the
>>>>> MFENCE instruction.
>>>> Second thoughts on that. As pointed out by someone in one of the insanely
>>>> long threads:
>>>>
>>>> What happens if the kernel runs as a guest and
>>>>
>>>> - the hypervisor did not set the LFENCE to serializing on the host
>>>>
>>>> - the hypervisor does not allow writing MSR_AMD64_DE_CFG
>>>>
>>>> That would bring the guest into a pretty bad state or am I missing
>>>> something essential here?
>>> What I did in Xen was to attempt to set it, then read it back and see.
>>> If LFENCE still isn't serialising, using retpoline is the only available
>>> mitigation.
>>>
>>> My understanding from the folk at AMD is that retpoline is safe to use,
>>> but has higher overhead than the LFENCE approach.
> Correct, the retpoline will work, it just takes more cycles.
>
>> That still does not help vs. rdtsc_ordered() and LFENCE_RDTSC ...
> Ok, I can add the read-back check before setting the feature flag(s).
>
> But... what about the case where the guest is a different family than the
> hypervisor? If we're on, say, a Fam15h hypervisor but the guest is started
> as a Fam0fh guest where the MSR doesn't exist and LFENCE is supposed to be
> serializing? I'll have to do a rdmsr_safe() and only set the flag(s) if I
> can successfully read the MSR back and validate the bit.

If your hypervisor is lying to you about the primary family, then all
bets are off. I don't expect there will be any production systems doing
this.

The user gets to keep both pieces if they've decided that this was a
good thing to try.
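
For reference, the "attempt to set it, then read it back" check discussed
above could be sketched roughly as below. This is a standalone illustration
with a mocked MSR, not the Xen or Linux code: struct mock_msr, the
mock_rdmsr_safe()/mock_wrmsr_safe() helpers, lfence_is_serializing() and
the bit position used for DE_CFG_LFENCE_SERIALIZE are all assumptions made
up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placeholder for the "LFENCE is serializing" control bit. */
#define DE_CFG_LFENCE_SERIALIZE (1ULL << 1)

/* Mock MSR standing in for MSR_AMD64_DE_CFG as seen by a guest. */
struct mock_msr {
	uint64_t value;
	bool readable;	/* false: accesses fault (MSR not exposed to the guest) */
	bool writable;	/* false: hypervisor silently discards writes */
};

/* Mimics the rdmsr_safe() convention: 0 on success, non-zero on fault. */
static int mock_rdmsr_safe(const struct mock_msr *msr, uint64_t *val)
{
	if (!msr->readable)
		return -1;
	*val = msr->value;
	return 0;
}

/* Write that may fault, or may be dropped without any error. */
static int mock_wrmsr_safe(struct mock_msr *msr, uint64_t val)
{
	if (!msr->readable)
		return -1;
	if (msr->writable)
		msr->value = val;
	return 0;
}

/*
 * Try to make LFENCE serializing, then trust only what reads back.
 * Returns true only if the bit is observed set after the attempt.
 */
static bool lfence_is_serializing(struct mock_msr *msr)
{
	uint64_t val;

	if (mock_rdmsr_safe(msr, &val))
		return false;	/* cannot read the MSR: do not set the flag */

	if (!(val & DE_CFG_LFENCE_SERIALIZE)) {
		/* The write result is ignored; the read-back decides. */
		mock_wrmsr_safe(msr, val | DE_CFG_LFENCE_SERIALIZE);
		if (mock_rdmsr_safe(msr, &val))
			return false;
	}

	return (val & DE_CFG_LFENCE_SERIALIZE) != 0;
}
```

A caller would set the LFENCE_RDTSC feature flag only when
lfence_is_serializing() returns true, and otherwise fall back to the
MFENCE variant (and to retpoline for the mitigation side).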

~Andrew