Re: [PATCH 1/5] KVM: x86: Remove VMX support for virtualizing guest MTRR memtypes

From: Dongli Zhang
Date: Thu Mar 14 2024 - 06:31:41 EST




On 3/12/24 10:08, Sean Christopherson wrote:
> On Mon, Mar 11, 2024, Dongli Zhang wrote:
>>
>>
>> On 3/8/24 17:09, Sean Christopherson wrote:
>>> Remove KVM's support for virtualizing guest MTRR memtypes, as full MTRR
>>> adds no value, negatively impacts guest performance, and is a maintenance
>>> burden due to its complexity and oddities.
>>>
>>> KVM's approach to virtualizing MTRRs makes no sense, at all. KVM *only*
>>> honors guest MTRR memtypes if EPT is enabled *and* the guest has a device
>>> that may perform non-coherent DMA access. From a hardware virtualization
>>> perspective of guest MTRRs, there is _nothing_ special about EPT. Legacy
>>> shadow paging doesn't magically account for guest MTRRs, nor does NPT.
>>
>> [snip]
>>
>>>
>>> -bool __kvm_mmu_honors_guest_mtrrs(bool vm_has_noncoherent_dma)
>>> +bool kvm_mmu_may_ignore_guest_pat(void)
>>>  {
>>>  	/*
>>> -	 * If host MTRRs are ignored (shadow_memtype_mask is non-zero), and the
>>> -	 * VM has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is
>>> -	 * to honor the memtype from the guest's MTRRs so that guest accesses
>>> -	 * to memory that is DMA'd aren't cached against the guest's wishes.
>>> -	 *
>>> -	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
>>> -	 * e.g. KVM will force UC memtype for host MMIO.
>>> +	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
>>> +	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
>>> +	 * honor the memtype from the guest's PAT so that guest accesses to
>>> +	 * memory that is DMA'd aren't cached against the guest's wishes. As a
>>> +	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
>>> +	 * KVM _always_ ignores guest PAT (when EPT is enabled).
>>>  	 */
>>> -	return vm_has_noncoherent_dma && shadow_memtype_mask;
>>> +	return shadow_memtype_mask;
>>>  }
>>>
>>
>> Any special reason to use the naming 'may_ignore_guest_pat', but not
>> 'may_honor_guest_pat'?
>
> Because that name (after this series) would be either misleading or outright wrong.
> If KVM returns true from the helper based solely on shadow_memtype_mask, then it's
> misleading because KVM will *always* honor guest PAT for such CPUs. I.e. that
> name would yield this misleading statement:
>
> If the CPU supports self-snoop, KVM may honor guest PAT.
>
> If KVM returns true iff self-snoop is NOT available (as proposed in this series),
> then it's outright wrong as KVM would return false, i.e. would make this incorrect
> statement:
>
> If the CPU supports self-snoop, KVM never honors guest PAT.
>
> Because saying that KVM may not or cannot do something is the same as saying
> that KVM will never do that thing.
>
> And because the EPT flag is "ignore guest PAT", not "honor guest PAT", but that's
> as much coincidence as it is anything else.
>
>> Since it is also controlled by other cases, e.g., kvm_arch_has_noncoherent_dma()
>> at vmx_get_mt_mask(), it can be 'may_honor_guest_pat' too?
>>
>> Therefore, why not directly use 'shadow_memtype_mask' (without the API), or some
>> naming like "ept_enabled_for_hardware".
>
> Again, after this series, KVM will *always* honor guest PAT for CPUs with self-snoop,
> i.e. KVM will *never* ignore guest PAT. But for CPUs without self-snoop (or with
> errata), KVM conditionally honors/ignores guest PAT.
>
>> Even with the code from PATCH 5/5, we still have high chance that VM has
>> non-coherent DMA?
>
> I don't follow. On CPUs with self-snoop, whether or not the VM has non-coherent
> DMA (from VFIO!) is irrelevant. If the CPU has self-snoop, then KVM can safely
> honor guest PAT at all times.


Thank you very much for the explanation.

According to my understanding of the explanation (after this series):

1. When static_cpu_has(X86_FEATURE_SELFSNOOP) == true, KVM honors guest PAT
100% of the time.

2. When static_cpu_has(X86_FEATURE_SELFSNOOP) == false (and
shadow_memtype_mask is non-zero), KVM honors guest PAT only conditionally
(say 50%, depending on whether there is non-coherent DMA), so it no longer
honors guest PAT 100% of the time.

Because KVM no longer honors guest PAT 100% of the time in that case, the
trend (from 100% to 50%) is toward "ignore guest PAT", hence the name:
kvm_mmu_may_ignore_guest_pat().
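
To double-check my reading, here is a minimal standalone sketch of that logic.
It is not kernel code: the model_* names are made up for illustration, plain
bools stand in for static_cpu_has(X86_FEATURE_SELFSNOOP), shadow_memtype_mask
and kvm_arch_has_noncoherent_dma(), EPT is assumed to be enabled in the second
helper, and the "CPUs with errata" case is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the helper after the whole series: KVM "may ignore"
 * guest PAT only when EPT is enabled and the CPU lacks self-snoop.
 */
bool model_may_ignore_guest_pat(bool has_selfsnoop, bool ept_enabled)
{
	return !has_selfsnoop && ept_enabled;
}

/*
 * Hypothetical model of the per-VM outcome, assuming EPT is enabled:
 * with self-snoop guest PAT is always honored, otherwise it is honored
 * only if the VM has non-coherent DMA.
 */
bool model_ept_honors_guest_pat(bool has_selfsnoop, bool vm_has_noncoherent_dma)
{
	if (!model_may_ignore_guest_pat(has_selfsnoop, true))
		return true;

	return vm_has_noncoherent_dma;
}

/* Walks through cases 1 and 2 from the summary above. */
void model_self_check(void)
{
	/* Case 1: self-snoop, guest PAT honored with or without such DMA. */
	assert(model_ept_honors_guest_pat(true, false));
	assert(model_ept_honors_guest_pat(true, true));

	/* Case 2: no self-snoop, the outcome depends on non-coherent DMA. */
	assert(!model_ept_honors_guest_pat(false, false));
	assert(model_ept_honors_guest_pat(false, true));
}
```

If the two points above are right, model_self_check() passes silently. Only in
case 2 does the helper return true, which matches the "may ignore" naming.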

Dongli Zhang