RE: [RFC PATCH] KVM: Introduce KVM VIRTIO device

From: Tian, Kevin
Date: Fri Dec 15 2023 - 01:24:07 EST


> From: Zhao, Yan Y <yan.y.zhao@xxxxxxxxx>
> Sent: Thursday, December 14, 2023 6:35 PM
>
> - For host non-MMIO pages,
> * virtio guest frontend and host backend driver should be synced to use
> the same memory type to map a buffer. Otherwise, there will be
> potential problem for incorrect memory data. But this will only impact
> the buggy guest alone.
> * for live migration,
> as QEMU will read all guest memory during live migration, page aliasing
> could happen.
> Current thinking is to disable live migration if a virtio device has
> indicated its noncoherent state.
> As a follow-up, we can discuss other solutions. e.g.
> (a) switching back to coherent path before starting live migration.

both guest and host switching to coherent, or host-only?

host-only certainly is problematic if the guest is still using the
non-coherent path.

on the other hand I'm not sure whether the host/guest gfx stack is
capable of switching between the coherent and non-coherent paths on
the fly while a buffer is being rendered.

> (b) read/write of guest memory with clflush during live migration.

write is irrelevant as it's only done in the resume path where the
guest is not running.

>
> Implementation Consideration
> ===
> There is a previous series [1] from google to serve the same purpose to
> let KVM be aware of virtio GPU's noncoherent DMA status. That series
> requires a new memslot flag, and special memslots in user space.
>
> We don't choose to use memslot flag to request honoring guest memory
> type.

a memslot flag has the potential to restrict the impact, e.g. when using
clflush-before-read in migration? Of course the implication is that KVM
honors the guest type only for the selected slot, instead of applying it
to the entire guest memory as in the previous series (which chose that
way perhaps because vmx_get_mt_mask() is in a perf-critical path, hence
it's not good to check a memslot flag there?)
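
to make the trade-off concrete, here is a rough sketch using simplified
stand-in structures. This is not the real KVM code: struct slot, struct vm,
SLOT_FLAG_HONOR_GUEST_MT and both helpers are made up for illustration. It
only shows that a per-slot policy needs a slot lookup per gfn, while a
VM-wide policy is a single field read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and structures; simplified stand-ins, not KVM's own. */
#define SLOT_FLAG_HONOR_GUEST_MT (1u << 0)

struct slot {
	uint64_t base_gfn;	/* first guest frame number covered */
	uint64_t npages;	/* number of pages in the slot */
	uint32_t flags;
};

struct vm {
	const struct slot *slots;
	size_t nr_slots;
	bool noncoherent_dma;	/* VM-wide switch */
};

/* Per-slot policy: honor the guest type only where the slot asks for it.
 * Costs a slot lookup on every call (a linear scan in this sketch). */
static bool honor_guest_mt_per_slot(const struct vm *vm, uint64_t gfn)
{
	for (size_t i = 0; i < vm->nr_slots; i++) {
		const struct slot *s = &vm->slots[i];

		if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
			return (s->flags & SLOT_FLAG_HONOR_GUEST_MT) != 0;
	}
	return false;	/* gfn not backed by any slot */
}

/* VM-wide policy: one field read, but it applies to all guest memory. */
static bool honor_guest_mt_vm_wide(const struct vm *vm, uint64_t gfn)
{
	(void)gfn;
	return vm->noncoherent_dma;
}
```

with the per-slot variant, a migration path would only need
clflush-before-read for pages in flagged slots; with the VM-wide variant
every page is potentially affected.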

> Instead we hope to make the honoring request to be explicit (not tied to a
> memslot flag). This is because once guest memory type is honored, not only
> memory used by guest virtio device, but all guest memory is facing page
> aliasing issue potentially. KVM needs a generic solution to take care of
> page aliasing issue rather than counting on memory type of a special
> memslot being aligned in host and guest.
> (we can discuss what a generic solution to handle page aliasing issue will
> look like in later follow-up series).
>
> On the other hand, we choose to introduce a KVM virtio device rather than
> just provide an ioctl to wrap kvm_arch_[un]register_noncoherent_dma()
> directly, which is based on considerations that

I wonder whether this is over-engineered for the purpose.

why not just introduce a KVM_CAP and allow the VMM to enable it?
KVM doesn't need to know the exact source of the requirement...
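
from the VMM side that could be as small as the sketch below. It uses the
existing KVM_ENABLE_CAP ioctl and struct kvm_enable_cap from <linux/kvm.h>;
the capability number, its name, the use of args[0] as an on/off switch and
both helper names are assumptions for illustration only. No such capability
is defined today.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical capability number, chosen only for this sketch. */
#define KVM_CAP_HYPOTHETICAL_NONCOHERENT_DMA 0x7fff0001u

/* Build the request. args[0] as enable/disable is an assumed convention. */
static void fill_noncoherent_cap(struct kvm_enable_cap *cap, int enable)
{
	memset(cap, 0, sizeof(*cap));
	cap->cap = KVM_CAP_HYPOTHETICAL_NONCOHERENT_DMA;
	cap->args[0] = enable ? 1 : 0;
}

/* Issue the request on a VM fd.
 * Returns 0 on success, -1 with errno set on failure. */
static int vm_set_noncoherent_dma(int vm_fd, int enable)
{
	struct kvm_enable_cap cap;

	fill_noncoherent_cap(&cap, enable);
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

the VMM would call vm_set_noncoherent_dma(vm_fd, 1) when any device
(virtio-gpu or otherwise) reports non-coherent DMA, and KVM would not
need to care which device it was.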