Re: [PATCH v2] KVM: Add coalesced PIO support

From: Radim Krčmář
Date: Wed Jul 18 2018 - 10:43:19 EST


2018-07-12 09:59+0800, Wanpeng Li:
> From: Peng Hao <peng.hao2@xxxxxxxxxx>
>
> Windows I/O, such as the real-time clock. The address register (port
> 0x70 in the RTC case) can use coalesced I/O, cutting the number of
> userspace exits by half when reading or writing the RTC.
>
> The guest accesses the RTC like this: it writes the register index to
> port 0x70, then reads or writes the data at port 0x71. The write to port
> 0x70 only selects the index and does nothing else, so we can use the
> coalesced MMIO mechanism for it and reduce the time spent in VM exits.
>
> In our environment, with 12 Windows guests running on a Skylake server:
>
> Before patch:
>
> IO Port Access    Samples    Samples%    Time%    Avg time
> 0x70:POUT           20675      46.04%   92.72%    67.15us ( +- 7.93% )
>
> After patch:
>
> IO Port Access    Samples    Samples%    Time%    Avg time
> 0x70:POUT           17509      45.42%   42.08%     6.37us ( +- 20.37% )
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Cc: Eduardo Habkost <ehabkost@xxxxxxxxxx>
> Cc: Peng Hao <peng.hao2@xxxxxxxxxx>
> Signed-off-by: Peng Hao <peng.hao2@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> ---
> v1 -> v2:
> * add the original author
>
> Documentation/virtual/kvm/00-INDEX | 2 ++
> Documentation/virtual/kvm/api.txt | 7 +++++++
> Documentation/virtual/kvm/coalesced-io.txt | 17 +++++++++++++++++
> include/uapi/linux/kvm.h | 5 +++--
> virt/kvm/coalesced_mmio.c | 16 +++++++++++++---
> virt/kvm/kvm_main.c | 2 ++
> 6 files changed, 44 insertions(+), 5 deletions(-)
> create mode 100644 Documentation/virtual/kvm/coalesced-io.txt
>
> diff --git a/Documentation/virtual/kvm/00-INDEX b/Documentation/virtual/kvm/00-INDEX
> index 3492458..4160620 100644
> --- a/Documentation/virtual/kvm/00-INDEX
> +++ b/Documentation/virtual/kvm/00-INDEX
> @@ -9,6 +9,8 @@ arm
> - internal ABI between the kernel and HYP (for arm/arm64)
> cpuid.txt
> - KVM-specific cpuid leaves (x86).
> +coalesced-io.txt
> + - Coalesced MMIO and coalesced PIO.
> devices/
> - KVM_CAP_DEVICE_CTRL userspace API.
> halt-polling.txt
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index d10944e..4190796 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4618,3 +4618,10 @@ This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
> hypercalls:
> HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
> HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
> +
> +8.19 KVM_CAP_COALESCED_PIO
> +
> +Architectures: x86, s390, ppc, arm64
> +
> +This capability indicates that writes to a coalesced-pio region are not
> +reported to userspace until the next non-coalesced pio is issued.
> diff --git a/Documentation/virtual/kvm/coalesced-io.txt b/Documentation/virtual/kvm/coalesced-io.txt
> new file mode 100644
> index 0000000..4a96eaf
> --- /dev/null
> +++ b/Documentation/virtual/kvm/coalesced-io.txt
> @@ -0,0 +1,17 @@
> +----
> +Coalesced MMIO and coalesced PIO can be used to optimize writes to
> +simple device registers. Writes to a coalesced-I/O region are not
> +reported to userspace until the next non-coalesced I/O is issued,
> +in a similar fashion to write combining hardware. In KVM, coalesced
> +writes are handled in the kernel without exits to userspace, and
> +are thus several times faster.
> +
> +Examples of devices that can benefit from coalesced I/O include:
> +
> +- devices whose memory is accessed with many consecutive writes, for
> + example the EGA/VGA video RAM.
> +
> +- windows I/O, such as the real-time clock. The address register (port
> + 0x70 in the RTC case) can use coalesced I/O, cutting the number of
> + userspace exits by half when reading or writing the RTC.
> +----
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index b6270a3..9cc56d3 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -420,13 +420,13 @@ struct kvm_run {
> struct kvm_coalesced_mmio_zone {
> __u64 addr;
> __u32 size;
> - __u32 pad;
> + __u32 pio;

Paolo, do you think we can rename the field without breaking userspace
builds?

> };
>
> struct kvm_coalesced_mmio {
> __u64 phys_addr;
> __u32 len;
> - __u32 pad;
> + __u32 pio;
> __u8 data[8];
> };
>
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> @@ -149,8 +150,12 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
> dev->zone = *zone;
>
> mutex_lock(&kvm->slots_lock);
> - ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
> - zone->size, &dev->dev);
> + if (zone->pio)
> + ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, zone->addr,
> + zone->size, &dev->dev);
> + else
> + ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
> + zone->size, &dev->dev);

This would be more readable as

	ret = kvm_io_bus_register_dev(kvm, zone->pio ? KVM_PIO_BUS : KVM_MMIO_BUS,
				      zone->addr, zone->size, &dev->dev);

thanks.