Re: [PATCH v7 2/2] misc: Add a mechanism to detect stalls on guest vCPUs

From: Marc Zyngier
Date: Tue Jun 21 2022 - 05:24:39 EST


On Tue, 21 Jun 2022 09:54:35 +0100,
Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 21, 2022 at 09:44:35AM +0100, Marc Zyngier wrote:
> > On 2022-06-21 09:27, Greg Kroah-Hartman wrote:
> > > On Tue, Jun 21, 2022 at 08:03:09AM +0000, Sebastian Ene wrote:
> > > > This driver creates per-cpu hrtimers which are required to do the
> > > > periodic 'pet' operation. On a conventional watchdog-core driver, the
> > > > userspace is responsible for delivering the 'pet' events by writing to
> > > > the particular /dev/watchdogN node. In this case we require a strong
> > > > > thread affinity to be able to account for lost time on a per-vCPU basis.
> > > >
> > > > > This part of the driver is the 'frontend' which is responsible for
> > > > > delivering the periodic 'pet' events, configuring the virtual peripheral
> > > > > and listening for cpu hotplug events. The other part of the driver
> > > > > handles the peripheral emulation and this part accounts for lost time by
> > > > > looking at the /proc/{}/task/{}/stat entries and is located here:
> > > > https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3548817
> > > >
> > > > Signed-off-by: Sebastian Ene <sebastianene@xxxxxxxxxx>
> > > > ---
> > > > drivers/misc/Kconfig | 12 ++
> > > > drivers/misc/Makefile | 1 +
> > > > > drivers/misc/vcpu_stall_detector.c | 222 +++++++++++++++++++++++++++++
> > > > 3 files changed, 235 insertions(+)
> > > > create mode 100644 drivers/misc/vcpu_stall_detector.c
> > > >
> > > > diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> > > > index 41d2bb0ae23a..e15c85d74c4b 100644
> > > > --- a/drivers/misc/Kconfig
> > > > +++ b/drivers/misc/Kconfig
> > > > @@ -483,6 +483,18 @@ config OPEN_DICE
> > > >
> > > > If unsure, say N.
> > > >
> > > > +config VCPU_STALL_DETECTOR
> > > > + tristate "VCPU stall detector"
> > > > + select LOCKUP_DETECTOR
> > > > + help
> > > > + Detect CPU locks on a kvm virtual machine. This driver relies on
> > > > > + the hrtimers which are CPU-binded to do the 'pet' operation. When a
> > > > + vCPU has to do a 'pet', it exits the guest through MMIO write and
> > > > + the backend driver takes into account the lost ticks for this
> > > > + particular CPU.
> > > > + To compile this driver as a module, choose M here: the
> > > > + module will be called vcpu_stall_detector.
> > >
> > > Should this depend on KVM_GUEST?
> >
> > Not all architectures have KVM_GUEST, and arm64 has no use for it.
>
> Ah, I thought this was a requirement (or created a better guest image)
> for use under KVM. Nevermind then...

It really depends on whether an architecture relies on non-architectural
extensions to support KVM guests. PPC does most of the time, and x86
certainly works better with the knowledge that this is a KVM guest.

KVM on arm64 implements the architecture itself, and hardly anything
else (if something sucks in virt, it likely also sucks on bare metal).
The couple of KVM-specific options we support are definitely not worth
a KVM_GUEST, as they only cover pretty esoteric stuff that nobody
enables, such as PTP_1588_CLOCK_KVM.

M.

--
Without deviation from the norm, progress is not possible.