Re: discuss about pvpanic

From: Paolo Bonzini
Date: Wed Jan 08 2020 - 04:37:03 EST


On 08/01/20 09:25, zhenwei pi wrote:
> Hey, Paolo
>
> Currently, pvpapic only supports bit 0(PVPANIC_PANICKED).
> We usually expect that guest writes ioport (typical 0x505) in panic_notifier_list callback
> during handling panic, then we can handle pvpapic event PVPANIC_PANICKED in QEMU.
>
> On the other hand, guest wants to handle the crash by kdump-tools, and reboots without any
> panic_notifier_list callback. So QEMU only knows that guest has rebooted (because guest
> write 0xcf9 ioport for RCR request), but QEMU can't identify why guest resets.
>
> In production environment, we hit about 100+ guest reboot event everyday, sadly we
> can't separate the abnormal reboot from normal operation.
>
> We want to add a new bit for pvpanic event(maybe PVPANIC_CRASHLOADED) to represent the guest has crashed,
> and the panic is handled by the guest kernel. (here is the previous patch https://lkml.org/lkml/2019/12/14/265)
>
> What do you think about this solution? Or do you have any other suggestions?

Hi Zhenwei,

the kernel-side patch certainly makes sense. I assume that you want the
event to propagate up from QEMU to Libvirt and so on? The QEMU patch
would need to declare a new event (qapi/misc.json) and send it in
handle_event (hw/misc/pvpanic.c). For Libvirt I'm not familiar, so I'm
adding the respective list.

Another possibility is to simply not write to pvpanic if
kexec_crash_loaded() returns true; this would match what xen_panic_event
does for example. The kexec kernel would then log the panic normally,
without the need for MMIO at all. However, I have no problem with
adding a new bit to the pvpanic I/O port so once you post the QEMU patch
I will certainly ack the kernel side.

Thanks,

Paolo