Re: CentOS 7 (kernel-3.10) guests stuck in recursive async faults on kernel 5.10

From: Paolo Bonzini
Date: Fri Nov 11 2022 - 17:21:20 EST


Note that the series you linked is not the final one; the final
version disabled asynchronous page faults completely on old guests:

+static inline bool kvm_pv_async_pf_enabled(struct kvm_vcpu *vcpu)
+{
+ u64 mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;
+
+ return (vcpu->arch.apf.msr_en_val & mask) == mask;
+}

(commit 2635b5c4a0e407b84f68e188c719f28ba0e9ae1b)

Old guests are not deprecated, but the old implementation had issues
that cannot be solved, so old guests will not have asynchronous page
faults anymore. For this reason I think it's strange that you still
see async page faults in the guest. What do the stack traces look like?
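
To make the effect of that check concrete, here is a minimal userspace
sketch of the same mask test. It is not the kernel code: apf_enabled()
is a hypothetical stand-in that takes the MSR value directly instead of
a vcpu, and the bit positions are an assumption based on
arch/x86/include/uapi/asm/kvm_para.h, so please check them against
your tree.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions (believed to match
 * arch/x86/include/uapi/asm/kvm_para.h; verify against your tree).
 */
#define KVM_ASYNC_PF_ENABLED          (1ULL << 0)
#define KVM_ASYNC_PF_DELIVERY_AS_INT  (1ULL << 3)

/*
 * Hypothetical stand-in for kvm_pv_async_pf_enabled(): msr_en_val plays
 * the role of vcpu->arch.apf.msr_en_val. Async page faults count as
 * enabled only when BOTH bits are set, so a guest that sets only
 * KVM_ASYNC_PF_ENABLED is treated as not having them enabled.
 */
static bool apf_enabled(uint64_t msr_en_val)
{
    const uint64_t mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;

    return (msr_en_val & mask) == mask;
}
```

With these assumed values, a guest that writes only the enable bit
(for example 0x1, plus whatever address bits it puts in the upper part
of the MSR) fails the check, while one that also sets the
interrupt-delivery bit (0x9) passes it.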

Thanks,

Paolo

On Tue, Nov 8, 2022 at 2:33 PM Manish Mishra <manish.mishra@xxxxxxxxxxx> wrote:
>
> Hi Everyone,
>
> We are facing some issues with memory hotplug on CentOS 7 (kernel-3.10) guests running on kernel-5.10 KVM hosts. The guest gets stuck in recursive async page faults for a very long time, which creates deadlocks in the guest. I was looking at the changes between kernel 5.2 and 5.10 around async page faults, and this series looks like it could be related: https://lore.kernel.org/lkml/20200429093634.1514902-7-vkuznets@xxxxxxxxxx/.
>
>
>
> Have older Linux guests been deprecated on 5.10 hosts after this update to the async page fault handling mechanism, or is this issue unrelated? I do not know much about async page faults, so I wanted to confirm. Any help will be really appreciated.
>
> Thanks
>
> Manish Mishra