Re: [PATCH] KVM: Deal with nested sleeps in kvm_vcpu_block()

From: Space Meyer
Date: Fri Dec 02 2022 - 10:53:11 EST


On Wed, Nov 30, 2022 at 5:49 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On Wed, Nov 30, 2022 at 5:20 PM Space Meyer <spm@xxxxxxxxxx> wrote:
> > Previously this code assumed nothing would mess with current->state
> > between the set_current_state() and schedule(). However the call to
> > kvm_vcpu_check_block() in between might end up requiring locks or other
> > actions, which would change current->state
>
> This would be a bug (in particular kvm_arch_vcpu_runnable() and
> kvm_cpu_has_pending_timer() should not need any lock). Do you
> have a specific call stack in mind?

You already fixed this specific call stack upstream in commit 26844fe. Syzkaller
was able to exercise the call stack you outlined in that commit on a 5.10-based
branch via:

------------[ cut here ]------------
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<00000000f941c5dd>] kvm_vcpu_block+0x330/0xaf0
WARNING: CPU: 1 PID: 2513 at kernel/sched/core.c:12350 __might_sleep+0xd7/0xe0
[...]
Call Trace:
__might_fault+0x6c/0x120
__kvm_read_guest_page+0x163/0x230
kvm_vcpu_read_guest+0xc3/0x150
read_and_check_msr_entry+0x3f/0x310
nested_vmx_store_msr+0x12c/0x360
prepare_vmcs12+0x5f2/0xd90
nested_vmx_vmexit+0x663/0x17e0
vmx_check_nested_events+0xfd8/0x1c60
kvm_arch_vcpu_runnable+0xda/0x6c0
kvm_vcpu_check_block+0x63/0x250
kvm_vcpu_block+0x3b7/0xaf0
[...]

The bug doesn't seem to be easily reproducible, but looking at the code, this
should also be applicable to the upstream 6.0, 5.15, 5.10, 5.4, 4.19 and 4.14
branches, which have not received a backport of 26844fe.
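
For reference, the wait loop in kvm_vcpu_block() on those branches has roughly
this shape (simplified sketch from memory, not verbatim kernel source):

	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);

		/*
		 * This check runs after the task state has been set.  If
		 * anything on this path can fault in guest memory (as in the
		 * nested VM-exit path in the trace above), it may sleep while
		 * !TASK_RUNNING and/or clobber current->state before
		 * schedule() runs.
		 */
		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		schedule();
	}
	__set_current_state(TASK_RUNNING);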

Do you think this is all we should do? My conclusion from the LWN article was
that, where possible, we should also avoid the set_current_state() ->
conditional -> schedule() pattern in the first place.
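
To make that concrete, the restructuring I have in mind looks roughly like the
sketch below (illustrative only, not a tested patch; vcpu_should_wake() is a
stand-in for a hypothetical re-check that is guaranteed not to block):

	for (;;) {
		/*
		 * Anything that may sleep or write guest memory runs here,
		 * while the task is still TASK_RUNNING.
		 */
		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		set_current_state(TASK_INTERRUPTIBLE);

		/*
		 * Only a non-blocking re-check sits between setting the task
		 * state and schedule(), so a late wakeup isn't lost and
		 * nothing can sleep while !TASK_RUNNING.
		 */
		if (vcpu_should_wake(vcpu))	/* hypothetical, must not block */
			break;

		schedule();
	}
	__set_current_state(TASK_RUNNING);

The obvious cost is that such a re-check has to exist and has to be guaranteed
non-blocking.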