Re: BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:LINE

From: David Hildenbrand
Date: Mon Nov 06 2017 - 06:52:10 EST


On 31.10.2017 12:34, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 91dfed74eabcdae9378131546c446442c29bf769
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109
> 2 locks held by syzkaller879109/2909:
> #0: (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] vcpu_load+0x1c/0x70
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:154
> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest
> arch/x86/kvm/x86.c:6983 [inline]
> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run
> arch/x86/kvm/x86.c:7061 [inline]
> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>]
> kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222
> CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted 4.13.0-rc4-next-20170811
> #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x194/0x257 lib/dump_stack.c:52
> ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014
> __might_sleep+0x95/0x190 kernel/sched/core.c:5967
> __might_fault+0xab/0x1d0 mm/memory.c:4383
> __copy_from_user include/linux/uaccess.h:71 [inline]
> __kvm_read_guest_page+0x58/0xa0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771
> kvm_vcpu_read_guest_page+0x44/0x60
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791
> kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407
> kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466
> segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819
> em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022


In em_fxrstor(), we call get_fpu(), which in turn disables preemption.

With preemption disabled, we then call
segmented_read_std()->kvm_vcpu_read_guest_page(), which can sleep in
__copy_from_user() and therefore triggers the warning.

> x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471
> x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698
> kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854
> handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400
> vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718
> vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline]
> vcpu_run arch/x86/kvm/x86.c:7061 [inline]
> kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222
> kvm_vcpu_ioctl+0x64c/0x1010 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591
> vfs_ioctl fs/ioctl.c:45 [inline]
> do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
> SYSC_ioctl fs/ioctl.c:700 [inline]
> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
> entry_SYSCALL_64_fastpath+0x1f/0xbe

I don't really see a way to avoid two fxstate variables. Unloading the
fpu in between the fxsave/fxrstor could lead to host values getting
overwritten. Loading/saving the fpu in kvm_arch_vcpu_ioctl_run() would
most probably also not work, as the relevant portions of fxregs_state
would not get saved/restored. So disabling preemption would still be
needed.
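
To make the ordering problem concrete, here is a small userspace model
(not kernel code). Every model_* name is a hypothetical stand-in: the
counter only mimics preempt_disable()/preempt_enable(), and
model_read_guest() mimics a guest read that may sleep. The second
variant shows one possible reordering that uses two state buffers; it
is only a sketch under these assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins for kernel primitives; all names are hypothetical. */
static int preempt_count;	/* > 0 models "atomic, must not sleep" */
static int warnings;		/* number of might-sleep violations seen */

static void model_preempt_disable(void) { preempt_count++; }

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Models ___might_sleep(): complain when called in atomic context. */
static void model_might_sleep(void)
{
	if (preempt_count > 0) {
		warnings++;
		fprintf(stderr, "BUG: sleeping function called from invalid context\n");
	}
}

/* Models a guest memory read, which may fault and therefore sleep. */
static void model_read_guest(void *dst, const void *guest_src, size_t len)
{
	model_might_sleep();
	memcpy(dst, guest_src, len);
}

struct model_fx_state { unsigned char bytes[512]; };

/* Models the live FPU register state. */
static struct model_fx_state model_fpu_regs;

/* Order from the report: the guest read happens with preemption disabled. */
static void model_fxrstor_read_inside(const void *guest_mem, size_t size)
{
	struct model_fx_state state;

	assert(size <= sizeof(state));
	model_preempt_disable();			/* models get_fpu() */
	state = model_fpu_regs;				/* models fxsave */
	model_read_guest(state.bytes, guest_mem, size);	/* may sleep: warns */
	model_fpu_regs = state;				/* models fxrstor */
	model_preempt_enable();				/* models put_fpu() */
}

/* Reordered: read the guest data first, into a second buffer. */
static void model_fxrstor_read_first(const void *guest_mem, size_t size)
{
	struct model_fx_state from_guest, live;

	assert(size <= sizeof(from_guest));
	model_read_guest(from_guest.bytes, guest_mem, size);	/* may sleep: fine */
	model_preempt_disable();
	live = model_fpu_regs;				/* models fxsave */
	memcpy(live.bytes, from_guest.bytes, size);	/* keep the unread tail */
	model_fpu_regs = live;				/* models fxrstor */
	model_preempt_enable();
}
```

In this model the first variant bumps the warning counter once per
call, while the second never does, at the cost of the extra buffer.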


So all I can offer for now is the following (untested, can send as
proper patch if needed):