Re: WARNING: refcount bug in kvm_vm_ioctl

From: Jann Horn
Date: Fri Feb 15 2019 - 11:03:13 EST


On Fri, Feb 15, 2019 at 4:40 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Thu, Oct 11, 2018 at 4:18 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > On 10/10/2018 09:58, syzbot wrote:
> > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
> > > invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
> > > RIP: 0010:refcount_inc_checked+0x5d/0x70 lib/refcount.c:153
> > > kvm_get_kvm arch/x86/kvm/../../../virt/kvm/kvm_main.c:766 [inline]
> > > kvm_ioctl_create_device arch/x86/kvm/../../../virt/kvm/kvm_main.c:2924
> > > kvm_vm_ioctl+0xed7/0x1d40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3114
> > > vfs_ioctl fs/ioctl.c:46 [inline]
> > > file_ioctl fs/ioctl.c:501 [inline]
> > > do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:685
> > > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:702
> > > __do_sys_ioctl fs/ioctl.c:709 [inline]
> > > __se_sys_ioctl fs/ioctl.c:707 [inline]
> > > __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:707
> > > do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> > > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > The trace here is fairly simple, but I don't understand how this could
> > happen.
> >
> > The kvm_get_kvm is done within kvm_ioctl_create_device, which is called
> > from ioctl; the last reference cannot disappear inside an ioctl, because:
> >
> > 1) kvm_ioctl is called from vfs_ioctl, which does fdget and holds the fd
> > reference until after kvm_vm_ioctl returns
> >
> > 2) the file descriptor holds one reference to the struct kvm*, and this
> > reference is not released until kvm_vm_release is called by the last
> > fput (which could be fdput's call to fput if the process has exited in
> > the meanwhile)
> >
> > 3) for completeness, in case anon_inode_getfd fails, put_unused_fd will
> > not invoke the file descriptor's ->release callback (in this case
> > kvm_device_release).
> >
> > CCing some random people to get their opinion...
> >
> > Paolo
>
>
> Jann, is it what you fixed in "kvm: fix kvm_ioctl_create_device()
> reference counting (CVE-2019-6974)"?
> If so, we need to close the syzbot bug.
>
>
> > > # See https://goo.gl/kgGztJ for information about syzkaller reproducers.
> > > #{"threaded":true,"collide":true,"repeat":true,"procs":6,"sandbox":"none","fault_call":-1,"tun":true,"tmpdir":true,"cgroups":true,"netdev":true,"resetnet":true,"segv":true}
> > > r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000380)='/dev/kvm\x00', 0x0, 0x0)
> > > r1 = syz_open_dev$dspn(&(0x7f0000000100)='/dev/dsp#\x00', 0x3fe, 0x400)
> > > r2 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)

Here we create a VM fd...

> > > perf_event_open(&(0x7f0000000040)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0, 0x0, 0x50d}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
> > > mincore(&(0x7f0000ffc000/0x1000)=nil, 0x1000, &(0x7f00000003c0)=""/4096)
> > > setrlimit(0x0, &(0x7f0000000000))
> > > readahead(r1, 0x3, 0x9a6)
> > > ioctl$KVM_CREATE_DEVICE(r2, 0xc00caee0, &(0x7f00000002c0)={0x4})

... and here we do the KVM_CREATE_DEVICE ioctl with type==KVM_DEV_TYPE_VFIO.

So far, this looks exactly like CVE-2019-6974. But CVE-2019-6974
also requires that someone calls close() on the file descriptor of the
newly created device very quickly, before the ioctl is able to
increment the refcount further, and I don't see anything like that
here. Is there a chance that syzkaller called close() on a file
descriptor while the ioctl() was still running without saying so here
(potentially through dup2() or something like that)?

It would be helpful if we could see the backtrace of how the refcount
was dropped to zero...

> > > setsockopt$inet_sctp6_SCTP_FRAGMENT_INTERLEAVE(r1, 0x84, 0x12, &(0x7f00000001c0)=0x9, 0x4)