Re: [PATCH] kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add()

From: Alex Williamson
Date: Thu Jul 13 2023 - 14:49:12 EST


On Mon, 10 Jul 2023 15:20:31 -0700
Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx> wrote:

> kvm_vfio_group_add() creates kvg instance, links it to kv->group_list,
> and calls kvm_vfio_file_set_kvm() with kvg->file as an argument after
> dropping kv->lock. If we race group addition and deletion calls, kvg
> instance may get freed by the time we get around to calling
> kvm_vfio_file_set_kvm().
>
> Fix this by moving call to kvm_vfio_file_set_kvm() under the protection
> of kv->lock. We already call it while holding the same lock when vfio
> group is being deleted, so it should be safe here as well.
>
> Fixes: ba70a89f3c2a ("vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm()")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>
> ---
> virt/kvm/vfio.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 9584eb57e0ed..cd46d7ef98d6 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -179,10 +179,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
> list_add_tail(&kvg->node, &kv->group_list);
>
> kvm_arch_start_assignment(dev->kvm);
> + kvm_vfio_file_set_kvm(kvg->file, dev->kvm);
>
> mutex_unlock(&kv->lock);
>
> - kvm_vfio_file_set_kvm(kvg->file, dev->kvm);
> kvm_vfio_update_coherency(dev);
>
> return 0;


I'm not sure this hasn't been an issue since it was originally
introduced in 2fc1bec15883 ("kvm: set/clear kvm to/from vfio_group when
group add/delete").

The change added by the blamed ba70a89f3c2a in this respect is simply
that we get the file pointer from the mutex protected object, but that
mutex protected object is also what maintains that the file pointer is
valid. The vfio_group implementation suffered the same issue, the
delete path could put the group reference, which could theoretically
cause a use after free of the vfio_group.

We could effectively restore the pre-ba70a89f3c2a behavior by replacing
kvg->file with filp here, but that would still leave us vulnerable to
the original issue.

Note also that kvm_vfio_update_coherency() takes the same mutex
separately, I wonder if it wouldn't make more sense if it were moved
under the caller's mutex to avoid bouncing the lock and unnecessarily
taking it in the release path. Thanks,

Alex