Re: [PATCH v2 2/2] kvm: Device assignment permission checks

From: Sasha Levin
Date: Tue Dec 20 2011 - 16:00:26 EST


On Tue, 2011-12-20 at 07:30 -0700, Alex Williamson wrote:
> Only allow KVM device assignment to attach to devices which:
>
> - Are not bridges
> - Have BAR resources (assume others are special devices)
> - The user has permissions to use
>
> Assigning a bridge is a configuration error, it's not supported, and
> typically doesn't result in the behavior the user is expecting anyway.
> Devices without BAR resources are typically chipset components that
> also don't have host drivers. We don't want users to hold such devices
> captive or cause system problems by fencing them off into an iommu
> domain. We determine "permission to use" by testing whether the user
> has access to the PCI sysfs resource files. By default a normal user
> will not have access to these files, so it provides a good indication
> that an administration agent has granted the user access to the device.
>
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
>
> Documentation/virtual/kvm/api.txt | 4 +++
> virt/kvm/assigned-dev.c | 55 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 58 insertions(+), 1 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index ee2c96b..4df9af4 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1154,6 +1154,10 @@ following flags are specified:
> The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
> isolation of the device. Usages not specifying this flag are deprecated.
>
> +Only PCI header type 0 devices with PCI BAR resources are supported by
> +device assignment. The user requesting this ioctl must have read/write
> +access to the PCI sysfs resource files associated with the device.
> +
> 4.49 KVM_DEASSIGN_PCI_DEVICE
>
> Capability: KVM_CAP_DEVICE_DEASSIGNMENT
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index a251a28..faec641 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -17,6 +17,7 @@
> #include <linux/pci.h>
> #include <linux/interrupt.h>
> #include <linux/slab.h>
> +#include <linux/namei.h>
> #include "irq.h"
>
> static struct kvm_assigned_dev_kernel *kvm_find_assigned_dev(struct list_head *head,
> @@ -483,9 +484,11 @@ out:
> static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
> struct kvm_assigned_pci_dev *assigned_dev)
> {
> - int r = 0, idx;
> + int r = 0, idx, i;
> struct kvm_assigned_dev_kernel *match;
> struct pci_dev *dev;
> + u8 header_type;
> + bool bar_found = false;
>
> if (!(assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU))
> return -EINVAL;
> @@ -516,6 +519,56 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
> r = -EINVAL;
> goto out_free;
> }
> +
> + /* Don't allow bridges to be assigned */
> + pci_read_config_byte(dev, PCI_HEADER_TYPE, &header_type);
> + if ((header_type & PCI_HEADER_TYPE) != PCI_HEADER_TYPE_NORMAL) {
> + r = -EPERM;
> + goto out_put;
> + }
> +
> + /* We want to test whether the caller has been granted permissions to
> + * use this device. To be able to configure and control the device,
> + * the user needs access to PCI configuration space and BAR resources.
> + * These are accessed through PCI sysfs. PCI config space is often
> + * passed to the process calling this ioctl via file descriptor, so we
> + * can't rely on access to that file. We can check for permissions
> + * on each of the BAR resource files, which is a pretty clear
> + * indicator that the user has been granted access to the device. */
> + for (i = PCI_STD_RESOURCES; i <= PCI_STD_RESOURCE_END; i++) {
> + char buf[64];
> + struct path path;
> + struct inode *inode;
> +
> + if (!pci_resource_len(dev, i))
> + continue;
> +
> + /* Per sysfs-rules, sysfs is always at /sys */
> + snprintf(buf, sizeof(buf), "/sys/bus/pci/devices/%04x:%02x:"
> + "%02x.%d/resource%d", pci_domain_nr(dev->bus),
> + dev->bus->number, PCI_SLOT(dev->devfn),
> + PCI_FUNC(dev->devfn), i);

This should probably be done by getting the device path from 'dev'
(kobject_get_path(&dev->dev.kobj, GFP_KERNEL)) instead of formatting it
ourselves. sysfs-rules describes that path as always correct, which
isn't the case for a hand-built one.
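
Roughly what I have in mind, as an untested userspace model rather than
kernel code. In the kernel, kobj_path would be the string returned by
kobject_get_path(), which is relative to the sysfs mount point and has
to be kfree()d by the caller. build_resource_path() is a made-up name,
and the 64-byte buffer in the patch may be too small for a full device
path, so the sketch reports truncation instead of silently cutting the
name short:

```c
#include <errno.h>
#include <stdio.h>

/*
 * Userspace model only; build_resource_path() is a hypothetical helper.
 * kobj_path stands in for the result of
 * kobject_get_path(&dev->dev.kobj, GFP_KERNEL).
 */
static int build_resource_path(char *buf, size_t len,
			       const char *kobj_path, int bar)
{
	int n;

	if (!buf || !kobj_path || bar < 0)
		return -EINVAL;

	/* Per sysfs-rules, sysfs is always at /sys */
	n = snprintf(buf, len, "/sys%s/resource%d", kobj_path, bar);
	if (n < 0)
		return -EINVAL;
	if ((size_t)n >= len)
		return -ENAMETOOLONG;	/* output did not fit in buf */

	return 0;
}
```

The resulting string can then go to kern_path() exactly as in the patch.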

> +
> + r = kern_path(buf, LOOKUP_FOLLOW, &path);
> + if (r)
> + goto out_put;
> +
> + inode = path.dentry->d_inode;
> +
> + r = inode_permission(inode, MAY_READ | MAY_WRITE | MAY_ACCESS);
> + path_put(&path);
> + if (r)
> + goto out_put;
> +
> + bar_found = true;
> + }
> +
> + /* If no resources, probably something special */
> + if (!bar_found) {
> + r = -EPERM;
> + goto out_put;
> + }

Maybe it's also worth moving this block out to a helper function and
wrapping it in CONFIG_SYSFS. I'm not sure what can happen when sysfs
doesn't exist, but it's best to just avoid any of these corner cases.
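
A sketch of the shape I mean, again as an untested userspace model.
probe_sysfs_permission() and NUM_STD_BARS are made-up names, access()
stands in for kern_path() + inode_permission(), and CONFIG_SYSFS is
defined by hand here only so the example builds outside the kernel.
Returning -EPERM from the !CONFIG_SYSFS stub is my assumption about the
safest behaviour, not something the patch specifies:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

#define CONFIG_SYSFS 1		/* stand-in for the kernel config symbol */
#define NUM_STD_BARS 6		/* BARs 0-5 of a type 0 header */

#ifdef CONFIG_SYSFS
/*
 * Returns 0 if every implemented BAR has a readable and writable
 * resource file under devdir, a negative errno otherwise.
 */
static int probe_sysfs_permission(const char *devdir,
				  const unsigned long bar_len[NUM_STD_BARS])
{
	char path[256];
	int i, n, bar_found = 0;

	for (i = 0; i < NUM_STD_BARS; i++) {
		if (!bar_len[i])
			continue;	/* BAR not implemented */

		n = snprintf(path, sizeof(path), "%s/resource%d", devdir, i);
		if (n < 0 || (size_t)n >= sizeof(path))
			return -ENAMETOOLONG;

		/* userspace stand-in for kern_path() + inode_permission() */
		if (access(path, R_OK | W_OK))
			return -errno;

		bar_found = 1;
	}

	/* If no resources, probably something special */
	return bar_found ? 0 : -EPERM;
}
#else
/* Without sysfs there is nothing to test against, so refuse (assumption). */
static int probe_sysfs_permission(const char *devdir,
				  const unsigned long bar_len[NUM_STD_BARS])
{
	(void)devdir;
	(void)bar_len;
	return -EPERM;
}
#endif
```

The ioctl would then just call the helper and goto out_put on a
non-zero return.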

--

Sasha.
