Re: [PATCH 09/11] PCI: add matching checks for driver_override binding

From: Alex Williamson
Date: Mon Jun 14 2021 - 14:43:00 EST


On Sun, 13 Jun 2021 11:19:46 +0300
Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote:

> On 6/9/2021 4:27 AM, Alex Williamson wrote:
> > On Tue, 8 Jun 2021 19:45:17 -0300
> > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> >> On Tue, Jun 08, 2021 at 03:26:43PM -0600, Alex Williamson wrote:
> >>>> drivers that specifically opt into this feature and the driver now has
> >>>> the opportunity to provide a proper match table that indicates what HW
> >>>> it can properly support. vfio-pci continues to support everything.
> >>> In doing so, this also breaks the new_id method for vfio-pci.
> >> Does it? How? The driver_override flag is per match entry not for the
> >> entire device so new_id added things will work the same as before as
> >> their new match entry's flags will be zero.
> > Hmm, that might have been a testing issue; combining driverctl with
> > manual new_id testing might have left a driver_override in place.
> >
> >>> Sorry, with so many userspace regressions, crippling the
> >>> driver_override interface with an assumption of such a narrow focus,
> >>> creating a vfio specific match flag, I don't see where this can go.
> >>> Thanks,
> >> On the other hand it overcomes all the objections from the last go
> >> round: how userspace figures out which driver to use with
> >> driver_override and integrating the universal driver into the scheme.
> >>
> >> pci_stub could be delt with by marking it for driver_override like
> >> vfio_pci.
> > By marking it a "vfio driver override"? :-\
> >
> >> But driverctl as a general tool working with any module is not really
> >> addressable.
> >>
> >> Is the only issue the blocking of the arbitary binding? That is not a
> >> critical peice of this, IIRC
> > We can't break userspace, which means new_id and driver_override need
> > to work as they do now. There are scads of driver binding scripts in
> > the wild, for vfio-pci and other drivers. We can't assume such a
> > narrow scope. Thanks,
>
> what about the following code ?
>
> @@ -152,12 +152,28 @@ static const struct pci_device_id
> *pci_match_device(struct pci_driver *drv,
>         }
>         spin_unlock(&drv->dynids.lock);
>
> -       if (!found_id)
> -               found_id = pci_match_id(drv->id_table, dev);
> +       if (found_id)
> +               return found_id;

a) A dynamic ID match always works regardless of driver override...

>
> -       /* driver_override will always match, send a dummy id */
> -       if (!found_id && dev->driver_override)
> +       found_id = pci_match_id(drv->id_table, dev);
> +       if (found_id) {
> +               /*
> +                * if we found id in the static table, we must fulfill the
> +                * matching flags (i.e. if PCI_ID_F_DRIVER_OVERRIDE flag is
> +                * set, driver_override should be provided).
> +                */
> +               bool is_driver_override =
> +                       (found_id->flags & PCI_ID_F_DRIVER_OVERRIDE) != 0;
> +               if ((is_driver_override && !dev->driver_override) ||

b) A static ID match fails if the driver provides an override flag and
the device does not have an override set, or...

> +                   (dev->driver_override && !is_driver_override))

c) The device has an override set and the driver does not support the
override flag.

> +                       return NULL;
> +       } else if (dev->driver_override) {
> +               /*
> +                * if we didn't find suitable id in the static table,
> +                * driver_override will still , send a dummy id
> +                */
>                 found_id = &pci_device_id_any;
> +       }
>
>         return found_id;
>  }
>
>
> dynamic ids (new_id) works as before.
>
> Old driver_override works as before.

This is deceptively complicated, but no, I don't believe it does. By
my understanding of c) an "old" driver can no longer use
driver_override for binding a known device. It seems that if we have a
static ID match, then we cannot have a driver_override set for the
device in such a case. This is a userspace regression.

> For "new" driver_override we must fulfill the new rules.

For override'able drivers, the static table is almost useless other
than using it for modules.alias support and potentially to provide
driver_data. As above, I find this all pretty confusing and I'd advise
trying to write a concise set of rules outlining the behavior of
driver_override vs dynamic IDs vs static IDs vs "override'able" driver
flags. I tried, I can't, it's convoluted and full of exceptions.
Thanks,

Alex