RE: [RFC net-next 0/8] Introducing subdev bus and devlink extension

From: Parav Pandit
Date: Tue Mar 05 2019 - 18:44:43 EST




> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-
> owner@xxxxxxxxxxxxxxx> On Behalf Of Parav Pandit
> Sent: Tuesday, March 5, 2019 5:17 PM
> To: Kirti Wankhede <kwankhede@xxxxxxxxxx>; Jakub Kicinski
> <jakub.kicinski@xxxxxxxxxxxxx>
> Cc: Or Gerlitz <gerlitz.or@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; michal.lkml@xxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> gregkh@xxxxxxxxxxxxxxxxxxx; Jiri Pirko <jiri@xxxxxxxxxxxx>
> Subject: RE: [RFC net-next 0/8] Introducing subdev bus and devlink extension
>
> Hi Kirti,
>
> > -----Original Message-----
> > From: Kirti Wankhede <kwankhede@xxxxxxxxxx>
> > Sent: Tuesday, March 5, 2019 4:40 PM
> > To: Parav Pandit <parav@xxxxxxxxxxxx>; Jakub Kicinski
> > <jakub.kicinski@xxxxxxxxxxxxx>
> > Cc: Or Gerlitz <gerlitz.or@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; michal.lkml@xxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> > gregkh@xxxxxxxxxxxxxxxxxxx; Jiri Pirko <jiri@xxxxxxxxxxxx>
> > Subject: Re: [RFC net-next 0/8] Introducing subdev bus and devlink
> > extension
> >
> >
> >
> > > I am novice at mdev level too. mdev or vfio mdev.
> > > Currently by default we bind to same vendor driver, but when it was
> > created as passthrough device, vendor driver won't create netdevice or
> > rdma device for it.
> > > And vfio/mdev or whatever mature available driver would bind at that
> > point.
> > >
> >
> > Using mdev framework, if you want to partition a physical device into
> > multiple logic devices, you can bind those devices to same vendor
> > driver through vfio-mdev, where as if you want to passthrough the
> > device bind it to vfio-pci. If I understand correctly, that is what you are
> looking for.
> >
> >
> We cannot bind a whole PCI device to vfio-pci, reason is, A given PCI device
> has existing protocol devices on it such as netdevs and rdma dev.
> This device is partitioned while those protocol devices exist and mlx5_core,
> mlx5_ib drivers are loaded on it.
> And we also need to connect these objects rightly to eswitch exposed by
> devlink interface (net/core/devlink.c) that supports eswitch binding, health,
> registers, parameters, ports support.
> It also supports existing PCI VFs.
>
> I donât think we want to replicate all of this again in mdev subsystem [1].
>
> [1] https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt
>
> So devlink interface to migrate users from managing VFs to non_VF sub
> device is natural progression.
>
> However, in future, I believe we would be creating mediated devices on user
> request, to use mdev modules and map them to VM.
>
> Also 'mdev_bus' is created as a class and not as a bus. This limits to not use
> devlink interface whose handle is bus+device name.
>
> So one option is to change mdev from class to bus.
> devlink will create mdevs on the bus, mdev driver can probe these devices
> on host system by default.
> And if told to do passthrough, a different driver exposes them to VM.
> How feasible is this?
>
Wait, I do see a mdev bus and mdevs are created on this bus using mdev_device_create().
So how about we create mdevs on this bus using devlink, instead of sysfs?
And driver side on host gets the mdev_register_driver()->probe()?