Re: [RFC net-next 1/8] subdev: Introducing subdev bus

From: Greg KH
Date: Fri Mar 01 2019 - 12:00:09 EST


On Fri, Mar 01, 2019 at 04:35:46PM +0000, Parav Pandit wrote:
> Hi Greg,
>
> > -----Original Message-----
> > From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Sent: Friday, March 1, 2019 1:17 AM
> > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > michal.lkml@xxxxxxxxxxx; davem@xxxxxxxxxxxxx; Jiri Pirko
> > <jiri@xxxxxxxxxxxx>
> > Subject: Re: [RFC net-next 1/8] subdev: Introducing subdev bus
> >
> > On Thu, Feb 28, 2019 at 11:37:45PM -0600, Parav Pandit wrote:
> > > Introduce a new subdev bus which holds sub devices created from a
> > > primary device. These devices are named as 'subdev'.
> > > A subdev is identified similarly to pci device using 16-bit vendor id
> > > and device id.
> > > Unlike PCI devices, scope of subdev is limited to Linux kernel.
> >
> > But these are limited to only PCI devices, right?
> >
> For the Mellanox use case, yes, it's limited to PCI devices.
>
> > This sounds a lot like that ARM proposal a week or so ago that asked for
> > something like this, are you working with them to make sure your proposal
> > works for them as well? (sorry, can't find where that was announced, it was
> > online somewhere...)
> >
> We were not aware of it, mostly because we are on the networking side of the mailing lists (netdev, rdma, virt etc).
> The ARM proposal was likely on linux-kernel, I guess.
> I will look up that proposal and see if both of us can use common infrastructure.
>
> > > A central entry that assigns unique subdev vendor and device ids is:
> > > include/linux/subdev_ids.h enums. Enums are chosen over define macros so
> > > that two vendors do not end up with the same vendor id during the kernel
> > > development process.
> >
> > Why not just make it dynamic with on static ids?
> >
> Can you please elaborate?
> Do you mean we should use something similar to pci_add_dynid() with enhancement to catch duplicate id addition?

I have no idea what I wrote here, sorry :)

I was trying to say something like "using an enumerated type is going to
rely on a central authority for your "dynamic" bus, why is that needed
at all"?
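
One reading of this question is that ids could be handed out at run
time instead of being reserved in a shared header. The userspace sketch
below only illustrates that idea; the names (subdev_id_pool,
subdev_id_alloc, subdev_id_free, SUBDEV_MAX_IDS) are hypothetical and
are not part of the posted patch. In the kernel, an existing allocator
such as the IDA would presumably be used instead of an open-coded table.

```c
#include <stdbool.h>

#define SUBDEV_MAX_IDS 64	/* hypothetical limit for this sketch */

/* Hypothetical allocator state: one "in use" flag per id. */
struct subdev_id_pool {
	bool used[SUBDEV_MAX_IDS];
};

/* Return the lowest free id, or -1 if the pool is exhausted. */
static int subdev_id_alloc(struct subdev_id_pool *pool)
{
	for (int i = 0; i < SUBDEV_MAX_IDS; i++) {
		if (!pool->used[i]) {
			pool->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release an id so that a later allocation can reuse it. */
static void subdev_id_free(struct subdev_id_pool *pool, int id)
{
	if (id >= 0 && id < SUBDEV_MAX_IDS)
		pool->used[id] = false;
}
```

With a scheme like this no two callers can receive the same id, so
there is nothing for a central header to arbitrate.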

> > > The subdev bus holds subdevices of multiple devices. A subdev
> > > created for a PCI device typically appears in the sysfs tree under its
> > > parent device, using the core's default device naming scheme:
> > >
> > > subdev<instance_id>.
> > > i.e.
> > > subdev0
> > > subdev1
> > >
> > > $ ls -l /sys/bus/pci/devices/0000:05:00.0 [..]
> > > drwxr-xr-x 4 root root 0 Feb 13 15:57 subdev0
> > > drwxr-xr-x 4 root root 0 Feb 13 15:57 subdev1
> > >
> > > Device model view:
> > > ------------------
> > >      +------+    +------+       +------+
> > >      |subdev|    |subdev|       |subdev|
> > > -----|  1   |----|  2   |-------|  3   |----------
> > > |    +--|---+    +-|----+       +--|---+         |
> > > --------|----------|---subdev bus--|--------------
> > >         |          |               |
> > >      +--+----+-----+           +---+---+
> > >      |pcidev |                 |pcidev |
> > > -----|   A   |-----------------|   B   |----------
> > > |    +-------+                 +-------+         |
> > > -------------------pci bus------------------------
> >
> > To be clear, "subdev bus" is just a logical grouping, there is no physical
> > backing "bus" here at all, right?
> >
> Yep, that's correct.
>
> > What is going to "bind" to subdev devices? PCI drivers? Or new types of
> > drivers?
> >
> Devices are placed on the subdev bus using the devlink interface, and the probe() method of drivers registered via subdev_register_driver() will be called.

But it's just a virtual mapping, what "good" does this provide anyone?
You are still sharing the same backing device here, what does this
logical split buy you?
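
For reference, the binding step described in the quoted text can be
modelled roughly as below. This is a userspace sketch under stated
assumptions, not the code from the patch: every name prefixed with
sketch_ is hypothetical, the id values used with it are arbitrary, and
a real bus would go through the driver core's match and probe
callbacks rather than a plain loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device sitting on the logical bus. */
struct sketch_subdev {
	uint16_t vendor;
	uint16_t device;
	int bound;
};

/* Hypothetical stand-in for a registered driver. */
struct sketch_subdev_driver {
	uint16_t vendor;
	uint16_t device;
	int (*probe)(struct sketch_subdev *sd);
};

/*
 * Call probe() of the first driver whose 16-bit vendor and device ids
 * both match. Returns probe()'s result, or -1 if no driver matched.
 */
static int sketch_bus_bind(struct sketch_subdev *sd,
			   const struct sketch_subdev_driver *drv, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (drv[i].vendor == sd->vendor &&
		    drv[i].device == sd->device)
			return drv[i].probe(sd);
	}
	return -1;
}

/* Example probe: just record that the device was claimed. */
static int sketch_probe(struct sketch_subdev *sd)
{
	sd->bound = 1;
	return 0;
}
```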

> So yes, those are PCI vendor drivers.
> I tried to capture this in the cover letter.
> At present users haven't asked to map this subdev to a VM, but there is a very high chance that once we have this without PCI SR-IOV, they would like to extend it to VMs too.
> In that case devlink will have an option to add a 'passthrough' device, and instead of the vendor's PCI driver, some high-level vfio-type driver will bind to it.
> That is just anticipation; we haven't really worked this out fully.
> But this model allows for it.

I think mfd is what you want to do here, instead of creating your own
bus type.

> > > +int subdev_add_dev(struct subdev *subdev, struct device *parent_dev,
> > > +		   enum subdev_vendor_id vid, enum subdev_device_id did)
> > > +{
> > > +	u32 id = 0;
> > > +	int ret;
> > > +
> > > +	if (!parent_dev)
> > > +		return -EINVAL;
> >
> > No root devices?
> >
> I didn't get the comment. The intent of this check is that a subdev must have a parent. The parent's type doesn't matter.

You do not allow a subdev to sit at the "root" of the device tree.
That's fine, it was just a comment, it's your choice.
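
To spell out what the check implies, here is a small userspace sketch.
The struct definitions are stand-ins and sketch_subdev_add_dev is a
hypothetical, simplified version of the quoted function (the vid/did
parameters are left out), so treat it as an illustration only:

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel structures, reduced to what the check needs. */
struct device {
	struct device *parent;
};

struct subdev {
	struct device dev;
};

/*
 * Refuse to add a subdev without a parent, so a subdev can never sit
 * at the root of the device tree. Returns 0 on success or -EINVAL.
 */
static int sketch_subdev_add_dev(struct subdev *subdev,
				 struct device *parent_dev)
{
	if (!parent_dev)
		return -EINVAL;

	subdev->dev.parent = parent_dev;
	return 0;
}
```

A NULL parent is rejected with -EINVAL; any non-NULL parent is
accepted, whatever kind of device it is.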

thanks,

greg k-h