Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

From: Alex Williamson
Date: Tue Aug 20 2019 - 13:19:08 EST


On Tue, 20 Aug 2019 08:58:02 +0000
Parav Pandit <parav@xxxxxxxxxxxx> wrote:

> + Dave.
>
> Hi Jiri, Dave, Alex, Kirti, Cornelia,
>
> Please provide your feedback on it, how shall we proceed?
>
> Short summary of requirements.
> For a given mdev (mediated device [1]), there is one representor
> netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> And there is one netdevice for the actual mdev when mdev is probed.
>
> (a) representor netdev and devlink port should be able derive
> phys_port_name(). So that representor netdev name can be built
> deterministically across reboots.
>
> (b) for mdev's netdevice, mdev's device should have an attribute.
> This attribute can be used by udev rules/systemd or something else to
> rename netdev name deterministically.
>
> (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> A simple grep IFNAMSIZ in stack hints hundreds of users of IFNAMSIZ
> in drivers, uapi, netlink, boot config area and more. Changing
> IFNAMSIZ for a mdev bus doesn't really look reasonable option to me.

How many characters do we really have to work with? Your examples
below prepend various characters, ex. option-1 results in ens2f0_m10 or
enm10. Do the extra 8 or 3 characters in these count against IFNAMSIZ?

> Hence, I would like to discuss below options.
>
> Option-1: mdev index
> Introduce an optional mdev index/handle as u32 during mdev create
> time. User passes mdev index/handle as input.
>
> phys_port_name=mIndex=m%u
> mdev_index will be available in sysfs as mdev attribute for udev to
> name the mdev's netdev.
>
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID index=10
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create

Nit, IIRC previous discussions of additional parameters used comma
separators, ex. echo $UUID,index=10 >...

> > example netdevs:
> repnetdev=ens2f0_m10 /*ens2f0 is parent PF's netdevice */

Is the parent really relevant in the name? Tools like mdevctl are
meant to provide persistence, creating the same mdev devices on the
same parent, but that's simply the easiest policy decision. We can
also imagine that multiple parent devices might support a specified
mdev type and policies factoring in proximity, load-balancing, power
consumption, etc might be weighed such that we really don't want to
promote userspace creating dependencies on the parent association.

> mdev_netdev=enm10
>
> Pros:
> 1. mdevctl and any other existing tools are unaffected.
> 2. netdev stack, ovs and other switching platforms are unaffected.
> 3. achieves unique phys_port_name for representor netdev
> 4. achieves unique mdev eth netdev name for the mdev using
> udev/systemd extension. 5. Aligns well with mdev and netdev subsystem
> and similar to existing sriov bdf's.

A user provided index seems strange to me. It's not really an index,
just a user specified instance number. Presumably you have the user
providing this because if it really were an index, then the value
depends on the creation order and persistence is lost. Now the user
needs to both avoid uuid collision as well as "index" number
collision. The uuid namespace is large enough to mostly ignore this,
but this is not. This seems like a burden.

> Option-2: shorter mdev name
> Extend mdev to have shorter mdev device name in addition to UUID.
> such as 'foo', 'bar'.
> Mdev will continue to have UUID.
> phys_port_name=mdev_name
>
> Pros:
> 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> It is common practice to upgrade iproute2 package along with the
> kernel. Similar practice to be done with mdevctl.
> 2. Newer users of mdevctl who wants to work with non_UUID names, will
> use newer mdevctl/tools. Cons:
> 1. Dual naming scheme of mdev might affect some of the existing tools.
> It's unclear how/if it actually affects.
> mdevctl [2] is very recently developed and can be enhanced for dual
> naming scheme.

I think we've already nak'ed this one, the device namespace becomes
meaningless if the name becomes just a string where a uuid might be an
example string. mdevs are named by uuid.

> Option-3: mdev uuid alias
> Instead of shorter mdev name or mdev index, have alpha-numeric name
> alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID alias=foo
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > example netdevs:
> examle netdevs:
> repnetdev = ens2f0_mfoo
> mdev_netdev=enmfoo
>
> Pros:
> 1. All same as option-1.
> 2. Doesn't affect existing mdev naming scheme.
> Cons:
> 1. Index scheme of option-1 is better which can number large number
> of mdevs with fewer characters, simplifying the management tool.

No better than option-1, simply a larger secondary namespace, but still
requires the user to come up with two independent names for the device.

> Option-4: extend IFNAMESZ to be 64 bytes Extended IFNAMESZ from 16 to
> 64 bytes phys_port_name=mdev_UUID_string mdev_netdev_name=enmUUID
>
> Pros:
> 1. Doesn't require mdev extension
> Cons:
> 1. netdev stack, driver, uapi, user space, boot config wide changes
> 2. Possible user space extensions who assumed name size being 16
> characters 3. Single device type demands namesize change for all
> netdev types

What about an alias based on the uuid? For example, we use 160-bit
sha1s daily with git (uuids are only 128-bit), but we generally don't
reference git commits with the full 20 character string. Generally 12
characters is recommended to avoid ambiguity. Could mdev automatically
create an abbreviated sha1 alias for the device? If so, how many
characters should we use and what do we do on collision? The colliding
device could add enough alias characters to disambiguate (we likely
couldn't re-alias the existing device to disambiguate, but I'm not sure
it matters, userspace has sysfs to associate aliases). Ex.

UUID=$(uuidgen)
ALIAS=$(echo $UUID | sha1sum | colrm 13)

Since there seems to be some prefix overhead, as I ask about above in
how many characters we actually have to work with in IFNAMESZ, maybe we
start with 8 characters (matching your "index" namespace) and expand as
necessary for disambiguation. If we can eliminate overhead in
IFNAMESZ, let's start with 12. Thanks,

Alex