RE: [PATCH] hv_netvsc:Register VF in netvsc_probe if NET_DEVICE_REGISTER missed

From: Haiyang Zhang
Date: Wed Jan 31 2024 - 11:46:58 EST




> -----Original Message-----
> From: Shradha Gupta <shradhagupta@xxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, January 31, 2024 2:54 AM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
> <haiyangz@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; David S. Miller
> <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Jakub Kicinski
> <kuba@xxxxxxxxxx>; Paolo Abeni <pabeni@xxxxxxxxxx>; Wojciech Drewek
> <wojciech.drewek@xxxxxxxxx>; linux-hyperv@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Shradha Gupta
> <shradhagupta@xxxxxxxxxxxxx>; stable@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] hv_netvsc:Register VF in netvsc_probe if
> NET_DEVICE_REGISTER missed
>
> On Tue, Jan 30, 2024 at 08:13:21PM +0000, Dexuan Cui wrote:
> > > From: Shradha Gupta <shradhagupta@xxxxxxxxxxxxxxxxxxx>
> > > Sent: Monday, January 29, 2024 11:19 PM
> > > [...]
> > > If hv_netvsc driver is removed and reloaded, the NET_DEVICE_REGISTER
> >
> > s/removed/unloaded/
> > unloaded looks more accurate to me :-)
> >
> > > [...]
> > > Tested-on: Ubuntu22
> > > Testcases: LISA testsuites
> > > verify_reload_hyperv_modules, perf_tcp_ntttcp_sriov
> > IMO the 3 lines can be removed: this bug is not specific to Ubuntu, and
> the
> > test case names don't provide extra value to help understand the issue
> > here and they might cause more questions unnecessarily, e.g. what's
> LISA,
> > what exactly do the test cases do.
> >
> > > +/* Macros to define the context of vf registration */
> > > +#define VF_REG_IN_PROBE 1
> > > +#define VF_REG_IN_RECV_CBACK 2
> >
> > I think VF_REG_IN_NOTIFIER is a better name?
> > RECV_CBALL looks inaccurate to me.
> >
> > > @@ -2205,8 +2209,11 @@ static int netvsc_vf_join(struct net_device
> > > *vf_netdev,
> > > ndev->name, ret);
> > > goto upper_link_failed;
> > > }
> > > -
> > > - schedule_delayed_work(&ndev_ctx->vf_takeover,
> > > VF_TAKEOVER_INT);
> > > + /* If this registration is called from probe context vf_takeover
> > > + * is taken care of later in probe itself.
> > I suspect "later in probe itself" is not accurate.
> > If 'context' is VF_REG_IN_PROBE, I suppose what happens here is:
> > after netvsc_probe() finishes, the netvsc interface becomes available,
> > so the user space will ifup it, and netvsc_open() will UP the VF
> interface,
> > and netvsc_netdev_event() is called for the VF with event ==
> > NETDEV_POST_INIT (?) and NETDEV_CHANGE, and the data path is
> > switched to the VF.
> >
> > If my understanding is correct, I think in the case of 'context' ==
> > VF_REG_IN_PROBE, I suspect the "Align MTU of VF with master"
> > and the "sync address list from ndev to VF" in __netvsc_vf_setup() are
> > omitted? If so, should this be fixed? e.g. Not sure if the below is an
> issue or not:
> > 1) a VF is bound to a NetVSC interface, and a user sets the MTUs to
> 1024.
> > 2) rmmod hv_netvsc
> > 3) modprobe hv_netvsc
> > 4) the netvsc interface uses MTU=1500 (the default), and the VF still
> uses 1024.
> >
> > > @@ -2597,6 +2604,34 @@ static int netvsc_probe(struct hv_device *dev,
> > > }
> > >
> > > list_add(&net_device_ctx->list, &netvsc_dev_list);
> > > +
> > > + /* When the hv_netvsc driver is removed and readded, the
> >
> > s/removed and readded/unloaded and reloaded/
> >
> > > + * NET_DEVICE_REGISTER for the vf device is replayed before
> > > probe
> > > + * is complete. This is because register_netdevice_notifier() gets
> > > + * registered before vmbus_driver_register() so that callback func
> > > + * is set before probe and we don't miss events like
> > > NETDEV_POST_INIT
> > > + * So, in this section we try to register each matching
> >
> > Looks like there are spaces at the end of the line. We can move up a
> few words
> > from the next line :-)
> >
> > s/each matching/the matching/
> > A NetVSC interface has only 1 matching VF device.
> >
> > > + * vf device that is present as a netdevice, knowing that it's
> register
> >
> > s/it's/its/
> >
> > > + * call is not processed in the netvsc_netdev_notifier(as probing
> is
> > > + * progress and get_netvsc_byslot fails).
> > > + */
> > > + for_each_netdev(dev_net(net), vf_netdev) {
> > > + if (vf_netdev->netdev_ops == &device_ops)
> > > + continue;
> > > +
> > > + if (vf_netdev->type != ARPHRD_ETHER)
> > > + continue;
> > > +
> > > + if (is_vlan_dev(vf_netdev))
> > > + continue;
> > > +
> > > + if (netif_is_bond_master(vf_netdev))
> > > + continue;
> >
> > The code above is duplicated from netvsc_netdev_event().
> > Can we add a new helper function is_matching_vf() to avoid the
> duplication?
> Sure, I will do that. Thanks
> >
> > > + netvsc_prepare_bonding(vf_netdev);
> > > + netvsc_register_vf(vf_netdev, VF_REG_IN_PROBE);
> > > + __netvsc_vf_setup(net, vf_netdev);
> >
> > add a "break;' ?
> With MANA devices and multiport support there, the individual ports are
> also net_devices.
> Wouldn't this be needed for such scenario(where we have multiple mana
> port net devices) to
> register them all?

Each device has separate probe() call, so only one VF will match in one
netvsc_probe().

netvsc_prepare_bonding() & netvsc_register_vf() have
get_netvsc_byslot(vf_netdev), but __netvsc_vf_setup() doesn't have. So,
in case of multi-Vfs, this code will run "this" netvsc NIC with multiple VFs by
__netvsc_vf_setup() which isn't correct.

You need to add the following lines before netvsc_prepare_bonding(vf_netdev)
in netvsc_probe() to skip non-matching VFs:

if (net != get_netvsc_byslot(vf_netdev))
continue;

Thanks,
- Haiyang