Re: [PATCH v2 0/3] net/ncsi: Add NCSI Intel OEM command to keep PHY link up

From: Paul Fertser
Date: Tue Jul 20 2021 - 10:21:53 EST


On Tue, Jul 20, 2021 at 05:00:40PM +0300, Ivan Mikhaylov wrote:
> > While the host is booted up and fully functional it assumes it has
> > full proper control of network cards, and sometimes it really needs to
> > reset them to e.g. recover from crashed firmware. The PHY resets might
> > also make sense in certain cases, and so in general having this "link
> > up" bit set all the time might be breaking assumptions.
>
> Paul, what kind of assumption would it break?

The host OS drivers assume they can fully control PCIe network
cards. Doing anything (including inhibiting PHY resets) behind their
backs might break assumptions the driver authors had. The bit in
question certainly makes the card behave in an unusual way, so no
wonder Intel didn't enable it by default.

I do not claim to know for a fact that it's problematic, but it doesn't
feel like "the right thing", so some edge cases might expose issues.

> Joel proposed it as a DTS option, which may help at runtime.

Sorry, I'm not following. If the BMC is fully booted, it's able to
configure NC-SI appropriately by a userspace action coordinated with
other BMC tasks. If the BMC is not yet ready, then we can't communicate
with it via Ethernet anyway. So I can't see when exactly it is going
to be helpful.

> Some of those commands should be applied after channel probe, I
> think, including PHY reset control.

Do you have any other commands in mind? So far I assumed we're
discussing just the one to mask PHY resets.

> > Ivan, so far I have the impression that the user-space solution would
> > be much easier, more flexible and more manageable, and that there's
> > no need for this command to be in Linux at all.
>
> Your image may not have a suitable environment that you can rely on for such
> things. There is SMAF for Mellanox, which is done in the same way, for example.

I can hardly imagine why an OS running on a BMC would use the code
in question and the appropriate DT configuration but lack the right
means in userspace to control it. What would be the use case?

If the network subsystem maintainers think this is a good idea, all
things considered, I'm fine with it. The I210 losing link exactly at
the time when you need it (to enter the UEFI interactive menu) is
super-annoying, so probably any fix is better than none :)

Thank you for the discussion.

--
Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software!
mailto:fercerpav@xxxxxxxxx