Re: [PATCH 13/13] net: ravb: Add runtime PM support

From: claudiu beznea
Date: Fri Nov 24 2023 - 13:03:34 EST




On 23.11.2023 21:19, Sergey Shtylyov wrote:
> On 11/23/23 8:04 PM, claudiu beznea wrote:
>
> [...]
>
>>>> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>
>>>> RZ/G3S supports enabling/disabling clocks for its modules (including
>>>> Ethernet module). For this commit adds runtime PM support which
>>>> relies on PM domain to enable/disable Ethernet clocks.
>>>
>>> That's not exactly something new in RZ/G3S. The ravb driver has unconditional
>>> RPM calls already in the probe() and remove() methods...
>>> And the sh_eth driver
>>> has RPM support since 2009...
>>>
>>>> At the end of probe ravb_pm_runtime_put() is called which will turn
>>>
>>> I'd suggest a shorter name, like ravb_rpm_put() but (looking at this function)
>>> it doesn't seem hardly needed...
>
> Does seem, sorry. :-)
>
>>>> off the Ethernet clocks (if no other request arrives at the driver).
>>>> After that if the interface is brought up (though ravb_open()) then
>>>> the clocks remain enabled until interface is brought down (operation
>>>> done though ravb_close()).
>>>>
>>>> If any request arrives to the driver while the interface is down the
>>>> clocks are enabled to serve the request and then disabled.
>>>>
>>>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>> ---
>>>> drivers/net/ethernet/renesas/ravb.h | 1 +
>>>> drivers/net/ethernet/renesas/ravb_main.c | 99 ++++++++++++++++++++++--
>>>> 2 files changed, 93 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
>>>> index c2d8d890031f..50f358472aab 100644
>>>> --- a/drivers/net/ethernet/renesas/ravb.h
>>>> +++ b/drivers/net/ethernet/renesas/ravb.h
>>>> @@ -1044,6 +1044,7 @@ struct ravb_hw_info {
>>>> unsigned magic_pkt:1; /* E-MAC supports magic packet detection */
>>>> unsigned half_duplex:1; /* E-MAC supports half duplex mode */
>>>> unsigned refclk_in_pd:1; /* Reference clock is part of a power domain. */
>>>> + unsigned rpm:1; /* Runtime PM available. */
>>>
>>> No, I don't think this flag makes any sense. We should support RPM
>>> unconditionally...
>
> If RPM calls work in the probe()/remove() methods, they should work
> in the ndo_{open|stop}() methods, right?

It might depend on hardware support... E.g.

I debugged further the issue I had with this implementation on other
SoCs, and it seems we cannot do RPM for those w/o reworking the way the
driver is configured.

I wiped out the RPM code from this patch and just called:

pm_runtime_put_sync(&pdev->dev); // [1]
usleep_range(300000, 400000); // [2]
pm_runtime_get_sync(&pdev->dev); // [3]

at the end of ravb_probe(). With this, the interface fails to work. I
continued debugging and read CSR, which reports RESET after [3]. I tried
to switch it back to configuration mode after [3], but that fails to
restore a proper working state.

Then I continued debugging to see what happens in the clock driver.
The clk enable/disable reaches the function at [4], which sets
control_regs[reg], one of the System Module Stop Control registers. Setting
this activates module standby (AFAICT). The switch to reset state on the
Ethernet IP might be backed by note (2) in the "Operating Mode Transitions
Due to Hardware" chapter of the G1H HW manual (which I don't fully understand).

Also, the G1H manual states for some IPs that register state is
preserved in standby mode, but not for AVB.
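
To make the failure mode above easier to follow, here is a minimal userspace
sketch (a toy model, not driver code). All toy_* names are hypothetical. It
assumes that module standby drops the IP back to reset mode and clears its
configuration unless the IP retains registers, and it assumes that a full
re-initialisation on resume would be enough to recover, which I have not
verified on hardware.

```c
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, nothing is ravb code. */
enum toy_opmode { TOY_RESET, TOY_CONFIG, TOY_OPERATION };

struct toy_avb {
	bool clocked;		/* module clock currently enabled */
	bool retains_regs;	/* assumption: registers survive module standby */
	enum toy_opmode mode;	/* stands in for the mode reported by CSR */
	unsigned int cfg;	/* stands in for settings written at probe time */
};

/* Models probe(): enable clock, go through config mode, then operation. */
void toy_probe(struct toy_avb *d)
{
	d->clocked = true;
	d->mode = TOY_CONFIG;
	d->cfg = 0xa5;
	d->mode = TOY_OPERATION;
}

/* Models step [1]: clock off, module enters standby. */
void toy_rpm_put(struct toy_avb *d)
{
	d->clocked = false;
	if (!d->retains_regs) {
		/* Assumed behaviour: state is lost, IP falls back to reset. */
		d->mode = TOY_RESET;
		d->cfg = 0;
	}
}

/* Models step [3]: clock on again, nothing else is restored. */
void toy_rpm_get(struct toy_avb *d)
{
	d->clocked = true;
}

bool toy_usable(const struct toy_avb *d)
{
	return d->clocked && d->mode == TOY_OPERATION && d->cfg == 0xa5;
}

/* Runs probe, [1], [3] and returns 1 if the interface is still usable. */
int toy_cycle(bool retains_regs, bool reinit_on_resume)
{
	struct toy_avb d = { .retains_regs = retains_regs };

	toy_probe(&d);
	toy_rpm_put(&d);
	toy_rpm_get(&d);

	/* Hypothetical rework: redo the whole configuration on resume. */
	if (reinit_on_resume && d.mode == TOY_RESET)
		toy_probe(&d);

	return toy_usable(&d);
}
```

In this model toy_cycle(false, false) corresponds to what I see on G1H (not
usable after [3]), toy_cycle(true, false) to an IP that keeps its registers
in standby, and toy_cycle(false, true) to the kind of driver rework that
would be needed.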

[4]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/renesas/renesas-cpg-mssr.c#n190


>
>> The reasons I've limited only to RZ/G3S are:
>> 1/ I don't have all the platforms to test it
>
> That's a usual problem with the kernel development...
>
>> 2/ on G1H this doesn't work. I tried to debugged it but I don't have a
>> platform at hand, only remotely, and is hardly to debug once the
>> ethernet fails to work: probe is working(), open is executed, PHY is
>> initialized and then TX/RX is not working... don't know why ATM.
>
> That's why we have the long bug fixing period after -rc1...

I prefer not to introduce any bug intentionally.

>
> [...]
>>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>>> index f4634ac0c972..d70ed7e5f7f6 100644
>>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>>>> @@ -145,12 +145,41 @@ static void ravb_read_mac_address(struct device_node *np,
> [...]
>>>> +}
>>>> +
>>>> static void ravb_mdio_ctrl(struct mdiobb_ctrl *ctrl, u32 mask, int set)
>>>> {
>>>> struct ravb_private *priv = container_of(ctrl, struct ravb_private,
>>>> mdiobb);
>>>> + int ret;
>>>> +
>>>> + ret = ravb_pm_runtime_get(priv);
>>>> + if (ret < 0)
>>>> + return;
>>>>
>>>> ravb_modify(priv->ndev, PIR, mask, set ? mask : 0);
>>>> +
>>>> + ravb_pm_runtime_put(priv);
>>>
>>> Hmm, does this even work? :-/ Do the MDIO bits retain the values while
>>> the AVB core is not clocked or even powered down?
>>
>> This actually is not needed. It's a leftover. I double checked with
>> mii-tools to access the device while the interface is down and the IOCTL is
>> blocked in this case by
>> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/renesas/ravb_main.c#L2266
>
> Have you tested with ethtool as well?
>
>>> Note that the sh_eth driver has RPM calls in the {read|write}_c{22?45}()
>
> s/?/|/,
>
>>> methods which do the full register read/write while the core is powere up
>
> Powered.
>
>>> and clocked...
>>>
>>> [...]
>>>> @@ -2064,6 +2107,11 @@ static struct net_device_stats *ravb_get_stats(struct net_device *ndev)
>>>> struct ravb_private *priv = netdev_priv(ndev);
>>>> const struct ravb_hw_info *info = priv->info;
>>>> struct net_device_stats *nstats, *stats0, *stats1;
>>>> + int ret;
>>>> +
>>>> + ret = ravb_pm_runtime_get(priv);
>>>> + if (ret < 0)
>>>> + return NULL;
>>>
>>> Hm, sh_eth.c doesn't have any RPM calls in this method. Again, do
>>
>> In setups where systemd is enabled, user space calls this method in
>> different stages (e.g. at boot time or when running ifconfig ethX, even if
>> interface is down). W/o runtime resuming here the system will fail to boot.
>>
>> The other approach I wanted to take was to:
>>
>> if (!netif_running(dev))
>> return &ndev->stats;
>>
>> But I didn't choose this path as there are some counters updated to nstat
>> only in this function, e.g. nstats->tx_dropped += ravb_read(ndev, TROCR);
>> and wanted an opinion about it.
>
> Have you seen the following commit (that I've already posted for you on
> IRC)?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7fa2955ff70ce4532f144d26b8a087095f9c9ffc
>
> Looks like the RPM calls won't do here...
>
>>> the hardware counters remain valid across powering the MAC core down?
>>
>> The power domain that the Ethernet clocks of RZ/G3S belong disables the
>> clock and switches the Ethernet module to standby. There is no information
>> in HW manual that the content of registers will be lost.
>
> That's what your current PD driver does... isn't it possible that
> in some new SoCs the PD would be completely powered off?
>
> [...]
>>>> @@ -2115,11 +2165,18 @@ static void ravb_set_rx_mode(struct net_device *ndev)
>>>> {
>>>> struct ravb_private *priv = netdev_priv(ndev);
>>>> unsigned long flags;
>>>> + int ret;
>>>> +
>>>> + ret = ravb_pm_runtime_get(priv);
>>>> + if (ret < 0)
>>>> + return;
>>>
>>> Hm, sh_eth.c doesn't have any RPM calls in this method either.
>>> Does changing the promiscous mode have sense for an offlined interface?
>>
>> I've added it for scenarios when the interface is down and user tries to
>> configure it. I don't know to answer your question. W/o RPM resume here
>> user space blocks if tries to access it and interface is down. I can just
>> return if interface is down. Let me know if you prefer this way.
>
> Looking at __dev_set_rx_mode(), the method gets only called when
> (dev->flags & IFF_UP) is true -- but that contradicts your experience,
> it seems... However, looking at net/core/dev_addr_lists.c, that function
> is called from the atomic contexts, so please just return early.
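
The early return suggested above could look roughly like the following
userspace sketch. It is only an illustration: struct toy_netdev and
toy_set_rx_mode() are hypothetical stand-ins, and the running flag plays the
role of netif_running(). The point is that the callback never needs a
(possibly sleeping) runtime resume, because it does not touch the hardware
while the interface is down.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the net device plus its private data. */
struct toy_netdev {
	bool running;			/* plays the role of netif_running() */
	unsigned int hw_rx_mode;	/* stands in for the RX mode register */
	unsigned int hw_writes;		/* counts simulated register writes */
};

/*
 * Models an rx-mode callback that may run in atomic context:
 * no RPM get/put here, just bail out when the interface is down.
 * Returns true if the (simulated) hardware was touched.
 */
bool toy_set_rx_mode(struct toy_netdev *nd, bool promisc)
{
	if (!nd->running)
		return false;	/* interface down: clocks may be off */

	nd->hw_rx_mode = promisc ? 1 : 0;
	nd->hw_writes++;
	return true;
}
```

With the interface down the call is a no-op, so the requested mode would
have to be applied from the open path when the interface is brought up;
that part is not modelled here.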
>
>>> [...]
>>>> @@ -2187,6 +2244,11 @@ static int ravb_close(struct net_device *ndev)
>>>> if (info->nc_queues)
>>>> ravb_ring_free(ndev, RAVB_NC);
>>>>
>>>> + /* Note that if RPM is enabled on plaforms with ccc_gac=1 this needs to be
>>>
>>> It's "platforms". :-)
>>>
>>>> skipped and
>>>
>>> Overly long line?
>>
>> Not more than 100 chars. Do you want it to 80?
>
> Yes, it's not the code, no need to go beyond 80 cols, I think...
>
> [...]
>
> MBR, Sergey