Re: [PATCH v8 net-next 05/11] net: ethernet: am65-cpsw: cleanup TAPRIO handling

From: Roger Quadros
Date: Thu Dec 14 2023 - 08:37:07 EST




On 14/12/2023 13:23, Vladimir Oltean wrote:
> On Wed, Dec 13, 2023 at 01:07:15PM +0200, Roger Quadros wrote:
>> +static int am65_cpsw_taprio_replace(struct net_device *ndev,
>> + struct tc_taprio_qopt_offload *taprio)
>> {
>> struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
>> + struct netlink_ext_ack *extack = taprio->mqprio.extack;
>> + struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
>> struct am65_cpts *cpts = common->cpts;
>> - int ret = 0, tact = TACT_PROG;
>> + struct am65_cpsw_est *est_new;
>> + int ret, tact;
>>
>> - am65_cpsw_est_update_state(ndev);
>> + if (!netif_running(ndev)) {
>> + NL_SET_ERR_MSG_MOD(extack, "interface is down, link speed unknown");
>> + return -ENETDOWN;
>> + }
>
> I haven't used the runtime PM API that this driver uses. I don't know
> much about how it works. What are the rules here? By checking for

The only rule is that if the network interface is down, the device might be
runtime-suspended, so we need to bring it back to runtime-active before any
device access.
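
To illustrate the rule, here is a rough user-space model. It is not the
kernel runtime PM API: struct mock_dev and all mock_*() helpers are made up
for this mail, and a real driver would use the pm_runtime_*() calls instead.
It only shows the idea of taking a reference around each device access so
that the access is safe whether or not the interface is up.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a runtime-PM managed device (not kernel code). */
struct mock_dev {
	int usage_count;	/* outstanding "get" references */
	bool active;		/* true while the device is powered */
	unsigned int reg;	/* pretend hardware register */
};

/* Take a reference; power the device up if it was suspended. */
static void mock_pm_get(struct mock_dev *dev)
{
	if (dev->usage_count++ == 0)
		dev->active = true;
}

/* Drop a reference; the device may suspend once nobody holds one. */
static void mock_pm_put(struct mock_dev *dev)
{
	if (--dev->usage_count == 0)
		dev->active = false;
}

/* Register write that is safe regardless of the netif state. */
static int mock_write_reg(struct mock_dev *dev, unsigned int val)
{
	mock_pm_get(dev);
	dev->reg = val;		/* device is guaranteed active here */
	mock_pm_put(dev);
	return 0;
}
```

If the interface is already open (modelled by an extra mock_pm_get() held
since open time), the inner get/put pair only bumps the counter and the
device stays active afterwards.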

> netif_running(), are you intending to rely on the pm_runtime_resume_and_get()
> call from ndo_open(), which is released with pm_runtime_put() at
> ndo_stop() time?

Actually, this code is already present upstream. I'm only moving it around
in this patch.

Based on the error message, and looking at am65_cpsw_est_check_scheds() and
am65_cpsw_est_set_sched_list(), which are called later in am65_cpsw_taprio_replace()
and both eventually call am65_est_cmd_ns_to_cnt(), which expects a valid link_speed,
my understanding is that the author intended to have a valid link_speed before
proceeding further.

However, it seems the netif_running() check isn't enough to guarantee a valid
link_speed, as the link could still be down even after the netif is brought up.
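
As a small illustration of why link_speed matters here: turning a time
interval into an amount of data on the wire needs the speed. The helper
below is hypothetical and is not the driver's am65_est_cmd_ns_to_cnt(); it
just assumes the speed is given in Mbps and that an unknown speed is
reported as 0 or a negative value.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not driver code): number of bytes that fit on the
 * wire in interval_ns at speed_mbps, rounded up.
 *   bits  = ns * Mbps / 1000
 *   bytes = bits / 8
 */
static int64_t example_ns_to_bytes(uint64_t interval_ns, int speed_mbps)
{
	if (speed_mbps <= 0)
		return -1;	/* link speed unknown: nothing sensible to compute */

	return (int64_t)((interval_ns * (uint64_t)speed_mbps + 7999) / 8000);
}
```

With this sketch a 1000 ns window is 125 bytes at 1000 Mbps but only 13 bytes
at 100 Mbps, and there is no meaningful answer while the link is down.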

Another gap is that in am65_cpsw_est_link_up(), if the link was down for more
than 1 second, it just abruptly calls am65_cpsw_taprio_destroy().

So I think we need to do the following to improve taprio support in this driver:
1) accept the taprio schedule irrespective of netif/link_speed status
2) call pm_runtime_get()/put() around any device access, regardless of netif/link_speed state
3) on link_up, when we have a valid link_speed and a taprio schedule, apply it.
4) on link_down, destroy the taprio schedule from the controller.
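
Very roughly, the flow above could look like the user-space model below.
None of these names exist in the driver: struct sketch_port, struct
sketch_sched and the sketch_*() functions are invented for this mail, and
the hardware programming is reduced to a flag. I'm also assuming that the
software copy of the schedule is kept across link down so it can be
re-applied later.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the proposed flow (not driver code). */
struct sketch_sched {
	unsigned int cycle_time_ns;
};

struct sketch_port {
	const struct sketch_sched *admin;	/* schedule requested by the user */
	bool hw_programmed;			/* true while the controller runs it */
	int link_speed;				/* Mbps, 0 while the link is down */
};

/* 1) accept the schedule irrespective of netif/link state */
static void sketch_taprio_replace(struct sketch_port *p,
				  const struct sketch_sched *s)
{
	p->admin = s;
	/* 2) a real driver would hold a runtime PM reference around this */
	if (p->link_speed > 0)
		p->hw_programmed = true;	/* link already up: apply now */
}

/* 3) on link up, apply a pending schedule now that link_speed is valid */
static void sketch_link_up(struct sketch_port *p, int speed)
{
	p->link_speed = speed;
	if (p->admin)
		p->hw_programmed = true;
}

/* 4) on link down, remove it from the controller but keep the request */
static void sketch_link_down(struct sketch_port *p)
{
	p->link_speed = 0;
	p->hw_programmed = false;
}
```

In this model a schedule installed while the link is down is simply
remembered, gets programmed on the next link up, and is re-programmed after
a link flap without user intervention.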

But my concern is that this is a decent amount of work and I don't want to delay this series.
The original subject of this patch series was mqprio/frame-preemption/coalescing. ;)

Can we please defer the taprio enhancements to a separate series? Thanks!

>
> I see some inconsistencies I don't quite understand.
>
> am65_cpsw_nuss_ndo_slave_add_vid() checks for netif_running() then calls
> pm_runtime_resume_and_get() anyway.
>
> am65_cpsw_setup_mqprio() allows changing the offload even when the link
> is down (which is more user-friendly anyway) and performs the pm_runtime_get_sync()
> call itself.
>
>> -}

--
cheers,
-roger