Re: [PATCH net-next v4] net: ti: icssg_prueth: add TAPRIO offload support

From: Vladimir Oltean
Date: Wed Jan 17 2024 - 20:28:01 EST


On Mon, Jan 15, 2024 at 12:24:12PM +0530, MD Danish Anwar wrote:
> > I believe the intention is for this code to be run before any taprio
> > offload is added, correct? But it is possible for the user to add an
>
> Yes, the intention here is to run this code before any taprio offload is
> added.

Then it is misplaced?

> > offloaded Qdisc even while the netdev has not yet been brought up.
> > Is that case handled correctly, or will it simply result in NULL pointer
> > dereferences (tas->config_list)?
> >
>
> In that case, it will eventually result in NULL pointer dereference as
> tas->config_list will be pointing to NULL. To handle this correctly we
> can add the below check in emac_taprio_replace().
>
> if (!ndev_running(ndev)) {
> netdev_err(ndev, "Device is not running");
> return -EINVAL;
> }

What is the reason for which the device has to be running, other than
your placement of icssg_qos_tas_init()?

> >> +
> >> + cycle_time = admin_list->cycle_time - 4; /* -4ns to compensate for IEP wraparound time */
> >
> > Details? Doesn't this make the phase alignment of the schedule diverge
> > from what the user expects?
>
> 4ns is needed to compensate for IEP wraparound time. IEP is the clock
> used by ICSSG driver. IEP tick is 4ns and we adjust this 4ns whenever
> calculating cycle_time. You may refer to [1] for details on IEP driver.

What is understood by "IEP wraparound time"? Its time wraps around what?
It wraps around exactly once every taprio cycle of each port and that's
why the cycle-time is compensated, or how does that work?

> >
> >> + base_time = admin_list->base_time;
> >> + cur_time = prueth_iep_gettime(emac, &sts);
> >> +
> >> + if (base_time > cur_time)
> >> + change_cycle_count = DIV_ROUND_UP_ULL(base_time - cur_time, cycle_time);
> >> + else
> >> + change_cycle_count = 1;
> >
> > I see that the base_time is only used to calculate the number of cycles
> > relative to cur_time. Taprio users want to specify a basetime value
> > which indicates the phase alignment of the schedule. This is important
> > when the device is synchronized over PTP with other switches in the
> > network. Can you explain how is the basetime taken into consideration in
> > your implementation?
> >
>
> In this implementation base_time is used only to obtain the
> change_cycle_count and to write it to TAS_CONFIG_CHANGE_CYCLE_COUNT. In
> this implementation base_time is not used for anything else.

So there is zero granularity in the base-time beyond the number of cycles?
That is very bad, because it means the hardware cannot be used in a
practical TSN network where schedules are offset in phase to each other.
It needs to be able to be told when the schedule begins, with precision.
Not just how many cycles from now (what does 'now' even mean?).

> > Better to say what's the hardware maximum, than to report back num_entries
> > as being not supported.
> >
>
> Sure, I'll change it to below,
>
> if (taprio->num_entries > TAS_MAX_CMD_LISTS) {
> NL_SET_ERR_MSG_FMT_MOD(taprio->extack, "num_entries %ld is more than maximum supported entries %ld in taprio config\n",
> taprio->num_entries, TAS_MAX_CMD_LISTS);
> return -EINVAL;
> }

Keep in mind that NETLINK_MAX_FMTMSG_LEN is only 80 characters. Also, \n
is not needed in netlink extack messages. And indentation also looks off.

> >> +
> >> + emac_cp_taprio(taprio, est_new);
> >> + emac->qos.tas.taprio_admin = est_new;
> >> + ret = tas_update_oper_list(emac);
> >> + if (ret)
> >> + return ret;
> >> +
> >> + ret = tas_set_state(emac, TAS_STATE_ENABLE);
> >> + if (ret)
> >> + devm_kfree(&ndev->dev, est_new);
> >> +
> >> + return ret;
> >> +}
>
> Below is how the code will look like.
>
> emac->qos.tas.taprio_admin = taprio_offload_get(taprio);

emac->qos.tas.taprio_admin can also hold an old offload, which is leaked
here when assigning the new one ("tc qdisc replace dev eth0 root taprio").

> ret = tas_update_oper_list(emac);
> if (ret)
> return ret;
>
> ret = tas_set_state(emac, TAS_STATE_ENABLE);
> if (ret) {
> emac->qos.tas.taprio_admin = NULL;
> taprio_offload_free(taprio);
> }
>
> return ret;
>
> Please let me know if all of these changes looks ok, I'll resend the
> patch once you confirm. Thanks for reviewing.

Hard to say from this snippet. taprio_offload_free() will be needed from
emac_taprio_destroy() as well.