Re: [PATCH v2 net 1/3] net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet

From: Vladimir Oltean
Date: Mon Sep 05 2022 - 20:12:14 EST


On Tue, Sep 06, 2022 at 12:53:20AM +0200, Michael Walle wrote:
> I haven't looked at the overall code, but the solution described
> above sounds good.
>
> FWIW, I don't think such a schedule, where exactly one frame
> can be sent, is very likely in the wild though. Imagine a piece
> of software is generating one frame per cycle. It might happen
> that during one (hardware) cycle there is no frame ready (because
> it is software and it jitters), but then in the next cycle, there
> are now two frames ready. In that case you'll always lag one frame
> behind and you'll never recover from it.
>
> Either I'd make sure I can send at two frames in one cycle, or
> my software would only send a frame every other cycle.

A 10 us interval is a 10 us interval, it shouldn't matter if you slice
it up as one 1250B frame, or two 500B frames, or four 200B frames, etc.
Except with the Microchip hardware implementation, it does. In v1, we
were slicing the 10 us interval in half for useful traffic and half for
the guard band. So we could fit more small packets in 5 us. In v2, at
your proposal, we are slicing it in 33 ns for the useful traffic, and
10 us - 33 ns for the guard band. This indeed allows for a single
packet, be it big or small. It's how the hardware works; without any
other input data point, a slicing point needs to be put somewhere.
Somehow it's just as arbitrary in v2 as where it was in v1, just
optimized for a different metric which you're now saying is less practical.

By the way, I was a fool in last year's discussion on guard bands for
saying that there isn't any way for the user to control per-tc MTU.
IEEE 802.1Qbv, later standardized as IEEE 802.1Q clause 8.6.8.4
Enhancements for scheduled traffic, does contain a queueMaxSDUTable
structure with queueMaxSDU elements. I guess I have no choice except to
add this to the tc-taprio UAPI in a net-next patch, because as I've
explained above, even though I've solved the port hanging issue, this
hardware needs more fine tuning to obtain a differentiation between many
small packets vs few large packets per interval.