Re: [PATCH RESEND net-next 0/5] Improve the taprio qdisc's relationship with its children

From: Vladimir Oltean
Date: Mon Jun 05 2023 - 12:54:10 EST


Hi Jamal,

On Mon, Jun 05, 2023 at 11:44:17AM -0400, Jamal Hadi Salim wrote:
> I havent been following - but if you show me sample intended tc
> configs for both s/w and hardware offloads i can comment.

There is not much difference in usage between the two modes. IMO the software
data path logic is only a simulation, for demonstration purposes, of what the
shaper is intended to do. If hardware offload is available, it is always
preferable. Otherwise, I'm not sure whether anyone uses the pure software
scheduling mode (i.e. also without txtime assist) for a real-life use case.

I was working with something like this for testing the code paths affected
by these changes:

#!/bin/bash

add_taprio()
{
	local offload=$1
	local extra_flags

	case $offload in
	true)
		extra_flags="flags 0x2"
		;;
	false)
		extra_flags="clockid CLOCK_TAI"
		;;
	esac

	tc qdisc replace dev eno0 handle 8001: parent root stab overhead 24 taprio \
		num_tc 8 \
		map 0 1 2 3 4 5 6 7 \
		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
		max-sdu 0 0 0 0 0 200 0 0 \
		base-time 200 \
		sched-entry S 80 20000 \
		sched-entry S a0 20000 \
		sched-entry S 5f 60000 \
		$extra_flags
}

add_cbs()
{
	local offload=$1
	local extra_flags

	case $offload in
	true)
		extra_flags="offload 1"
		;;
	false)
		extra_flags=""
		;;
	esac

	max_frame_size=1500
	data_rate_kbps=20000
	port_transmit_rate_kbps=1000000
	idleslope=$data_rate_kbps
	sendslope=$(($idleslope - $port_transmit_rate_kbps))
	locredit=$(($max_frame_size * $sendslope / $port_transmit_rate_kbps))
	hicredit=$(($max_frame_size * $idleslope / $port_transmit_rate_kbps))
	tc qdisc replace dev eno0 parent 8001:8 cbs \
		idleslope $idleslope \
		sendslope $sendslope \
		hicredit $hicredit \
		locredit $locredit \
		$extra_flags
}

# this should always fail (taprio is only accepted as the root qdisc)
add_second_taprio()
{
	tc qdisc replace dev eno0 parent 8001:7 taprio \
		num_tc 8 \
		map 0 1 2 3 4 5 6 7 \
		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
		max-sdu 0 0 0 0 0 200 0 0 \
		base-time 200 \
		sched-entry S 80 20000 \
		sched-entry S a0 20000 \
		sched-entry S 5f 60000 \
		clockid CLOCK_TAI
}

ip link set eno0 up

echo "Offload:"
add_taprio true
add_cbs true
add_second_taprio
mausezahn eno0 -t ip -b 00:04:9f:05:f6:27 -c 100 -p 60
sleep 5
tc -s class show dev eno0
tc qdisc del dev eno0 root

echo "Software:"
add_taprio false
add_cbs false
add_second_taprio
mausezahn eno0 -t ip -b 00:04:9f:05:f6:27 -c 100 -p 60
sleep 5
tc -s class show dev eno0
tc qdisc del dev eno0 root
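
As a sanity check on the numbers used above, here is a minimal sketch (not
part of the patch set) that prints which traffic classes each sched-entry
gate mask opens, the resulting cycle length, and the CBS parameters that
add_cbs() computes. The helper names decode_gates() and cbs_params() are
made up for illustration, and the sketch assumes that bit N of the gate mask
corresponds to traffic class N:

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only: they do not call tc, they just
# reproduce the arithmetic behind the arguments passed in the script above.

# Decode a hex gate mask into the list of open traffic classes.
# Assumption: bit N of the mask corresponds to traffic class N.
decode_gates()
{
	local mask tc open
	mask=$((0x$1))
	open=""

	for tc in 0 1 2 3 4 5 6 7; do
		if [ $(( mask & (1 << tc) )) -ne 0 ]; then
			open="$open $tc"
		fi
	done
	# Strip the leading space left by the loop
	echo "${open# }"
}

# Same integer arithmetic as add_cbs():
# $1 = max frame size, $2 = idleslope (kbps), $3 = port rate (kbps)
cbs_params()
{
	local max_frame_size idleslope port_rate sendslope locredit hicredit
	max_frame_size=$1
	idleslope=$2
	port_rate=$3
	sendslope=$((idleslope - port_rate))
	locredit=$((max_frame_size * sendslope / port_rate))
	hicredit=$((max_frame_size * idleslope / port_rate))
	echo "idleslope $idleslope sendslope $sendslope hicredit $hicredit locredit $locredit"
}

for mask in 80 a0 5f; do
	echo "gates 0x$mask: TC $(decode_gates $mask)"
done
# Sum of the three sched-entry intervals, in nanoseconds
echo "cycle: $((20000 + 20000 + 60000)) ns"
cbs_params 1500 20000 1000000
```

Under that assumption, 0x80 opens TC 7 only, 0xa0 opens TC 5 and 7, and 0x5f
opens TC 0-4 and 6, over a 100000 ns cycle. The CBS arithmetic gives
sendslope -980000, hicredit 30 and locredit -1470.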

> In my cursory look i assumed you wanted to go along the path of mqprio
> where nothing much happens in the s/w datapath other than requeues
> when the tx hardware path is busy (notice it is missing an
> enqueue/deque ops). In that case the hardware selection is essentially
> of a DMA ring based on skb tags. It seems you took it up a notch by
> infact having a choice of whether to have pure s/w or offload path.

Yes. Actually the original taprio design always had the enqueue()/dequeue()
ops involved in the data path, then commit 13511704f8d7 ("net: taprio
offload: enforce qdisc to netdev queue mapping") retrofitted the mqprio
model when using the "flags 0x2" argument.

If you have time to read, the discussion behind that redesign was here:
https://lore.kernel.org/netdev/20210511171829.17181-1-yannick.vignon@xxxxxxxxxxx/