RE: [PATCH V3 net-next] net: fec: add XDP_TX feature support

From: Wei Fang
Date: Tue Aug 08 2023 - 12:59:08 EST


> > For XDP_REDIRECT, the performance is as follows.
> > root@imx8mpevk:~# ./xdp_redirect eth1 eth0
> > Redirecting from eth1 (ifindex 3; driver st_gmac) to eth0 (ifindex 2; driver fec)
>
> This is not exactly the same setup as XDP_TX, since here you redirect
> from eth1 (driver st_gmac) to eth0 (driver fec).
>
> I would like to see eth0 to eth0 XDP_REDIRECT, so we can compare to
> XDP_TX performance.
> Sorry for all the requests, but can you provide those numbers?
>

Oh, sorry, I thought you wanted XDP_REDIRECT results between different
NICs. Below are the results of XDP_REDIRECT on the same NIC.
root@imx8mpevk:~# ./xdp_redirect eth0 eth0
Redirecting from eth0 (ifindex 2; driver fec) to eth0 (ifindex 2; driver fec)
Summary 232,302 rx/s 0 err,drop/s 232,344 xmit/s
Summary 234,579 rx/s 0 err,drop/s 234,577 xmit/s
Summary 235,548 rx/s 0 err,drop/s 235,549 xmit/s
Summary 234,704 rx/s 0 err,drop/s 234,703 xmit/s
Summary 235,504 rx/s 0 err,drop/s 235,504 xmit/s
Summary 235,223 rx/s 0 err,drop/s 235,224 xmit/s
Summary 234,509 rx/s 0 err,drop/s 234,507 xmit/s
Summary 235,481 rx/s 0 err,drop/s 235,482 xmit/s
Summary 234,684 rx/s 0 err,drop/s 234,683 xmit/s
Summary 235,520 rx/s 0 err,drop/s 235,520 xmit/s
Summary 235,461 rx/s 0 err,drop/s 235,461 xmit/s
Summary 234,627 rx/s 0 err,drop/s 234,627 xmit/s
Summary 235,611 rx/s 0 err,drop/s 235,611 xmit/s
Packets received : 3,053,753
Average packets/s : 234,904
Packets transmitted : 3,053,792
Average transmit/s : 234,907
>
> I'm puzzled that moving the MMIO write doesn't change performance.
>
> Can you please verify that the packet generator machine is sending more
> frames than the system can handle?
>
> (meaning, is the pktgen_sample03_burst_single_flow.sh script fast enough?)
>

Thanks very much!
You reminded me: in the previous tests I always started the pktgen script first
and then ran the xdp2 program. So when I stopped the script, the transmit rate
reported by the generator always looked higher than the XDP_TX rate. In reality,
the generator's real-time transmit rate had dropped to match the XDP_TX rate.

So I disabled the RX path on the generator, to keep the traffic returned by
xdp2 from adding CPU load on the generator, and tested the performance again.
Below are the results.

Result 1: current method
root@imx8mpevk:~# ./xdp2 eth0
proto 17: 326539 pkt/s
proto 17: 326464 pkt/s
proto 17: 326528 pkt/s
proto 17: 326465 pkt/s
proto 17: 326550 pkt/s

Result 2: dma_sync_len method
root@imx8mpevk:~# ./xdp2 eth0
proto 17: 353918 pkt/s
proto 17: 352923 pkt/s
proto 17: 353900 pkt/s
proto 17: 352672 pkt/s
proto 17: 353912 pkt/s

Note: the generator transmits at about 935,397 pps.

Comparing result 1 with result 2, the "dma_sync_len" method actually improves
XDP_TX performance, so the conclusion from the previous tests was *incorrect*.
I'm sorry about that. :(
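
For clarity, what I mean by the "dma_sync_len" method is roughly the sketch
below. The names are simplified placeholders rather than the exact V5 code: on
XDP_TX the page still carries the DMA mapping the page pool set up for RX, so
the driver only needs to sync the actual frame length for the device instead
of mapping the buffer again.

/* Sketch only (placeholder names): reuse the page_pool RX mapping and
 * sync just the bytes of the frame being transmitted.
 */
static int fec_xdp_tx_sketch(struct fec_enet_private *fep,
			     struct xdp_buff *xdp)
{
	struct page *page = virt_to_page(xdp->data);
	u32 dma_sync_len = xdp->data_end - xdp->data;	/* frame bytes only */
	u32 offset = xdp->data - page_address(page);	/* headroom offset */
	dma_addr_t dma_addr;

	/* The page was already DMA-mapped by the page pool for RX, so no
	 * new dma_map_single() is needed for XDP_TX.
	 */
	dma_addr = page_pool_get_dma_addr(page) + offset;

	/* Sync only dma_sync_len bytes for the device, not the full buffer */
	dma_sync_single_for_device(&fep->pdev->dev, dma_addr,
				   dma_sync_len, DMA_BIDIRECTIONAL);

	/* ... fill the TX buffer descriptor with dma_addr/dma_sync_len and
	 * kick the queue as usual ...
	 */
	return 0;
}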

In addition, I also tried the "dma_sync_len" method combined with not using
xdp_convert_buff_to_frame(), and the performance improved further. Below is the
result, followed by a rough sketch of what I mean.

Result 3: dma_sync_len + without xdp_convert_buff_to_frame()
root@imx8mpevk:~# ./xdp2 eth0
proto 17: 369261 pkt/s
proto 17: 369267 pkt/s
proto 17: 369206 pkt/s
proto 17: 369214 pkt/s
proto 17: 369126 pkt/s
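
To show what skipping xdp_convert_buff_to_frame() means here, the sketch below
contrasts the two paths. The helper names (fec_txq_xmit_frame_sketch,
fec_txq_xmit_buff_sketch) are placeholders, not the actual V5 functions.

/* Old XDP_TX path: convert the xdp_buff to an xdp_frame first. The
 * conversion writes frame metadata into the headroom, which costs extra
 * cache-line accesses for every packet.
 */
static int fec_xdp_tx_old_sketch(struct fec_enet_private *fep,
				 struct fec_enet_priv_tx_q *txq,
				 struct xdp_buff *xdp)
{
	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);

	if (unlikely(!xdpf))
		return -ENOMEM;
	return fec_txq_xmit_frame_sketch(fep, txq, xdpf);
}

/* Sketched new path: transmit straight from the xdp_buff. The TX routine
 * reads xdp->data / xdp->data_end itself and tags the tx_buf entry as an
 * XDP_TX buffer.
 */
static int fec_xdp_tx_new_sketch(struct fec_enet_private *fep,
				 struct fec_enet_priv_tx_q *txq,
				 struct xdp_buff *xdp, u32 dma_sync_len)
{
	return fec_txq_xmit_buff_sketch(fep, txq, xdp, dma_sync_len);
}

On the TX completion side, the XDP_TX buffers are then handed back to the RX
page pool (page_pool_put_page()) instead of going through xdp_return_frame(),
since the page was never converted to an xdp_frame and is still mapped.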

Therefore, I intend to use the "dma_sync_len" + no xdp_convert_buff_to_frame()
approach in the V5 patch. Thank you again, Jesper and Jakub. You really helped me a lot. :)