Re: This is the fourth time I’ve tried to find what led to the regression of outgoing network speed and each time I find the merge commit 8c94ccc7cd691472461448f98e2372c75849406c

From: Mathias Nyman
Date: Thu Feb 08 2024 - 04:28:48 EST


On 7.2.2024 13.55, Mikhail Gavrilov wrote:
On Wed, Feb 7, 2024 at 3:39 PM Mathias Nyman
<mathias.nyman@xxxxxxxxxxxxxxx> wrote:

Thanks,

Looks like your network adapter ends up interrupting CPU0 in the bad case due
to the change in how many interrupts are requested by xhci_hcd before it.

bad case:
         CPU0  CPU1  ...  CPU31
 87: 18213809     0  ...      0  IR-PCI-MSIX-0000:0e:00.0  0-edge  enp14s0

Does manually moving it to some other CPU help? Pick one that doesn't already
handle a lot of interrupts. CPU0 could also be busier in general, possibly spending
more time with interrupts disabled.

For example change to CPU23 in the bad case:

echo 800000 > /proc/irq/87/smp_affinity

Check from /proc/interrupts that the enp14s0 interrupts actually go to CPU23 after this.
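
For reference, the value written to smp_affinity is a hexadecimal bitmask with one bit per CPU, so CPU23 corresponds to bit 23. A minimal sketch of that calculation follows; the helper name irq_affinity_mask is made up for illustration, and the comma-separated 32-bit grouping for CPU numbers above 31 is my understanding of the mask format rather than something tested on this machine.

```python
def irq_affinity_mask(cpu: int) -> str:
    """Return a hex mask string that selects only the given CPU.

    Hypothetical helper for illustration; not part of any kernel tooling.
    """
    if cpu < 0:
        raise ValueError("CPU index must be non-negative")
    digits = format(1 << cpu, "x")  # set bit number `cpu`, print as hex
    # Assumption: masks wider than 32 bits are written as comma-separated
    # 32-bit groups, so split every 8 hex digits counting from the right.
    groups = []
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


# CPU23 -> bit 23 -> the value used in the echo command above
print(irq_affinity_mask(23))
```

The printed value is what would be echoed into /proc/irq/87/smp_affinity for CPU23.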

Thanks
Mathias


root@secondary-ws ~# iperf3 -c primary-ws.local -t 5 -p 5000 -P 1
Connecting to host primary-ws.local, port 5000
[ 5] local 192.168.1.130 port 49152 connected to 192.168.1.96 port 5000
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 70.9 MBytes 594 Mbits/sec 0 376 KBytes
[ 5] 1.00-2.00 sec 72.4 MBytes 607 Mbits/sec 0 431 KBytes
[ 5] 2.00-3.00 sec 73.1 MBytes 613 Mbits/sec 0 479 KBytes
[ 5] 3.00-4.00 sec 72.4 MBytes 607 Mbits/sec 0 501 KBytes
[ 5] 4.00-5.00 sec 73.2 MBytes 614 Mbits/sec 0 501 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-5.00 sec 362 MBytes 607 Mbits/sec 0 sender
[ 5] 0.00-5.00 sec 360 MBytes 603 Mbits/sec receiver

iperf Done.
root@secondary-ws ~# echo 800000 > /proc/irq/87/smp_affinity
root@secondary-ws ~# iperf3 -c primary-ws.local -t 5 -p 5000 -P 1
Connecting to host primary-ws.local, port 5000
[ 5] local 192.168.1.130 port 37620 connected to 192.168.1.96 port 5000
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 111 MBytes 934 Mbits/sec 0 621 KBytes
[ 5] 1.00-2.00 sec 109 MBytes 913 Mbits/sec 0 621 KBytes
[ 5] 2.00-3.00 sec 110 MBytes 920 Mbits/sec 0 621 KBytes
[ 5] 3.00-4.00 sec 110 MBytes 924 Mbits/sec 0 621 KBytes
[ 5] 4.00-5.00 sec 109 MBytes 917 Mbits/sec 0 621 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-5.00 sec 549 MBytes 921 Mbits/sec 0 sender
[ 5] 0.00-5.00 sec 547 MBytes 916 Mbits/sec receiver

iperf Done.

Very interesting. Is CPU0 really about 30% slower than CPU23?
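
To put a number on the gap, here is a small sketch using the two sender bitrates from the iperf3 summaries above (607 Mbits/sec with the interrupt on CPU0, 921 Mbits/sec on CPU23). The function name is arbitrary. Measured against the CPU23 result the CPU0 run is roughly 34% lower; measured the other way, the CPU23 run is roughly 52% higher.

```python
def relative_drop(slow: float, fast: float) -> float:
    """Percentage by which `slow` falls short of `fast`."""
    return (fast - slow) / fast * 100


bad = 607.0   # Mbits/sec, sender, enp14s0 interrupt handled by CPU0
good = 921.0  # Mbits/sec, sender, enp14s0 interrupt handled by CPU23

print(f"CPU0 run is {relative_drop(bad, good):.1f}% lower than the CPU23 run")
print(f"CPU23 run is {(good / bad - 1) * 100:.1f}% higher than the CPU0 run")
```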


My guess is that CPU0 spends more time with interrupts disabled than other CPUs,
either because it's handling interrupts from some other hardware, or because it's
running code that disables interrupts (for example kernel code inside spin_lock_irq),
and is thus not able to handle network adapter interrupts at the same rate as CPU23.
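
One rough way to check the first possibility is to add up the per-CPU columns of /proc/interrupts and see whether CPU0 stands out. The sketch below does that on a made-up three-CPU sample (the counts and the xhci_hcd row are invented, not taken from this machine); the function name is hypothetical. It only counts interrupts and says nothing about time spent with interrupts disabled.

```python
def per_cpu_interrupt_totals(text: str) -> dict:
    """Sum the interrupt counts in each CPU column of /proc/interrupts text."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return {}
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1", ...]
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = rest.split()[:len(cpus)]
        # Rows such as ERR carry a single global count, not one per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


# Invented sample data, for illustration only.
SAMPLE = """\
           CPU0       CPU1       CPU2
  0:         46          0          0   IO-APIC    2-edge      timer
 87:       1000          0          0   IR-PCI-MSIX-0000:0e:00.0    0-edge      enp14s0
 90:          5         20          7   IR-PCI-MSIX-0000:0f:00.0    0-edge      xhci_hcd
ERR:          0
"""

totals = per_cpu_interrupt_totals(SAMPLE)
busiest = max(totals, key=totals.get)
print(totals, "busiest:", busiest)
```

On a real system the same function could be fed the contents of /proc/interrupts, read once before and once after an iperf3 run, to compare how the counts grow per CPU.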

Thanks
Mathias