Re: wifi: rtw88: question about SDIO RX aggregation limiting

From: Martin Blumenstingl
Date: Mon Jul 03 2023 - 17:25:42 EST


Hello Ping-Ke,

sorry again for the long waiting time. I'll be quicker next time.

On Tue, Jun 20, 2023 at 7:26 AM Ping-Ke Shih <pkshih@xxxxxxxxxxx> wrote:
[...]
> > > The unit of BIT_RXDMA_AGG_PG_TH is 1k bytes, so I think you can
> > > set mmc_host->max_req_size/1024.
> > I tried this but I got a result that I don't understand.
> > I've been testing with three BIT_RXDMA_AGG_PG_TH values on a SoC that
> > can handle 255 * 1024 bytes. Each time I connected to the same AP and
> > downloaded a bigger file over http(s).
> > BIT_RXDMA_AGG_PG_TH: biggest observed rx_len in rtw_sdio_rxfifo_recv()
> > 255: 20968
> > 6: 5122
> > 1: 1602
>
> Please also print out number of packets you receive, and then we can see how
> many packets aggregate.
sure - here are the results:
BIT_RXDMA_AGG_PG_TH: biggest observed rx_len in rtw_sdio_rxfifo_recv()
/ number of (aggregated) packets
255: 20824 / 12
6: 5128 / 4
1: 3132 / 1 (these were a few exceptions and I'm not able to reliably
reproduce it, 1602 / 1 is what I can easily reproduce)

> > The biggest rx_len I have observed for BIT_RXDMA_AGG_PG_TH 1 looks suspicious:
> > My understanding is that I shouldn't be seeing rx_len larger than
> > BIT_RXDMA_AGG_PG_TH * 1024.
> > BIT_RXDMA_AGG_PG_TH = 6 is within this limit but BIT_RXDMA_AGG_PG_TH =
> > 1 isn't (I'm seeing 578 extra bytes in addition to the 1024 bytes that
> > I was expecting).
>
> Assume threshold is 1k, and single one packet is larger than 1k. Hardware
> will not split it into two. Also, please make sure 0x280[29] BIT_EN_PRE_CALC
> is 1. Otherwise, it will possibly aggregate additional one packet to over
> the threshold.
>From the numbers above it seems most likely that we're hitting the
"one packet is larger than 1k" case.

Also I'm seeing:
wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP group default qlen 1000
My interface's MTU is 1500 bytes. Seeing 1602 bytes rx_len with one
packet is already odd (that would mean 102 bytes for overhead like RX
descriptor and other headers/metadata). But 3132 bytes rx_len is very
odd.

BIT_EN_PRE_CALC is set, see also the attached diff (it's not meant to
be applied anywhere - it's just so you understand what I've been
testing with).

> 0x280[15:8] is timeout time in unit of 1us for SDIO interface. When set
> threshold to 255, you can enlarge this to see if it can aggregate more as
> expected.
I did not experiment with this yet as I'd like to understand the above
findings first.


Best regards,
Martin
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c
index 2c1fb2dabd40..0bb872ef1296 100644
--- a/drivers/net/wireless/realtek/rtw88/sdio.c
+++ b/drivers/net/wireless/realtek/rtw88/sdio.c
@@ -654,23 +654,35 @@ static void rtw_sdio_init(struct rtw_dev *rtwdev)

static void rtw_sdio_enable_rx_aggregation(struct rtw_dev *rtwdev)
{
- u8 size, timeout;
+ struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv;
+ struct mmc_host *host = rtwsdio->sdio_func->card->host;
+ unsigned int host_max_pages;
+ u8 chip_max_pages, timeout;

if (rtw_chip_wcpu_11n(rtwdev)) {
- size = 0x6;
+ chip_max_pages = 0x6;
timeout = 0x6;
} else {
- size = 0xff;
+ chip_max_pages = 0xff;
timeout = 0x1;
}

+ host_max_pages = host->max_req_size / SZ_1K;
+
+ rtw_err(rtwdev, "Max RX pages - chip aggregation limit: %u, host controller limit: %u\n", chip_max_pages, host_max_pages);// HACK
+ rtw_dbg(rtwdev, RTW_DBG_SDIO,
+ "Max RX pages - chip aggregation limit: %u, host controller limit: %u\n",
+ chip_max_pages, host_max_pages);
+
/* Make the firmware honor the size limit configured below */
rtw_write32_set(rtwdev, REG_RXDMA_AGG_PG_TH, BIT_EN_PRE_CALC);

rtw_write8_set(rtwdev, REG_TXDMA_PQ_MAP, BIT_RXDMA_AGG_EN);

rtw_write16(rtwdev, REG_RXDMA_AGG_PG_TH,
- FIELD_PREP(BIT_RXDMA_AGG_PG_TH, size) |
+ FIELD_PREP(BIT_RXDMA_AGG_PG_TH, min_t(unsigned int,
+ chip_max_pages,
+ host_max_pages)) |
FIELD_PREP(BIT_DMA_AGG_TO_V1, timeout));

rtw_write8_set(rtwdev, REG_RXDMA_MODE, BIT_DMA_MODE);
@@ -936,6 +948,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)
{
struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv;
const struct rtw_chip_info *chip = rtwdev->chip;
+ unsigned int num_pkt = 0, orig_rx_len = rx_len;
u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
struct ieee80211_rx_status rx_status;
struct rtw_rx_pkt_stat pkt_stat;
@@ -974,6 +987,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)
*/
rtw_sdio_rx_skb(rtwdev, skb, pkt_offset, &pkt_stat,
&rx_status);
+ num_pkt++;
break;
}

@@ -981,6 +995,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)
if (!split_skb) {
rtw_sdio_rx_skb(rtwdev, skb, pkt_offset, &pkt_stat,
&rx_status);
+ num_pkt++;
break;
}

@@ -989,11 +1004,14 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)

rtw_sdio_rx_skb(rtwdev, split_skb, pkt_offset, &pkt_stat,
&rx_status);
+ num_pkt++;

/* Move to the start of the next RX descriptor */
skb_reserve(skb, curr_pkt_len);
rx_len -= curr_pkt_len;
}
+
+ pr_err("%s(%u) - number of packets: %u\n", __func__, orig_rx_len, num_pkt);// HACK
}

static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev)