Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

From: Jacob Keller
Date: Thu Nov 30 2023 - 17:25:17 EST




On 11/29/2023 3:40 PM, Grant Grundler wrote:
> On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller <jacob.e.keller@xxxxxxxxx> wrote:
>> On 11/29/2023 7:53 AM, ChunHao Lin wrote:
>>> When FIFO reaches near full state, device will issue pause frame.
>>> If pause slot is enabled(set to 1), in this time, device will issue
>>> pause frame only once. But if pause slot is disabled(set to 0), device
>>> will keep sending pause frames until FIFO reaches near empty state.
>>>
>>> When pause slot is disabled, if there is no one to handle receive
>>> packets, device FIFO will reach near full state and keep sending
>>> pause frames. That will impact entire local area network.
>>>
>>> This issue can be reproduced in Chromebox (not Chromebook) in
>>> developer mode running a test image (and v5.10 kernel):
>>> 1) ping -f $CHROMEBOX (from workstation on same local network)
>>> 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
>>> 3) ping $ROUTER (wait until ping fails from workstation)
>>>
>>> Takes about ~20-30 seconds after step 2 for the local network to
>>> stop working.
>>>
>>> Fix this issue by enabling pause slot to only send pause frame once
>>> when FIFO reaches near full state.
>>>
>>
>> Makes sense. Avoiding the spam is good. The naming is a bit confusing
>> but I guess that comes from realtek datasheet?
>
> I don't know. It doesn't matter to me what it's called since I don't
> have access to the data sheet anyway. :/
>

The name is fine, i just found it a bit hard to parse since its
effectively "PAUSE_SLOT_ON" makes us *not* send pause frames forever.

I think its fine as-is, since this is referring to the use of the pause
slot in hardware.

>>> Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
>>> Reported-by: Grant Grundler <grundler@xxxxxxxxxxxx>
>>> Tested-by: Grant Grundler <grundler@xxxxxxxxxxxx>
>>> Cc: stable@xxxxxxxxxxxxxxx
>>> Signed-off-by: ChunHao Lin <hau@xxxxxxxxxxx>
>>> ---
>>> v2:
>>> - update comment and title.
>>> ---
>>> drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>> index 62cabeeb842a..bb787a52bc75 100644
>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>> @@ -196,6 +196,7 @@ enum rtl_registers {
>>> /* No threshold before first PCI xfer */
>>> #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
>>> #define RX_EARLY_OFF (1 << 11)
>>> +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
>>
>> This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that
>> from a different set of devices?
>
> Yes, for a different HW version of the device.
>

Great.

>> We're writing to the same register
>> RxConfig here I think in both cases?
>
> Yes. But to different versions of the HW which use this bit
> differently. Ergo the comment about "8125b and later".
>
>> Can you clarify if these are supposed to be the same bit?
>
> Yes, they are the same bit - but different versions of HW use BIT(11)
> differently.

Thanks for the clarification!

>
>>
>>> #define RXCFG_DMA_SHIFT 8
>>> /* Unlimited maximum PCI burst. */
>>> #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
>>> @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
>>> case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
>>> RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
>>> break;
>>> - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
>>> + case RTL_GIGA_MAC_VER_61:
>>> RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
>>> break;
>>
>> I assume there isn't a VER_62 between these?
>
> Correct. My clue is this code near the top of this file:
>
> 149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3},
> 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */
> 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2},
>
>>
>>> + case RTL_GIGA_MAC_VER_63:
>>> + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
>>> + RX_PAUSE_SLOT_ON);
>>
>> We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes
>> sense.
>
> Exactly.
>
> thanks for reviewing!
>

Great.

For the record:

Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx>