Re: [RFC net-next 2/3] net: dsa: qca8k: enable assisted learning on CPU port

From: Andre Valentin
Date: Tue Aug 10 2021 - 17:09:33 EST


On 10.08.21 at 19:53, Vladimir Oltean wrote:
> On Tue, Aug 10, 2021 at 07:27:05PM +0200, Andre Valentin wrote:
>> On Sun, Aug 08, 2021 at 1805, DENG Qingfang wrote:
>>> On Sun, Aug 08, 2021 at 01:25:55AM +0300, Vladimir Oltean wrote:
>>>> On Sat, Aug 07, 2021 at 08:07:25PM +0800, DENG Qingfang wrote:
>>>>> Enable assisted learning on CPU port to fix roaming issues.
>>>>
>>>> 'roaming issues' implies to me it suffered from blindness to MAC
>>>> addresses learned on foreign interfaces, which appears to not be true
>>>> since your previous patch removes hardware learning on the CPU port
>>>> (=> hardware learning on the CPU port was supported, so there were no
>>>> roaming issues)
>>
>> The issue is with a wifi AP bridged into dsa and previously learned
>> addresses.
>>
>> Test setup:
>> We have two wifi APs, a and b (with qca8k). The client is on AP a.
>>
>> The qca8k switch in AP b also sees the broadcast traffic from the client
>> and takes the address into its fdb.
>>
>> Now the client roams to AP b.
>> The client starts DHCP but does not get an IP. With tcpdump, I see the
>> packets going through the switch (ap->cpu port->ethernet port) and they
>> arrive at the DHCP server. It responds, the response packet reaches the
>> ethernet port of the qca8k, and is not forwarded.
>>
>> After about 3 minutes the fdb entry in the qca8k on AP b is
>> "cleaned up" and the client can immediately get its IP from the DHCP server.
>>
>> I hope this helps in understanding the background.
>
> How does this differ from what is described in commit d5f19486cee7
> ("net: dsa: listen for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE on foreign
> bridge neighbors")?
>
I got a bit lost; it is a bit different.

I've also been working a bit on an ipq807x device with such a switch on
OpenWrt. There is a backport of that commit. To fix the problems described
by d5f19486cee7, I enabled assisted_learning on qca8k, and it solves the
problem.
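
To make the stale-entry situation from the test setup concrete, here is a
toy model of a switch fdb. It is only a sketch and not the qca8k driver:
ToyFdb, dhcp_reply_fate, the port names and the 180 s ageing time (picked
to match the roughly 3 minutes observed) are all assumptions, and it
assumes the switch does not learn from frames entering via the CPU port.

```python
# Toy model of a switch fdb; NOT the qca8k driver. All names are hypothetical.
AGEING_S = 180  # assumed ageing time, roughly the ~3 minutes observed
CLIENT = "aa:bb:cc:dd:ee:ff"  # made-up client MAC address


class ToyFdb:
    def __init__(self, ageing_s=AGEING_S):
        self.ageing_s = ageing_s
        self.entries = {}  # mac -> (port, time the entry was last refreshed)

    def learn(self, mac, port, now):
        """Install or refresh an entry, as learning (hardware or assisted) would."""
        self.entries[mac] = (port, now)

    def forward(self, dst_mac, in_port, now):
        """Return 'flood', 'drop', or the egress port for a unicast frame."""
        entry = self.entries.get(dst_mac)
        if entry is None or now - entry[1] > self.ageing_s:
            # Unknown or aged-out destination: flood to all other ports.
            self.entries.pop(dst_mac, None)
            return "flood"
        port, _ = entry
        # A frame whose destination sits behind its ingress port is filtered.
        return "drop" if port == in_port else port


def dhcp_reply_fate(assisted_learning, reply_time):
    """What happens to the DHCP reply arriving on the ethernet port of AP b."""
    fdb = ToyFdb()
    # t=0: client is still on AP a; its broadcast is seen on the ethernet port.
    fdb.learn(CLIENT, "eth", now=0)
    # t=10: client roams to AP b and is now reachable via the CPU port.
    if assisted_learning:
        # Software installs an entry pointing at the CPU port.
        fdb.learn(CLIENT, "cpu", now=10)
    return fdb.forward(CLIENT, in_port="eth", now=reply_time)
```

In this model, dhcp_reply_fate(False, 11) gives "drop" (the stale entry
still points at the ethernet port), dhcp_reply_fate(False, 200) gives
"flood" (the entry has aged out), and dhcp_reply_fate(True, 11) gives "cpu".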

But initially, the device was unreachable until I generated traffic from the device
to a client (cpu port->ethernet). At first I did not notice this, because a wifi client
with its traffic immediately resolved the issue automatically.
Later I verified this in detail.

Hopefully DENG Qingfang's patches help, but I cannot prove it at the moment.