Re: [PATCH v2] tcp: fix connection reset due to tw hashdance race.

From: Eric Dumazet
Date: Thu Jun 08 2023 - 07:54:26 EST


On Thu, Jun 8, 2023 at 1:24 PM Duan,Muquan <duanmuquan@xxxxxxxxx> wrote:
>
> Besides trying to find the right tw sock, another idea is that if a FIN segment finds a listener sock, we just discard the segment, because this is obviously a bad case, and the peer will retransmit it. Or, for a FIN segment, we only look up in the established hash table, and if no socket is found we discard it.
>

Sure, please give the RFC number and section number that discusses
this point, and then we might consider this.

Just another reminder about TW: timewait sockets are "best effort".

Their allocation can fail, and /proc/sys/net/ipv4/tcp_max_tw_buckets
can limit their number, even down to 0.
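
As an illustration of the limit mentioned above, here is a minimal Python sketch that reads the current value of the sysctl. The helper name read_max_tw_buckets and its path parameter are hypothetical, introduced only for this example; the only thing taken from this thread is the /proc path itself.

```python
from pathlib import Path

# Path named in this thread; it exists only on Linux with IPv4 enabled.
TW_BUCKETS_PATH = "/proc/sys/net/ipv4/tcp_max_tw_buckets"


def read_max_tw_buckets(path=TW_BUCKETS_PATH):
    """Return the configured timewait socket limit, or None if unreadable.

    Hypothetical helper: `path` is a parameter so the function can be
    pointed at any file holding a single integer.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (non-Linux host), permission denied, or not an integer.
        return None


if __name__ == "__main__":
    limit = read_max_tw_buckets()
    if limit is None:
        print("tcp_max_tw_buckets is not readable on this host")
    else:
        print(f"tcp_max_tw_buckets = {limit}")
```

A value of 0 here would mean no timewait sockets are kept at all, which is one reason applications cannot rely on them.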

Applications must be able to recover gracefully if a 4-tuple is reused too fast.
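
What "recover gracefully" could look like in userland, as a rough sketch only: retry the whole operation on a fresh connection when the peer resets it. The helper retry_on_reset, its parameters, and the backoff policy are assumptions made for this example, not something prescribed in this thread or by any RFC.

```python
import time


def retry_on_reset(operation, attempts=3, delay=0.1):
    """Run `operation`, retrying if the connection is reset or refused.

    Hypothetical helper: `operation` is any callable that opens a NEW
    connection and performs the exchange, so each retry starts clean.
    Waits delay, 2*delay, 4*delay, ... seconds between attempts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionResetError, ConnectionRefusedError):
            if attempt + 1 == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * (2 ** attempt))
```

A caller would wrap its connect-and-exchange logic in a function and pass it in, for example retry_on_reset(lambda: fetch_once(host, port)), where fetch_once is a placeholder for the application's own code. This is only safe when the operation is idempotent or the application can otherwise tell that a request was not processed.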

>
> On Jun 8, 2023, at 12:13 PM, Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Thu, Jun 8, 2023 at 5:59 AM Duan,Muquan <duanmuquan@xxxxxxxxx> wrote:
>
>
> Hi, Eric,
>
> Thanks a lot for your explanation!
>
> Even if we add a reader lock, if we set the refcnt outside spin_lock()/spin_unlock(), then during the interval between spin_unlock() and refcnt_set(), other CPUs will see the tw sock with refcnt 0, and the refcnt validation will fail.
>
> A suggestion: before the tw sock is added to the ehash table, it is already used by the tw timer and the bhash chain, so we can first set the refcnt to 2 before adding the tw sock to the ehash table, or increment the refcnt one by one for the timer, bhash and ehash. This avoids the refcnt validation failure on other CPUs.
>
> For our product, this reduces the frequency of the connection reset issue from once every 20 min to once every 180 min. We may wait quite a long time before the best solution is ready; if this obvious defect is fixed in the meantime, userland applications can benefit from it.
>
> Looking forward to your opinions!
>
>
> Again, my opinion is that we need a proper fix, not workarounds.
>
> I will work on this a bit later.
>
> In the meantime you can apply your patch locally if you feel this is
> what you want.
>
>