Re: [PATCH v2] tcp: fix connection reset due to tw hashdance race.

From: Eric Dumazet
Date: Thu Jun 15 2023 - 11:25:12 EST


On Thu, Jun 15, 2023 at 2:14 PM Duan,Muquan <duanmuquan@xxxxxxxxx> wrote:
>
> Hi, Eric,
>
> Could you please help check whether the following method makes sense, and
> any side effect? Thanks!
>
> If do the tw hashdance in real TCP_TIME_WAIT state with substate
> TCP_TIME_WAIT, instead of in substate TCP_FIN_WAIT2, the connection
> refused issue will not occur. The race of the lookup process for the
> new SYN segment and the tw hashdance can come to the following
> results:
> 1) get tw sock, SYN segment will be accepted via TCP_TW_SYN.
> 2) fail to get tw sock and original sock, get a listener sock,
> SYN segment will be accepted by listener sock.
> 3) get original sock, SYN segment can be discarded in further
> process after the sock is destroyed, in this case the peer will
> retransmit the SYN segment. This is a very rare case, seems no need to
> add lock for it.
>

You speak of SYN packet, while the original patch was all about FIN.

> I tested this modification in my reproducing environment,
> the connection reset issue did not occur and no performance impact
> observed.The side effect I currently can think out is that the original
> sock will live a little longer and hold some resources longer.
> The new patch is a temporary version which has a sysctl
> switch for comparing the 2 methods, and some modifications
> for statistics of the states not included.

So you are suggesting adding another hack/workaround for SYN packets,
on top of the other one ? What will follow next ?

Again, a correct fix needs to expand the scope of the ehash bucket lock.

I will handle this when time permits.