Re: [PATCH] tcp: fix race condition when creating child sockets from syncookies

From: Eric Dumazet
Date: Fri Oct 23 2020 - 11:56:21 EST


On Fri, Oct 23, 2020 at 5:51 PM Ricardo Dias <rdias@xxxxxxxxxx> wrote:
>
> On Fri, Oct 23, 2020 at 04:03:27PM +0200, Eric Dumazet wrote:
> > On Fri, Oct 23, 2020 at 1:14 PM Ricardo Dias <rdias@xxxxxxxxxx> wrote:
> > >
> > > When the TCP stack is in SYN flood mode, the server child socket is
> > > created from the SYN cookie received in a TCP packet with the ACK flag
> > > set.
> > >
> > ...
> >
> > This patch only handles IPv4, unless I am missing something?
>
> Yes, currently the patch only handles IPv4. I'll improve it to also
> handle the IPv6 case.
>
> >
> > It looks like the fix should be done in inet_ehash_insert(), not
> > adding yet another helper in TCP.
> > This would be family generic.
>
> Ok, sounds good as long as there is no problem in changing the
> signature and semantics of the inet_ehash_insert() function, as well as
> changing the inet_ehash_nolisten() function.
>
> >
> > Note that normally, all packets for the same 4-tuple should be handled
> > by the same cpu, so this race is quite unlikely to happen in standard
> > setups.
>
> I was able to write a small client/server program that used the
> loopback interface to create connections, which could hit the race
> condition in 1/200 runs.
>
> The server, on accepting a connection, sends an 8 byte identifier to
> the client, and then waits for the client to echo the same identifier.
> The client creates hundreds of simultaneous connections to the server,
> and in each connection it sends one byte as soon as the connection is
> established, then reads the 8 byte identifier from the server and sends
> it back to the server.
>
> When we hit the race condition, one of the server connections gets an 8
> byte identifier different from its own identifier.

That is on loopback, right?

A server under syn flood is usually hit on a physical NIC, and a NIC
will always put all packets of a TCP flow in a single RX queue.
The cpu associated with this single RX queue won't process two packets
in parallel.
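
To illustrate the steering argument, here is a rough sketch (Python, not
kernel code). Everything in it is an assumption made for illustration: the
function name, the queue count, and the CRC32 hash, which only stands in
for whatever hash the NIC actually uses (real NICs typically use a keyed
Toeplitz hash for RSS). The only point is that the queue index is a pure
function of the 4-tuple, so the ACK carrying the syncookie and a data
segment right behind it land in the same RX queue.

```python
import ipaddress
import zlib

NUM_RX_QUEUES = 8  # hypothetical queue count, chosen for illustration


def rx_queue_for(saddr, sport, daddr, dport, num_queues=NUM_RX_QUEUES):
    """Map a TCP 4-tuple to an RX queue index.

    CRC32 is a stand-in hash (assumption); only determinism matters here.
    """
    key = (
        ipaddress.ip_address(saddr).packed
        + sport.to_bytes(2, "big")
        + ipaddress.ip_address(daddr).packed
        + dport.to_bytes(2, "big")
    )
    return zlib.crc32(key) % num_queues


# The syncookie ACK and the data segment that follows share one 4-tuple,
# so they are steered to the same queue.
flow = ("192.0.2.10", 40000, "198.51.100.1", 80)
ack_queue = rx_queue_for(*flow)
data_queue = rx_queue_for(*flow)
```

On loopback there is no such per-flow queue steering by a NIC, which is
presumably why the test program can get two packets of one flow processed
on different cpus.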

Note this issue is known; someone tried to fix it in the past, but the
attempt went nowhere.