Re: [PATCH net] tcp: note that tcp_rmem[1] has a limited range

From: Ivan Babrou
Date: Thu Jan 06 2022 - 17:42:08 EST


On Thu, Jan 6, 2022 at 12:25 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:

> Just to clarify, normal TCP 3WHS has a final ACK packet, where window
> scaling is enabled.

Correct, yet this final ACK packet won't signal the initial scaled
window above 64k. That's what I'm trying to document, as it seems like
a useful thing to keep in mind. If this statement is incorrect, then
I'm definitely missing something very basic. Let me know if that's the
case.
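
To make the arithmetic concrete, here is a rough sketch (not kernel code)
of how the 16-bit window field relates to the advertised window. The helper
names and the example shift of 7 are my own assumptions for illustration.
The two facts it relies on are from RFC 7323: the window field in a segment
with SYN set is never scaled, and the shift count is at most 14.

```python
MAX_WSCALE = 14  # RFC 7323 upper bound on the window scale shift count


def window_field(window_bytes, wscale, syn=False):
    """Return the 16-bit window field a segment would carry.

    Hypothetical helper: segments with SYN set carry an unscaled window,
    all later segments carry the window right-shifted by the scale factor.
    """
    if not 0 <= wscale <= MAX_WSCALE:
        raise ValueError("window scale must be in 0..14")
    shift = 0 if syn else wscale
    return min(window_bytes >> shift, 0xFFFF)


def advertised_bytes(field, wscale, syn=False):
    """Return the window in bytes that the peer derives from the field."""
    shift = 0 if syn else wscale
    return field << shift


# A client that would like to advertise 250000 bytes, with an assumed shift of 7:
syn_field = window_field(250_000, 7, syn=True)  # clamped to the 16-bit maximum
ack_field = window_field(250_000, 7)            # what a scaled ACK could carry
```

With these assumed numbers, the scaled field in the first non-SYN ACK could
in principle describe about 250K, while an initial window held under 64k
corresponds to a field of at most 65535 >> 7 in that same ACK.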

> You describe a possible issue of passive connections.
> Most of the time, servers want some kind of control before allowing a
> remote peer to send MB of payload in the first round trip.

Let's focus purely on the client side of it. The client is willing to
receive the large payload (let's say 250K), yet it cannot signal this
fact to the server.

> However, a typical connection starts with IW10 (rfc 6928), and
> standard TCP congestion
> control would implement Slow Start, doubling the payload at every round trip,
> so this is not an issue.

It's not an issue on a low-latency link, but when a latency-sensitive
client is trying to retrieve something across a 300ms RTT link, the extra
round trips needed to stretch the window add a lot of latency.
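
As a back-of-the-envelope illustration, the sketch below counts data round
trips under an idealized model: the sender's flight doubles every round
trip, and optionally the first flight is capped by the receive window. The
MSS of 1460 bytes, the 250000-byte payload, the 180-segment initial cwnd and
the assumption that the window is no longer the limit after the first round
trip are all mine, not measurements.

```python
def data_round_trips(payload, mss=1460, initcwnd=10, first_flight_cap=None):
    """Count round trips needed to deliver `payload` bytes.

    Idealized model: the flight size starts at initcwnd * mss and doubles
    every round trip. If first_flight_cap is set, the first flight is also
    limited to that many bytes (standing in for an initial receive window).
    """
    cwnd = initcwnd * mss
    sent = 0
    rounds = 0
    while sent < payload:
        flight = cwnd
        if rounds == 0 and first_flight_cap is not None:
            flight = min(flight, first_flight_cap)
        sent += flight
        cwnd *= 2
        rounds += 1
    return rounds


RTT_MS = 300

# IW10 with plain slow start
iw10_rounds = data_round_trips(250_000)

# A hypothetical server with a large initial cwnd, uncapped vs. capped at 64k
big_uncapped = data_round_trips(250_000, initcwnd=180)
big_capped = data_round_trips(250_000, initcwnd=180, first_flight_cap=65_535)

extra_latency_ms = (big_capped - big_uncapped) * RTT_MS
```

In this model the 64k cap on the first flight costs one extra round trip for
a sender that could otherwise deliver the payload in a single flight, which
on a 300ms link is 300ms of added latency.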

> If you want to enable bigger than 65535 RWIN for passive connections,
> this would violate standards and should be discussed first at IETF.

I understand this and I don't intend to do this.

> If you want to enable bigger than 65535 RWIN for passive connections
> in a controlled environment, I suggest using an eBPF program to do so.

Right, eBPF was your suggestion: https://lkml.org/lkml/2021/12/22/668

The intention of this patch is to say that you can't achieve this even
for active connections where the client is willing to advertise a
larger window in the first non-SYN ACK. Currently you cannot do this
even with eBPF, but I'm happy to add support for it.