[PATCH v2 0/2] net/rds: RDS-TCP robustness fixes

From: Sowmini Varadhan
Date: Tue May 05 2015 - 15:23:55 EST



This patch set contains bug fixes for state recovery at the RDS
layer when the underlying transport is TCP and the TCP state at one
of the endpoints is reset.

V2 changes: addressed DaveM's comments to reduce the memory footprint
and follow the NFS/RPC model where possible. Added test case #3.

Without the changes in this set, when one of the endpoints is reset,
the existing code does not correctly clean up RDS socket state for stale
connections. This results in unstable, timing-dependent behavior on
the wire, including an endless back-and-forth exchange of TCP three-way
handshakes (3WHs), so that RDS state may never converge.

Test cases used to verify the changes in this set are:

1. Start the RDS client/server applications on two participating nodes,
node1 and node2. After at least one packet has been sent (to establish
the TCP connection), restart the rds_tcp module on the client, then
resend packets. tcpdump should show the server sending a FIN for the
"old" client port, and clean connection establishment/exchange for
the new client port.

2. At the end of step 1, restart the RDS server on node2 and start the
client on node1. Use tcpdump and 'netstat -an|grep 16385' to verify
that packets flow correctly.

3. Start the RDS client/server applications on two participating nodes
and repeat steps 1 and 2, but this time simulate node failure by running
"ifconfig <intf> down", so that no FIN is sent.

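The netstat check in test case 2 can be scripted. The following is only
a minimal sketch and is not part of the patches: it assumes the usual
net-tools column layout for 'netstat -an' (Proto, Recv-Q, Send-Q, Local
Address, Foreign Address, State). The function name and the addresses
in SAMPLE are hypothetical, chosen for illustration.

```python
RDS_TCP_PORT = 16385  # the port grepped for in test case 2

# Hypothetical 'netstat -an' output; the addresses are made up.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:16385           0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.2:16385          10.0.0.1:40123          ESTABLISHED
tcp        0      0 10.0.0.2:22             10.0.0.9:51514          ESTABLISHED
"""


def rds_tcp_connections(netstat_output, port=RDS_TCP_PORT):
    """Return (local, remote, state) for each TCP line that involves `port`."""
    wanted = str(port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Proto Recv-Q Send-Q Local Foreign State.
        # Header lines and non-TCP sockets are skipped.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        # The port is whatever follows the last ':' in each address.
        if wanted in (local.rsplit(":", 1)[-1], remote.rsplit(":", 1)[-1]):
            matches.append((local, remote, state))
    return matches
```

Feeding it the captured output of 'netstat -an' on either node lists
the listener and any connections on port 16385 with their TCP states,
which makes a stale connection left over from before the restart easy
to spot.
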
Sowmini Varadhan (2):
RDS-TCP: Always create a new rds_sock for an incoming connection.
RDS-TCP: only initiate reconnect attempt on outgoing TCP socket.

net/rds/connection.c | 17 +++++++++++++++--
net/rds/tcp_connect.c | 1 +
net/rds/tcp_listen.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 62 insertions(+), 2 deletions(-)
