Re: [PATCH v8 08/26] tcp: authopt: Disable via sysctl by default

From: Leonard Crestez
Date: Wed Sep 07 2022 - 13:58:40 EST


On 9/7/22 20:04, Eric Dumazet wrote:
On Wed, Sep 7, 2022 at 9:53 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote:
On 9/7/22 02:11, Eric Dumazet wrote:
On Mon, Sep 5, 2022 at 12:06 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote:

This is mainly intended to protect against local privilege escalations
through a rarely used feature so it is deliberately not namespaced.

Enforcement is only at the setsockopt level, this should be enough to
ensure that the tcp_authopt_needed static key never turns on.

No effort is made to handle disabling when the feature is already in
use.

Signed-off-by: Leonard Crestez <cdleonard@xxxxxxxxx>
---
Documentation/networking/ip-sysctl.rst | 6 ++++
include/net/tcp_authopt.h | 1 +
net/ipv4/sysctl_net_ipv4.c | 39 ++++++++++++++++++++++++++
net/ipv4/tcp_authopt.c | 25 +++++++++++++++++
4 files changed, 71 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index a759872a2883..41be0e69d767 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1038,10 +1038,16 @@ tcp_challenge_ack_limit - INTEGER
Note that this per netns rate limit can allow some side channel
attacks and probably should not be enabled.
TCP stack implements per TCP socket limits anyway.
Default: INT_MAX (unlimited)

+tcp_authopt - BOOLEAN
+ Enable the TCP Authentication Option (RFC5925), a replacement for TCP
+ MD5 Signatures (RFC2835).
+
+ Default: 0
+

...

+#ifdef CONFIG_TCP_AUTHOPT
+static int proc_tcp_authopt(struct ctl_table *ctl,
+ int write, void *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ int val = sysctl_tcp_authopt;

val = READ_ONCE(sysctl_tcp_authopt);

+ struct ctl_table tmp = {
+ .data = &val,
+ .mode = ctl->mode,
+ .maxlen = sizeof(val),
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ };
+ int err;
+
+ err = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+ if (err)
+ return err;
+ if (sysctl_tcp_authopt && !val) {

READ_ONCE(sysctl_tcp_authopt)

Note that this test would still be racy, because another cpu might
change sysctl_tcp_authopt right after the read.

What meaningful races are possible here? This is a variable that changes
from 0 to 1 at most once.

Two cpus can issue writes of 0 and 1 values at the same time.

Depending on scheduling writing the 0 can 'win' the race and overwrite
the value back to 0.

This is in clear violation of the claim you are making (that the
sysctl can only go once from 0 to 1)

Not clear why anyone would attempt to write 0, maybe to ensure that it's still disabled?

But you're right that userspace CAN do that and the kernel CAN misbehave in this scenario so it would be better to just make the changes you suggested.

In theory if two processes attempt to assign "non-zero" at the same time
then one will "win" and the other will get an error but races between
userspace writing different values are possible for any sysctl. The
solution seems to be "write sysctls from a single place".

All the checks are in sockopts - in theory if the sysctl is written on
one CPU then a sockopt can still fail on another CPU until caches are
flushed. Is this what you're worried about?

In theory doing READ_ONCE might incur a slight penalty on sockopt but
not noticeable.

Not at all. There is _no_ penalty using READ_ONCE(). Unless it is done
in a loop and this prevents some compiler optimization.

Please use WRITE_ONCE() and READ_ONCE() for all sysctl values used in
TCP stack (and elsewhere)

See all the silly patches we had recently.

OK