Re: [RFCv2 1/9] tcp: authopt: Initial support and key management

From: Dmitry Safonov
Date: Wed Aug 11 2021 - 10:31:35 EST


On 8/11/21 9:29 AM, Leonard Crestez wrote:
> On 8/10/21 11:41 PM, Dmitry Safonov wrote:
[..]
>>> +       u32 flags;
>>> +       /* Wire identifiers */
>>> +       u8 send_id, recv_id;
>>> +       u8 alg_id;
>>> +       u8 keylen;
>>> +       u8 key[TCP_AUTHOPT_MAXKEYLEN];
>>> +       struct rcu_head rcu;
>>
>> This is unaligned and will add padding.
>
> Not clear padding matters. Moving rcu_head higher would avoid it, is
> that what you're suggesting.

Yes.

>> I wonder if it's also worth saving some bytes by something like
>> struct tcp_ao_key *key;
>>
>> With
>> struct tcp_ao_key {
>>        u8 keylen;
>>        u8 key[0];
>> };
>>
>> Hmm?
>
> This increases memory management complexity for very minor gains. Very
> few tcp_authopt_key will ever be created.

The change doesn't seem to be big, like:
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -422,8 +422,16 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t
optval, unsig>
key_info = __tcp_authopt_key_info_lookup(sk, info, opt.local_id);
if (key_info)
tcp_authopt_key_del(sk, info, key_info);
+
+ key = sock_kmalloc(sk, sizeof(*key) + opt.keylen, GFP_KERNEL |
__GFP_ZERO);
+ if (!key) {
+ tcp_authopt_alg_release(alg);
+ return -ENOMEM;
+ }
+
key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL |
__GFP_ZERO);
if (!key_info) {
+ sock_kfree_s(sk, key, sizeof(*key) + opt.keylen);
tcp_authopt_alg_release(alg);
return -ENOMEM;
}

I don't know, probably it'll be enough for every user to limit their
keys by length of 80, but if one day it won't be enough - this ABI will
be painful to extend.

[..]
>>> +#define TCP_AUTHOPT                    38      /* TCP Authentication
>>> Option (RFC2385) */
>>> +#define TCP_AUTHOPT_KEY                39      /* TCP Authentication
>>> Option update key (RFC2385) */
>>
>> RFC2385 is md5 one.
>> Also, should there be TCP_AUTHOPT_ADD_KEY, TCP_AUTHOPT_DELTE_KEY
>> instead of using flags inside setsocketopt()? (no hard feelings)
>
> Fixed RFC reference.
>
> TCP_AUTHOPT_DELETE_KEY could be clearer, I just wanted to avoid bloating
> the sockopt space. But there's not any scarcity.
>
> For reference tcp_md5 handles key deletion based on keylen == 0. This
> seems wrong to me, an empty key is in fact valid though not realistic.
>
> If local_id is removed in favor of "full match on id and binding" then
> the delete sockopt would still need most of a full struct
> tcp_authopt_key anyway.

Sounds like a plan.

[..]>> I'm not sure what's the use of enum here, probably: > #define
>> TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED BIT(2)
>
> This is an enum because it looks nice in kernel-doc. I couldn't find a
> way to attach docs to a macro and include it somewhere else.

Yeah, ok, seems like good justification.

> BTW, the enum gains more members later.
>
> As for BIT() it doesn't see to be allowed in uapi and there were recent
> changes removing such usage.

Ok, I just saw it's still used in include/uapi, but not aware of the
removal.

>
>> [..]
>>> +struct tcp_authopt_key {
>>> +       /** @flags: Combination of &enum tcp_authopt_key_flag */
>>> +       __u32   flags;
>>> +       /** @local_id: Local identifier */
>>> +       __u32   local_id;
>>> +       /** @send_id: keyid value for send */
>>> +       __u8    send_id;
>>> +       /** @recv_id: keyid value for receive */
>>> +       __u8    recv_id;
>>> +       /** @alg: One of &enum tcp_authopt_alg */
>>> +       __u8    alg;
>>> +       /** @keylen: Length of the key buffer */
>>> +       __u8    keylen;
>>> +       /** @key: Secret key */
>>> +       __u8    key[TCP_AUTHOPT_MAXKEYLEN];
>>> +       /**
>>> +        * @addr: Key is only valid for this address
>>> +        *
>>> +        * Ignored unless TCP_AUTHOPT_KEY_ADDR_BIND flag is set
>>> +        */
>>> +       struct __kernel_sockaddr_storage addr;
>>> +};
>>
>> It'll be an ABI if this is accepted. As it is - it can't support
>> RFC5925 fully.
>> Extending syscall ABI is painful. I think, even the initial ABI *must*
>> support
>> all possible features of the RFC.
>> In other words, there must be src_addr, src_port, dst_addr and dst_port.
>> I can see from docs you've written you don't want to support a mix of
>> different
>> addr/port MKTs, so you can return -EINVAL or -ENOTSUPP for any value
>> but zero.
>> This will create an ABI that can be later extended to fully support RFC.
>
> RFC states that MKT connection identifiers can be specified using ranges
> and wildcards and the details are up to the implementation. Keys are
> *NOT* just bound to a classical TCP 4-tuple.
>
> * src_addr and src_port is implicit in socket binding. Maybe in theory
> src_addr they could apply for a server socket bound to 0.0.0.0:PORT but
> userspace can just open more sockets.
> * dst_port is not supported by MD5 and I can't think of any useful
> usecase. This is either well known (179 for BGP) or auto-allocated.
> * tcp_md5 was recently enhanced allow a "prefixlen" for addr and
> "l3mdev" ifindex binding.
>
> This last point shows that the binding features people require can't be
> easily predicted in advance so it's better to allow the rules to be
> extended.

Yeah, I see both changes you mention went on easy way as they reused
existing paddings in the ABI structures.
Ok, if you don't want to reserve src_addr/src_port/dst_addr/dst_port,
than how about reserving some space for those instead?

>> The same is about key: I don't think you need to define/use
>> tcp_authopt_alg.
>> Just use algo name - that way TCP-AO will automatically be able to use
>> any algo supported by crypto engine.
>> See how xfrm does it, e.g.:
>> struct xfrm_algo_auth {
>>      char        alg_name[64];
>>      unsigned int    alg_key_len;    /* in bits */
>>      unsigned int    alg_trunc_len;  /* in bits */
>>      char        alg_key[0];
>> };
>>
>> So you can let a user chose maclen instead of hard-coding it.
>> Much more extendable than what you propose.
>
> This complicates ABI and implementation with features that are not
> needed. I'd much rather only expose an enum of real-world tcp-ao
> algorithms.

I see it exactly the opposite way: a new enum unnecessary complicates
ABI, instead of passing alg_name[] to crypto engine. No need to add any
support in tcp-ao as the algorithms are already provided by kernel.
That is how it transparently works for ipsec, why not for tcp-ao?

>
>> [..]
>>> +#ifdef CONFIG_TCP_AUTHOPT
>>> +       case TCP_AUTHOPT: {
>>> +               struct tcp_authopt info;
>>> +
>>> +               if (get_user(len, optlen))
>>> +                       return -EFAULT;
>>> +
>>> +               lock_sock(sk);
>>> +               tcp_get_authopt_val(sk, &info);
>>> +               release_sock(sk);
>>> +
>>> +               len = min_t(unsigned int, len, sizeof(info));
>>> +               if (put_user(len, optlen))
>>> +                       return -EFAULT;
>>> +               if (copy_to_user(optval, &info, len))
>>> +                       return -EFAULT;
>>> +               return 0;
>>> +       }
>>
>> I'm pretty sure it's not a good choice to write partly tcp_authopt.
>> And a user has no way to check what's the correct len on this kernel.
>> Instead of len = min_t(unsigned int, len, sizeof(info)), it should be
>> if (len != sizeof(info))
>>      return -EINVAL;
>
> Purpose is to allow sockopts to grow as md5 has grown.

md5 has not grown. See above.

Another issue with your approach

+ /* If userspace optlen is too short fill the rest with zeros */
+ if (optlen > sizeof(opt))
+ return -EINVAL;
+ memset(&opt, 0, sizeof(opt));
+ if (copy_from_sockptr(&opt, optval, optlen))
+ return -EFAULT;

is that userspace compiled with updated/grew structure will fail on
older kernel. So, no extension without breaking something is possible.
Which also reminds me that currently you don't validate (opt.flags) for
unknown by kernel flags.

Extending syscalls is impossible without breaking userspace if ABI is
not designed with extensibility in mind. That was quite a big problem,
and still is. Please, read this carefully:
https://lwn.net/Articles/830666/

That is why I'm suggesting you all those changes that will be harder to
fix when/if your patches get accepted.
As an example how it should work see in copy_clone_args_from_user().

[..]

Thanks,
Dmitry