Re: [PATCH v8.1 net-next 00/23] net/tcp: Add TCP-AO support

From: David Ahern
Date: Fri Jul 21 2023 - 14:12:33 EST


On 7/21/23 10:18 AM, Dmitry Safonov wrote:
> Hi,
>
> This is version 8.1 of TCP-AO support. I base it on net-next as
> there's commit 5e5265522a9a ("tcp: annotate data-races around
> tcp_rsk(req)->txhash") which makes a minor conflict.
>
> The good news is that all pre-required patches have merged to
> Torvalds'/master. Thanks to Herbert, crypto clone-tfm just works on
> master for all TCP-AO supported algorithms.
> So, this is the first version of the patch set that has only net-related
> changes (well, selftests as well, but they'll be upstreamed separately).
>
> In this version, I've finally spent time and written a Documentation/ page
> on TCP-AO. It has Frequently Asked Questions (FAQ) on RFC 5925 - I found
> it very useful to answer those before writing the actual code.
> It provides answers to common questions that arise on a quick read of
> the RFC as well as how they were answered. There's also a comparison
> to the TCP-MD5 option, an evaluation of per-socket vs in-kernel-DB
> approaches and a description of uAPI provided.
> I hope it will be as useful for reviewing the code as it was for writing.
>
> The most important changes in this version are:
> - CONFIG_TCP_AO implies CONFIG_IPV6 != m. I don't feel like that
> combination would be useful to anyone and it'd be painful to fix.
> - uAPI change in TCP_AO_REPAIR (introduced in version 7): I removed
> {snd,rcv}_sne_seq counters. They were just copies of snd_nxt/snd_una.
> No reason for polluting uAPI as well as needlessly copying them.
> - TCP_AO_MAX_HASH_SIZE is removed and all temporary buffers are
> kmalloc()'d. That also saves a couple of bytes for hmac(sha1) and
> cmac(aes128) traffic keys as they now are allocated with
> exact hash algo's digest_size.
>
> There's an independent patch set for TCP-MD5 to verify segments on twsk:
> https://lore.kernel.org/all/20230509221608.2569333-1-dima@xxxxxxxxxx/T/#u
> That may be used to verify TCP-AO segments on twsk as well.
>
> There seem to be more people who contacted me off-list asking me about
> the status of patches and when I expect them to merge. Cc'ing more
> interested parties here (ping me directly if you don't want to be in
> copy). It would be helpful if you provide your reviews and tested-by's.
> As far as I'm aware, version 7 was ported to RHEL, so now there are
> probably more downstream kernels with TCP-AO support.
>
> Also available as a git branch for pulling:
> https://github.com/0x7f454c46/linux/tree/tcp-ao-v8.1
> And another branch with selftests, that will be sent later separately:
> https://github.com/0x7f454c46/linux/tree/tcp-ao-v8-with-selftests
>
> Thanks for your time and reviews,
> Dmitry
>
> --- Changelog ---
>
> Changes from v8:
> - Rebased/retested on linux-net-next
>
> Version 8: https://lore.kernel.org/all/20230719202631.472019-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v7:
> - Fixed copy'n'paste typo in unsigned-md5.c selftest output
> - Fix build error in tcp_v6_send_reset() (kernel test robot <lkp@xxxxxxxxx>)
> - Make CONFIG_TCP_AO imply IPV6 != m
> - Cleanup EXPORT_SYMBOL*() as they aren't needed with IPV6 != m
> - Used scratch area instead of on-stack buffer for scatter-gather list
> in tcp_v{4,6}_ao_calc_key(). Fixes CONFIG_VMAP_STACK=y + CONFIG_DEBUG_SG=y
> - Allocated digest_size'd buffers for traffic keys in tcp_ao_key instead
> of maximum-sized buffers of TCP_AO_MAX_HASH_SIZE. That will save
> little space per key and also potentially allow algorithms with
> digest size > TCP_AO_MAX_HASH_SIZE.
> - Removed TCP_AO_MAX_HASH_SIZE and used kmalloc(GFP_ATOMIC) instead of
> on-stack hash buffer.
> - Don't treat fd=0 as invalid in selftests
> - Make TCP-AO selftests work with CONFIG_CRYPTO_FIPS=y
> - Don't tcp_ao_compute_sne() for snd_sne on twsk: it's redundant as
> no data can be sent on twsk
> - Get rid of {snd,rcv}_sne_seq: use snd_nxt/snd_una or rcv_nxt instead
> - {rcv,snd}_sne and tcp_ao_compute_sne() now are introduced in
> "net/tcp: Add TCP-AO SNE support" patch
> - trivial copy_to_sockptr() fixup for tcp_ao_get_repair() - it could
> try copying bigger struct than the kernel one (embarrassing!)
> - Added Documentation/networking/tcp_ao.rst that describes:
> uAPI, has FAQ on RFC 5925 and has implementation details of Linux TCP-AO
>
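For readers following the SNE items in the changelog above: RFC 5925
extends the 32-bit TCP sequence number with a 32-bit Sequence Number
Extension (SNE) that counts wraparounds, so the MAC input stays unique
on long-lived connections. Below is a minimal userspace sketch of how
the SNE for a given segment could be derived from the SNE tracked at a
reference sequence number (e.g. snd_nxt or rcv_nxt). The names
seq_before() and sne_for_seq() are illustrative assumptions, not
functions from the patch set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper: given the SNE that belongs to the reference
 * sequence number ref_seq, return the SNE that applies to seq.
 * Assumes seq is within 2^31 of ref_seq.
 */
uint32_t sne_for_seq(uint32_t ref_sne, uint32_t ref_seq, uint32_t seq)
{
	uint32_t sne = ref_sne;

	if (seq_before(seq, ref_seq)) {
		/* seq is older, yet numerically larger: it predates the wrap */
		if (seq > ref_seq)
			sne--;
	} else {
		/* seq is newer, yet numerically smaller: it crossed the wrap */
		if (seq < ref_seq)
			sne++;
	}
	return sne;
}
```

With ref_seq just below the wrap (0xFFFFFFF0) and a segment at 0x10, the
sketch returns ref_sne + 1; the reverse case returns ref_sne - 1.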
> Version 7: https://lore.kernel.org/all/20230614230947.3954084-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v6:
> - Some more trivial build warnings fixups (kernel test robot <lkp@xxxxxxxxx>)
> - Added TCP_AO_REPAIR setsockopt(), getsockopt()
> - Allowed TCP_AO_* setsockopts if (tp->repair) is on
> - Added selftests for TCP_AO_REPAIR, that also check incorrect
> ISNs/SNEs, which result in a broken TCP-AO connection - that verifies
> that both Initial Sequence Numbers and Sequence Number Extension are
> part of MAC generation
> - Using TCP_AO_REPAIR added a selftest for SEQ numbers rollover,
> checking that SNE was incremented, connection is alive post-rollover
> and no TCP segments with a wrong signature arrived
> - Wrote a selftest for RST segments: both active reset (goes through
> transmit_skb()) and passive reset (goes through tcp_v{4,6}_send_reset()).
> - Refactored and made readable tcp_v{4,6}_send_reset(), also adding
> support for TCP_LISTEN/TCP_NEW_SYN_RECV
> - Dropped per-CPU ahash requests allocations in favor of Herbert's
> clone-tfm crypto API
> - Added Donald Cassidy to Cc as he's interested in getting it into RHEL.
>
> Version 6: https://lore.kernel.org/all/20230512202311.2845526-1-dima@xxxxxxxxxx/T/#u
>
> iperf[3] benchmarks for version 6:
>                            v6.4-rc1          TCP-AO-v6
> TCP                        43.9 Gbits/sec    43.5 Gbits/sec
> TCP-MD5                    2.20 Gbits/sec    2.25 Gbits/sec
> TCP-AO(hmac(sha1))                           2.53 Gbits/sec
> TCP-AO(hmac(sha512))                         1.67 Gbits/sec
> TCP-AO(hmac(sha384))                         1.77 Gbits/sec
> TCP-AO(hmac(sha224))                         1.29 Gbits/sec
> TCP-AO(hmac(sha3-512))                       481 Mbits/sec
> TCP-AO(hmac(md5))                            2.07 Gbits/sec
> TCP-AO(hmac(rmd160))                         1.01 Gbits/sec
> TCP-AO(cmac(aes128))                         2.11 Gbits/sec
>
> Changes from v5:
> - removed check for TCP_AO_KEYF_IFINDEX in delete command:
> VRF might have been destroyed, there still needs to be a way to delete
> keys that were bound to that l3intf (should tcp_v{4,6}_parse_md5_keys()
> avoid the same check as well?)
> - corrected copy'n'paste typo in tcp_ao_info_cmd() (assign ao_info->rnext_key)
> - simplified a bit tcp_ao_copy_mkts_to_user(); added more UAPI checks
> for getsockopt(TCP_AO_GET_KEYS)
> - More UAPI selftests in setsockopt-closed: 29 => 120
> - ported TCP-AO patches on Herbert's clone-tfm changes
> - adjusted iperf patch for TCP-AO UAPI changes from version 5
> - added measures for TCP-AO with tcp_sigpool & clone_tfm backends
>
> Version 5: https://lore.kernel.org/all/20230403213420.1576559-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v4:
> - Renamed tcp_ao_matched_key() => tcp_ao_established_key()
> - Missed `static` in function definitions
> (kernel test robot <lkp@xxxxxxxxx>)
> - Fixed CONFIG_IPV6=m build
> - Unexported tcp_md5_*_sigpool() functions
> - Cleaned up tcp_ao.h: undeclared tcp_ao_cache_traffic_keys(),
> tcp_v4_ao_calc_key_skb(); removed tcp_v4_inbound_ao_hash()
> - Marked "net/tcp: Prepare tcp_md5sig_pool for TCP-AO" as a [draft] patch
> - getsockopt() now returns TCP-AO per-key counters
> - Another getsockopt() now returns per-ao_info stats: counters
> and accept_icmps flag state
> - Wired up getsockopt() returning counters to selftests
> - Fixed a porting mistake: TCP-AO hash in some cases was written in TCP
> header without accounting for MAC length of the key, rewriting skb
> shared info
> - Fail adding a key with L3 ifindex when !TCP_AO_KEYF_IFINDEX, instead
> of ignoring tcpa_ifindex (stricter UAPI check)
> - Added more test-cases to setsockopt-closed.c selftest
> - tcp_ao_hash_skb_data() was a copy'n'paste of tcp_md5_hash_skb_data();
> both now share tcp_sigpool_hash_skb_data()
> - tcp_ao_mkt_overlap_v{4,6}() deleted as they just re-invented
> tcp_ao_do_lookup(). That fixes an issue with multiple IPv4-mapped-IPv6
> keys for different peers on a listening socket.
> - getsockopt() now is tested to return correct VRF number for a key
> - TCP-AO and TCP-MD5 interaction in non/default VRFs: added +19 selftests,
> made them SKIP when CONFIG_VRF=n
> - unsigned-md5 selftests now checks both scenarios:
> (1) adding TCP-AO key _after_ TCP-MD5 key
> (2) adding TCP-MD5 key _after_ TCP-AO key
> - Added a ratelimited warning if TCP-AO key.ifindex doesn't match
> sk->sk_bound_dev_if - that will warn a user for potential VRF issues
> - tcp_v{4,6}_parse_md5_keys() now allows adding TCP-MD5 key with
> ifindex=0 and TCP_MD5SIG_FLAG_IFINDEX together with TCP-AO key from
> another VRF
> - Add TCP_AO_CMDF_AO_REQUIRED, which makes a socket TCP-AO only,
> rejecting TCP-MD5 keys or any unsigned TCP segments
> - Remove `tcpa_' prefix for UAPI structure members
> - UAPI cleanup: I've separated & renamed per-socket settings
> (such as ao_info flags + current/rnext set) from per-key changes:
> TCP_AO => TCP_AO_ADD_KEY
> TCP_AO_DEL => TCP_AO_DEL_KEY
> TCP_AO_GET => TCP_AO_GET_KEYS
> TCP_AO_MOD => TCP_AO_INFO, the structure is now valid for both
> getsockopt() and setsockopt().
> - tcp_ao_current_rnext() was split up in order to fail earlier when
> sndid/rcvid specified can't be set, before anything was changed in ao_info
> - fetch current_key before dumping TCP-AO keys in getsockopt(TCP_AO_GET_KEYS):
> it may race with changing current_key by RX, which in result might
> produce a dump with no current_key for userspace.
> - instead of TCP_AO_CMDF_* flags, used bitfields: the flags weren't
> shared between all TCP_AO_{ADD,GET,DEL}_KEY{,S}, so bitfields are more
> descriptive here
> - use READ_ONCE()/WRITE_ONCE() for current_key and rnext_key more
> consistently; document in comment the rules for accessing them
> - selftests: check all setsockopts()/getsockopts() support extending
> option structs
>
> Version 4: https://lore.kernel.org/all/20230215183335.800122-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v3:
> - TCP_MD5 dynamic static key enable/disable patches merged separately [4]
> - crypto_pool patches were nacked [5], so instead this patch set extends
> TCP-MD5-sigpool to be used for TCP-AO as well as for TCP-MD5
> - Added missing `static' for tcp_v6_ao_calc_key()
> (kernel test robot <lkp@xxxxxxxxx>)
> - Removed CONFIG_TCP_AO default=y and added "If unsure, say N."
> - Don't leak ao_info and don't create an unsigned TCP socket if there was
> a TCP-AO key during handshake, but it was removed from listening socket
> while the connection was being established
> - Migrate to use static_key_fast_inc_not_disabled() and check return
> code of static_branch_inc()
> - Change some return codes to EAFNOSUPPORT for error paths where
> family is neither AF_INET nor AF_INET6
> - setsockopt()s on a closed/listen socket might have created stray ao_info,
> remove it if connect() is called with a correct TCP-MD5 key, the same
> for the reverse situation: remove md5sig_info straight away from the
> socket if it's going to be TCP-AO connection
> - IPv4-mapped-IPv6 addresses + selftest in fcnal-test.sh (by Salam)
> - fix using uninitialized sisn/disn from stack - it would only make
> non-SYN packets fail verification on a listen socket, which are not
> expected anyway (kernel test robot <lkp@xxxxxxxxx>)
> - implicit padding in UAPI TCP-AO structures converted to explicit
> (spotted-by David Laight)
> - Some selftests missed zero-initializers for uapi structs on stack
> - Removed tcp_ao_do_lookup_rcvid() and tcp_ao_do_lookup_sndid() in
> favor of unified tcp_ao_matched_key()
> - Disallowed setting current/rnext keys on listen sockets - that wasn't
> supported and didn't affect anything, cleanup for the UAPI
> - VRFs support for TCP-AO
>
> Version 3: https://lore.kernel.org/all/20221027204347.529913-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v2:
> - Added more missing `static' declarations for local functions
> (kernel test robot <lkp@xxxxxxxxx>)
> - Building now with CONFIG_TCP_AO=n and CONFIG_TCP_MD5SIG=n
> (kernel test robot <lkp@xxxxxxxxx>)
> - Now setsockopt(TCP_AO) is allowed when it's TCP_LISTEN or TCP_CLOSE
> state OR the key added is not the first key on a socket (by Salam)
> - CONFIG_TCP_AO does not depend on CONFIG_TCP_MD5SIG anymore
> - Don't leak tcp_md5_needed static branch counter when TCP-MD5 key
> is modified/changed
> - TCP-AO lookups are dynamically enabled/disabled with static key when
> there is ao_info in the system (and when it is destroyed)
> - Wired SYN cookies up to TCP-AO (by Salam)
> - Fix verification for possible re-transmitted SYN packets (by Salam)
> - use sockopt_lock_sock() instead of lock_sock()
> (from v6.1 rebase, commit d51bbff2aba7)
> - use sockptr_t in getsockopt(TCP_AO_GET)
> (from v6.1 rebase, commit 34704ef024ae)
> - Fixed reallocating crypto_pool's scratch area by IPI while
> crypto_pool_get() was held by another CPU
> - selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
> SKIP, not FAIL (Shuah Khan <shuah@xxxxxxxxxx>)
> - selftests that check interaction between TCP-AO and TCP-MD5 now
> SKIP when CONFIG_TCP_MD5SIG=n
> - Measured the performance of different hashing algorithms for TCP-AO
> and compared it with TCP-MD5 performance. This is done with hacky patches
> to iperf (see [3]). At this moment I've done it in qemu/KVM with CPU
> affinities set on Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz.
> No performance degradation was noticed before/after patches, but given
> the measures were done in a VM, without measuring it on a physical DUT
> it only gives a hint of relative speed for different hash algorithms
> with TCP-AO. Here are results, averaging on 30 measures each:
> TCP: 3.51Gbits/sec
> TCP-MD5: 1.12Gbits/sec
> TCP-AO(HMAC(SHA1)): 1.53Gbits/sec
> TCP-AO(CMAC(AES128)): 621Mbits/sec
> TCP-AO(HMAC(SHA512)): 1.21Gbits/sec
> TCP-AO(HMAC(SHA384)): 1.20Gbits/sec
> TCP-AO(HMAC(SHA224)): 961Mbits/sec
> TCP-AO(HMAC(SHA3-512)): 157Mbits/sec
> TCP-AO(HMAC(RMD160)): 659Mbits/sec
> TCP-AO(HMAC(MD5)): 1.12Gbits/sec
> (the last one is just for fun, but may make sense as it provides
> the same security as TCP-MD5, but allows multiple keys and a mechanism
> to change them from RFC5925)
>
> Version 2: https://lore.kernel.org/all/20220923201319.493208-1-dima@xxxxxxxxxx/T/#u
>
> Changes from v1:
> - Building now with CONFIG_IPV6=n (kernel test robot <lkp@xxxxxxxxx>)
> - Added missing static declarations for local functions
> (kernel test robot <lkp@xxxxxxxxx>)
> - Addressed static analyzer and review comments by Dan Carpenter
> (thanks, they were very useful!)
> - Fix elif without defined() for !CONFIG_TCP_AO
> - Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
> https://lore.kernel.org/all/20220919201958.279545-1-dima@xxxxxxxxxx/T/#u
> - Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
> - Add TCP-AO support for nettest.c and fcnal-test.sh
> (will be used for VRF testing in later versions)
>
> Comparison between Leonard proposal and this (overview):
> https://lore.kernel.org/all/3cf03d51-74db-675c-b392-e4647fa5b5a6@xxxxxxxxxx/T/#u
>
> Version 1: https://lore.kernel.org/all/20220818170005.747015-1-dima@xxxxxxxxxx/T/#u
>
> This patchset implements the TCP-AO option as described in RFC5925. There
> is a request from industry to move away from TCP-MD5SIG and it seems the time
> is right to have TCP-AO upstreamed. This TCP option is meant to replace
> the TCP MD5 option and address its shortcomings. Specifically, it provides
> more secure hashing, key rotation and support for long-lived connections
> (see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
> The patch series starts with six patches that are not specific to TCP-AO
> but implement a general crypto facility that we thought is useful
> to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other
> crypto users. These six patches are being submitted separately in
> a different patchset [1]. Including them here will show better the gain
> in code sharing. Next are 18 patches that implement the actual TCP-AO option,
> followed by patches implementing selftests.
>
> The patch set was written as a collaboration of three authors (in alphabetical
> order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional
> credits should be given to Prasad Koya, who was involved in early prototyping
> a few years back. There is also a separate submission done by Leonard Crestez
> whom we thank for his efforts getting an implementation of RFC5925 submitted
> for review upstream [2]. This is an independent implementation that makes
> different design decisions.
>
> For example, we chose a similar design to the TCP-MD5SIG implementation and
> used setsockopts to program per-socket keys, avoiding the extra complexity
> of managing a centralized key database in the kernel. A centralized database
> in the kernel has dubious benefits since it doesn’t eliminate per-socket
> setsockopts needed to specify which sockets need TCP-AO and what are the
> currently preferred keys. It also complicates traffic key caching and
> preventing deletion of in-use keys.
>
> In this implementation, a centralized database of keys can be thought of
> as living in user space and user applications would have to program those
> keys on matching sockets. On the server side, the user application programs
> keys (MKTs in TCP-AO nomenclature) on the listening socket for all peers that
> are expected to connect. Prefix matching on the peer address is supported.
> When a peer issues a successful connect, all the MKTs matching the IP address
> of the peer are copied to the newly created socket. On the active side,
> when a connect() is issued all MKTs that do not match the peer are deleted
> from the socket since they will never match the peer. This implementation
> uses three setsockopt()s for adding, deleting and modifying keys on a socket.
> All three setsockopt()s have extensive sanity checks that prevent
> inconsistencies in the keys on a given socket. A getsockopt() is provided
> to get key information from any given socket.
>
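The peer-matching step described above (prefix match on the peer
address, then copying only the matching MKTs to the new socket) can be
sketched in plain userspace C. This is only an illustration under
stated assumptions: struct mkt, mkt_matches_peer() and
mkt_copy_matching() are hypothetical names, the sketch is IPv4-only,
and it ignores VRFs, key material and locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from four octets. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, simplified MKT: only what peer matching needs. */
struct mkt {
	uint32_t addr;   /* peer address, host byte order */
	uint8_t prefix;  /* prefix length in bits, 0..32 */
	uint8_t sndid;
	uint8_t rcvid;
};

/* True if the first k->prefix bits of k->addr and peer are equal. */
bool mkt_matches_peer(const struct mkt *k, uint32_t peer)
{
	uint32_t mask;

	if (k->prefix > 32)
		return false;
	/* prefix == 0 matches any peer; avoid an undefined 32-bit shift */
	mask = k->prefix ? (uint32_t)(~(uint32_t)0 << (32 - k->prefix)) : 0;
	return ((k->addr ^ peer) & mask) == 0;
}

/*
 * Copy the keys that match peer from src[] to dst[], preserving order.
 * Returns the number of keys copied (at most dst_cap).
 */
size_t mkt_copy_matching(const struct mkt *src, size_t n, uint32_t peer,
			 struct mkt *dst, size_t dst_cap)
{
	size_t copied = 0;

	for (size_t i = 0; i < n && copied < dst_cap; i++) {
		if (mkt_matches_peer(&src[i], peer))
			dst[copied++] = src[i];
	}
	return copied;
}
```

The same predicate, negated, would select the keys to drop on the
active side when connect() is issued.
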
> A few things to note about this implementation:
> - Traffic keys are cached for established connections avoiding the cost of
> such calculation for each packet received or sent.
> - Great care has been taken to avoid deleting in-use MKTs
> as required by the RFC.
> - Any crypto algorithm supported by the Linux kernel can be used
> to calculate packet hashes.
> - Fastopen works with TCP-AO but hasn’t been tested extensively.
> - Tested for interop with other major networking vendors (on linux-4.19),
> including testing for key rotation and long lived connections.
>
> [1]: https://lore.kernel.org/all/20220726201600.1715505-1-dima@xxxxxxxxxx/
> [2]: https://lore.kernel.org/all/cover.1658815925.git.cdleonard@xxxxxxxxx/
> [3]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao
> [4]: https://lore.kernel.org/all/166995421700.16716.17446147162780881407.git-patchwork-notify@xxxxxxxxxx/T/#u
> [5]: https://lore.kernel.org/all/Y8kSkW4X4vQdFyOl@xxxxxxxxxxxxxxxxxxx/T/#u
> [6]: https://lore.kernel.org/all/ZDefxOq6Ax0JeTRH@xxxxxxxxxxxxxxxxxxx/T/#u
>

For the set:
Acked-by: David Ahern <dsahern@xxxxxxxxxx>