Re: linux-next: manual merge of the net-next tree with the net tree

From: Daniel Borkmann
Date: Tue Sep 18 2018 - 06:15:45 EST


On 09/18/2018 11:53 AM, Daniel Borkmann wrote:
> On 09/18/2018 11:32 AM, Vakul Garg wrote:
>>> -----Original Message-----
>>> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
>>> Sent: Tuesday, September 18, 2018 2:57 PM
>>> To: Vakul Garg <vakul.garg@xxxxxxx>; Stephen Rothwell
>>> <sfr@xxxxxxxxxxxxxxxx>; David Miller <davem@xxxxxxxxxxxxx>;
>>> Networking <netdev@xxxxxxxxxxxxxxx>
>>> Cc: Linux-Next Mailing List <linux-next@xxxxxxxxxxxxxxx>; Linux Kernel
>>> Mailing List <linux-kernel@xxxxxxxxxxxxxxx>
>>> Subject: Re: linux-next: manual merge of the net-next tree with the net tree
>>>
>>> On 09/18/2018 11:10 AM, Vakul Garg wrote:
>>>>> -----Original Message-----
>>>>> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
>>>>> Sent: Tuesday, September 18, 2018 2:14 PM
>>>>> To: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>; David Miller
>>>>> <davem@xxxxxxxxxxxxx>; Networking <netdev@xxxxxxxxxxxxxxx>
>>>>> Cc: Linux-Next Mailing List <linux-next@xxxxxxxxxxxxxxx>; Linux
>>>>> Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Vakul Garg
>>>>> <vakul.garg@xxxxxxx>
>>>>> Subject: Re: linux-next: manual merge of the net-next tree with the
>>>>> net tree
>>>>>
>>>>> On 09/18/2018 02:11 AM, Stephen Rothwell wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Today's linux-next merge of the net-next tree got a conflict in:
>>>>>>
>>>>>> tools/testing/selftests/net/tls.c
>>>>>>
>>>>>> between commit:
>>>>>>
>>>>>> 50c6b58a814d ("tls: fix currently broken MSG_PEEK behavior")
>>>>>>
>>>>>> from the net tree and commit:
>>>>>>
>>>>>> c2ad647c6442 ("selftests/tls: Add test for recv(PEEK) spanning
>>>>>> across multiple records")
>>>>>>
>>>>>> from the net-next tree.
>>>>>>
>>>>>> I fixed it up (see below) and can carry the fix as necessary. This
>>>>>> is now fixed as far as linux-next is concerned, but any non trivial
>>>>>> conflicts should be mentioned to your upstream maintainer when your
>>>>>> tree is submitted for merging. You may also want to consider
>>>>>> cooperating with the maintainer of the conflicting tree to minimise
>>>>>> any particularly complex conflicts.
>>>>>
>>>>> The test from 50c6b58a814d supersedes the one from c2ad647c6442 so
>>>>> the recv_peek_large_buf_mult_recs could be removed; latter was also
>>>>> not working correctly due to this bug.
>>>>
>>>> Why remove recv_peek_large_buf_mult_recs if its correct?
>>>> Why not the newly added one which achieves the same thing?
>>>
>>> Hmm, not quite, on net-next kernel, the recv_peek_large_buf_mult_recs fails
>>> every time I invoke the tls test suite:
>>>
>>> # ./tls
>>> [==========] Running 28 tests from 2 test cases.
>>> [ RUN ] tls.sendfile
>>> [ OK ] tls.sendfile
>>> [ RUN ] tls.send_then_sendfile
>>> [ OK ] tls.send_then_sendfile
>>> [ RUN ] tls.recv_max
>>> [ OK ] tls.recv_max
>>> [ RUN ] tls.recv_small
>>> [ OK ] tls.recv_small
>>> [ RUN ] tls.msg_more
>>> [ OK ] tls.msg_more
>>> [ RUN ] tls.sendmsg_single
>>> [ OK ] tls.sendmsg_single
>>> [ RUN ] tls.sendmsg_large
>>> [ OK ] tls.sendmsg_large
>>> [ RUN ] tls.sendmsg_multiple
>>> [ OK ] tls.sendmsg_multiple
>>> [ RUN ] tls.sendmsg_multiple_stress
>>> [ OK ] tls.sendmsg_multiple_stress
>>> [ RUN ] tls.splice_from_pipe
>>> [ OK ] tls.splice_from_pipe
>>> [ RUN ] tls.splice_from_pipe2
>>> [ OK ] tls.splice_from_pipe2
>>> [ RUN ] tls.send_and_splice
>>> [ OK ] tls.send_and_splice
>>> [ RUN ] tls.splice_to_pipe
>>> [ OK ] tls.splice_to_pipe
>>> [ RUN ] tls.recvmsg_single
>>> [ OK ] tls.recvmsg_single
>>> [ RUN ] tls.recvmsg_single_max
>>> [ OK ] tls.recvmsg_single_max
>>> [ RUN ] tls.recvmsg_multiple
>>> [ OK ] tls.recvmsg_multiple
>>> [ RUN ] tls.single_send_multiple_recv
>>> [ OK ] tls.single_send_multiple_recv
>>> [ RUN ] tls.multiple_send_single_recv
>>> [ OK ] tls.multiple_send_single_recv
>>> [ RUN ] tls.recv_partial
>>> [ OK ] tls.recv_partial
>>> [ RUN ] tls.recv_nonblock
>>> [ OK ] tls.recv_nonblock
>>> [ RUN ] tls.recv_peek
>>> [ OK ] tls.recv_peek
>>> [ RUN ] tls.recv_peek_multiple
>>> [ OK ] tls.recv_peek_multiple
>>> [ RUN ] tls.recv_peek_large_buf_mult_recs
>>> tls.c:524:tls.recv_peek_large_buf_mult_recs:Expected memcmp(test_str,
>>> buf, len) (18446744073709551595) == 0 (0)
>>> tls.recv_peek_large_buf_mult_recs: Test failed at step #8
>>> [ FAIL ] tls.recv_peek_large_buf_mult_recs
>>> [ RUN ] tls.pollin
>>> [ OK ] tls.pollin
>>> [ RUN ] tls.poll_wait
>>> [ OK ] tls.poll_wait
>>> [ RUN ] tls.blocking
>>> [ OK ] tls.blocking
>>> [ RUN ] tls.nonblocking
>>> [ OK ] tls.nonblocking
>>> [ RUN ] tls.control_msg
>>> [ OK ] tls.control_msg
>>> [==========] 27 / 28 tests passed.
>>> [ FAILED ]
>>>
>>> Here's what the recvfrom() with MSG_PEEK sees:
>>>
>>> [pid 2602] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3 [pid 2602]
>>> socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4 [pid 2602] bind(4,
>>> {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) =
>>> 0
>>> [pid 2602] listen(4, 10) = 0
>>> [pid 2602] getsockname(4, {sa_family=AF_INET, sin_port=htons(41483),
>>> sin_addr=inet_addr("0.0.0.0")}, [16]) = 0 [pid 2602] connect(3,
>>> {sa_family=AF_INET, sin_port=htons(41483), sin_addr=inet_addr("0.0.0.0")},
>>> 16) = 0 [pid 2602] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4)
>>> = 0 [pid 2602] setsockopt(3, 0x11a /* SOL_?? */, 1,
>>> "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
>>> 40) = 0 [pid 2602] accept(4, {sa_family=AF_INET, sin_port=htons(46290),
>>> sin_addr=inet_addr("127.0.0.1")}, [16]) = 5 [pid 2602] setsockopt(5,
>>> SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0 [pid 2602] setsockopt(5,
>>> 0x11a /* SOL_?? */, 2,
>>> "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
>>> 40) = 0
>>> [pid 2602] close(4) = 0
>>> [pid 2602] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14 [pid 2602]
>>> sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11 [pid 2602] recvfrom(5,
>>> "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64
>>> [pid 2602] write(2, "tls.c:526:tls.recv_peek_large_bu"...,
>>> 112tls.c:526:tls.recv_peek_large_buf_mult_recs:Expected memcmp(test_str,
>>> buf, len) (18446744073709551595) == 0 (0)
>>> ) = 112
>>> [pid 2602] close(3) = 0
>>> [pid 2602] close(5) = 0
>>> [pid 2602] exit_group(8) = ?
>>>
>>> Reason for the "test_read_peektest_read_peektest[...]" is because
>>> MSG_PEEK cannot call tls_sw_advance_skb(), since the skb is sitting there
>>> that needs to be consumed for non-MSG_PEEK case, and only then we can
>>> advance it.
>>
>> I general, my plan was to modify the tls_sw_recvmsg() to trigger as many
>> decryption as possible as required by requested user space PEEK size.
>> This would have required creating a pending list of decrypted records in tls_tx context.
>
> Right, had been thinking the same though for a fix in -net it would have been
> way too intrusive, hence the 50c6b58a814d ("tls: fix currently broken MSG_PEEK
> behavior") to avoid looping the same record which is clearly a bug. Wondering
> if DaveW's original rationale was to avoid accumulating too many records in the
> kernel since we would need to unpause strparser and keep processing the deeper
> we peek.
>
>>> Could you elaborate on where you ever had this test succeeding? With nxp
>>> accelerator?
>>
>> I never had this test succeeding. I pointed the problem to Dave Watson sometime
>> back (found during code reading).
>>
>> To make sure that this bug does not slip out, I simply submitted a test case to keep
>> reminding ourselves that we need to fix it sometime.
>
> Ok, I think usually tests assert current kernel behavior to make sure any changes
> coming in don't accidentally break expectations from applications as opposed to
> future tests that still need fixing, but I guess I'm fine either way how to resolve
> the conflict; leaving it up to DaveM. Thanks for clarifying!

By the way, full commit message from c2ad647c6442 said:

selftests/tls: Add test for recv(PEEK) spanning across multiple records

Added test case to receive multiple records with a single recvmsg()
operation with a MSG_PEEK set.

>From reading it, the expectation would normally be that the test case would
succeed for the author, I think in future such things definitely need to be
better clarified in the commit log to avoid confusion for others.

Thanks,
Daniel