Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done

From: Paolo Abeni
Date: Tue Oct 17 2023 - 06:32:33 EST


On Thu, 2023-10-12 at 19:02 +0800, Juntong Deng wrote:
> In the current implementation, ctx->async_wait.completion is completed
> after spin_lock_bh, which causes tls_sw_release_resources_tx to
> continue executing and return to tls_sk_proto_cleanup, then return
> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
> the entire struct tls_context (including ctx->encrypt_compl_lock).
>
> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
> will result in slab-use-after-free error. Due to SMP, even using
> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
> attempt to hold ctx->encrypt_compl_lock again, therefore everything
> described above is possible.
>
> The fix is to put complete(&ctx->async_wait.completion) after
> spin_unlock_bh, making the release after the unlock. Since complete is
> only executed if pending is 0, which means this is the last record, there
> is no need to worry about race condition causing duplicate completes.
>
> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
> Signed-off-by: Juntong Deng <juntong.deng@xxxxxxxxxxx>

Have you tested this patch vs the syzbot reproducer?

I think the following race is still present:

CPU0 CPU1
tls_sw_release_resources_tx tls_encrypt_done
spin_lock_bh
spin_unlock_bh
spin_lock_bh
spin_unlock_bh
complete

wait
// ...
tls_sk_proto_close

test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask
// UaF

regardless of 'complete()' being invoked before or after the
'spin_unlock_bh()'.

Paolo