Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

From: Krzysztof Kozlowski
Date: Fri Apr 29 2022 - 03:27:57 EST


On 29/04/2022 03:14, Duoming Zhou wrote:
> There are destructive operations such as nfcmrvl_fw_dnld_abort and
> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> gpio and so on could be destructed while the upper layer functions such as
> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> to double-free, use-after-free and null-ptr-deref bugs.
>
> There are three situations that could lead to double-free bugs.
>
> The first situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_fw_dnld_start |
> ... | nfcmrvl_nci_unregister_dev
> release_firmware() | nfcmrvl_fw_dnld_abort
> kfree(fw) //(1) | fw_dnld_over
> | release_firmware
> ... | kfree(fw) //(2)
> | ...
>
> The second situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_fw_dnld_start |
> ... |
> mod_timer |
> (wait a time) |
> fw_dnld_timeout | nfcmrvl_nci_unregister_dev
> fw_dnld_over | nfcmrvl_fw_dnld_abort
> release_firmware | fw_dnld_over
> kfree(fw) //(1) | release_firmware
> ... | kfree(fw) //(2)

How exactly the case here is being prevented?

If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?


>
> The third situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_nci_recv_frame |
> if(..->fw_download_in_progress)|
> nfcmrvl_fw_dnld_recv_frame |
> queue_work |
> |
> fw_dnld_rx_work | nfcmrvl_nci_unregister_dev
> fw_dnld_over | nfcmrvl_fw_dnld_abort
> release_firmware | fw_dnld_over
> kfree(fw) //(1) | release_firmware
> | kfree(fw) //(2)
>
> The firmware struct is deallocated in position (1) and deallocated
> in position (2) again.
>
> The crash trace triggered by POC is like below:
>
> [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0
> [ 122.640457] Call Trace:
> [ 122.640457] <TASK>
> [ 122.640457] kfree+0xb0/0x330
> [ 122.640457] fw_dnld_over+0x28/0xf0
> [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70
> [ 122.640457] nci_uart_tty_close+0x87/0xd0
> [ 122.640457] tty_ldisc_kill+0x3e/0x80
> [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0
> [ 122.640457] __tty_hangup.part.0+0x316/0x520
> [ 122.640457] tty_release+0x200/0x670
> [ 122.640457] __fput+0x110/0x410
> [ 122.640457] task_work_run+0x86/0xd0
> [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0
> [ 122.640457] syscall_exit_to_user_mode+0x19/0x50
> [ 122.640457] do_syscall_64+0x48/0x90
> [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 122.640457] RIP: 0033:0x7f68433f6beb

Please always strip unrelated parts of oops, so minimum the timing. The
addresses could be removed as well.

>
> What's more, there are also use-after-free and null-ptr-deref bugs
> in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or
> set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev,
> then, we dereference firmware, gpio or the members of priv->fw_dnld in
> nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen.
>
> This patch reorders destructive operations after nci_unregister_device
> and uses bool variable in nfc_dev to synchronize between cleanup routine
> and firmware download routine. The process is shown below.
>
> (Thread 1) | (Thread 2)
> nfcmrvl_nci_unregister_dev |
> nci_unregister_device |
> nfc_unregister_device | nfc_fw_download
> device_lock() |
> ... |
> dev->dev_register = false; | ...
> device_unlock() |
> ... | device_lock()
> (destructive operations) | if(!dev->dev_register)
> | goto error;
> | error:
> | device_unlock()

How T2 calls appeared here? They were not present in any of your
previous process-examples...

>
> If the device is detaching, the download function will goto error.
> If the download function is executing, nci_unregister_device will
> wait until download function is finished.
>
> Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support")
> Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
> ---
> Changes in v5:
> - Use bool variable added in patch 1/2 to synchronize.
>
> drivers/nfc/nfcmrvl/main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
> index 2fcf545012b..1a5284de434 100644
> --- a/drivers/nfc/nfcmrvl/main.c
> +++ b/drivers/nfc/nfcmrvl/main.c
> @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> {
> struct nci_dev *ndev = priv->ndev;
>
> + nci_unregister_device(ndev);
> if (priv->ndev->nfc_dev->fw_download_in_progress)
> nfcmrvl_fw_dnld_abort(priv);
>
> @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> if (gpio_is_valid(priv->config.reset_n_io))
> gpio_free(priv->config.reset_n_io);
>
> - nci_unregister_device(ndev);
> nci_free_device(ndev);
> kfree(priv);
> }


Best regards,
Krzysztof