Re: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

From: duoming
Date: Fri Apr 29 2022 - 05:14:20 EST


Hello,

On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote:

> > There are destructive operations such as nfcmrvl_fw_dnld_abort and
> > gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> > gpio and so on could be destructed while the upper layer functions such as
> > nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> > to double-free, use-after-free and null-ptr-deref bugs.
> >
> > There are three situations that could lead to double-free bugs.
> >
> > The first situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_fw_dnld_start |
> > ... | nfcmrvl_nci_unregister_dev
> > release_firmware() | nfcmrvl_fw_dnld_abort
> > kfree(fw) //(1) | fw_dnld_over
> > | release_firmware
> > ... | kfree(fw) //(2)
> > | ...
> >
> > The second situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_fw_dnld_start |
> > ... |
> > mod_timer |
> > (wait a time) |
> > fw_dnld_timeout | nfcmrvl_nci_unregister_dev
> > fw_dnld_over | nfcmrvl_fw_dnld_abort
> > release_firmware | fw_dnld_over
> > kfree(fw) //(1) | release_firmware
> > ... | kfree(fw) //(2)
>
> How exactly the case here is being prevented?
>
> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?

I think it could be prevented. We use nci_unregister_device() to synchronize, if the
firmware download routine is running, the cleanup routine will wait it to finish.
The flag "fw_download_in_progress" will be set to false, if the the firmware download
routine is finished.

Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev()
will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort()
in nfcmrvl_nci_unregister_dev() will not execute.

> >
> > The third situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_nci_recv_frame |
> > if(..->fw_download_in_progress)|
> > nfcmrvl_fw_dnld_recv_frame |
> > queue_work |
> > |
> > fw_dnld_rx_work | nfcmrvl_nci_unregister_dev
> > fw_dnld_over | nfcmrvl_fw_dnld_abort
> > release_firmware | fw_dnld_over
> > kfree(fw) //(1) | release_firmware
> > | kfree(fw) //(2)
> >
> > The firmware struct is deallocated in position (1) and deallocated
> > in position (2) again.
> >
> > The crash trace triggered by POC is like below:
> >
> > [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0
> > [ 122.640457] Call Trace:
> > [ 122.640457] <TASK>
> > [ 122.640457] kfree+0xb0/0x330
> > [ 122.640457] fw_dnld_over+0x28/0xf0
> > [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70
> > [ 122.640457] nci_uart_tty_close+0x87/0xd0
> > [ 122.640457] tty_ldisc_kill+0x3e/0x80
> > [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0
> > [ 122.640457] __tty_hangup.part.0+0x316/0x520
> > [ 122.640457] tty_release+0x200/0x670
> > [ 122.640457] __fput+0x110/0x410
> > [ 122.640457] task_work_run+0x86/0xd0
> > [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0
> > [ 122.640457] syscall_exit_to_user_mode+0x19/0x50
> > [ 122.640457] do_syscall_64+0x48/0x90
> > [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 122.640457] RIP: 0033:0x7f68433f6beb
>
> Please always strip unrelated parts of oops, so minimum the timing. The
> addresses could be removed as well.

Thanks, I will strip unrelated parts of oops.

> >
> > What's more, there are also use-after-free and null-ptr-deref bugs
> > in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or
> > set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev,
> > then, we dereference firmware, gpio or the members of priv->fw_dnld in
> > nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen.
> >
> > This patch reorders destructive operations after nci_unregister_device
> > and uses bool variable in nfc_dev to synchronize between cleanup routine
> > and firmware download routine. The process is shown below.
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_nci_unregister_dev |
> > nci_unregister_device |
> > nfc_unregister_device | nfc_fw_download
> > device_lock() |
> > ... |
> > dev->dev_register = false; | ...
> > device_unlock() |
> > ... | device_lock()
> > (destructive operations) | if(!dev->dev_register)
> > | goto error;
> > | error:
> > | device_unlock()
>
> How T2 calls appeared here? They were not present in any of your
> previous process-examples...

The firmware download process is shown below:

nfc_genl_fw_download()->nfc_fw_download()->nci_fw_download()->nfcmrvl_nci_fw_download()
->nfcmrvl_fw_dnld_start()

So T2 calls is a part of firmware download routine in nfc core layer.

We use bool variable and device_lock() to synchronize between cleanup routine and
firmware download routine in nfc core layer. The above picture is attempt to show
this synchronized process.

> >
> > If the device is detaching, the download function will goto error.
> > If the download function is executing, nci_unregister_device will
> > wait until download function is finished.
> >
> > Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support")
> > Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
> > ---
> > Changes in v5:
> > - Use bool variable added in patch 1/2 to synchronize.
> >
> > drivers/nfc/nfcmrvl/main.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
> > index 2fcf545012b..1a5284de434 100644
> > --- a/drivers/nfc/nfcmrvl/main.c
> > +++ b/drivers/nfc/nfcmrvl/main.c
> > @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> > {
> > struct nci_dev *ndev = priv->ndev;
> >
> > + nci_unregister_device(ndev);
> > if (priv->ndev->nfc_dev->fw_download_in_progress)
> > nfcmrvl_fw_dnld_abort(priv);
> >
> > @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> > if (gpio_is_valid(priv->config.reset_n_io))
> > gpio_free(priv->config.reset_n_io);
> >
> > - nci_unregister_device(ndev);
> > nci_free_device(ndev);
> > kfree(priv);
> > }


Best regards,
Duoming Zhou