Re: [PATCH] nfc: virtual_ncidev: Add variable to check if ndev is running

From: Phi Nguyen
Date: Mon Nov 20 2023 - 13:24:00 EST


On 11/20/2023 6:45 PM, Krzysztof Kozlowski wrote:
On 20/11/2023 11:39, Nguyen Dinh Phi wrote:
mutex_lock(&vdev->mtx);
kfree_skb(vdev->send_buff);
vdev->send_buff = NULL;
+ vdev->running = false;
mutex_unlock(&vdev->mtx);
return 0;
@@ -50,7 +55,7 @@ static int virtual_nci_send(struct nci_dev *ndev, struct sk_buff *skb)
struct virtual_nci_dev *vdev = nci_get_drvdata(ndev);
mutex_lock(&vdev->mtx);
- if (vdev->send_buff) {
+ if (vdev->send_buff || !vdev->running) {

Dear Krzysztof,

I agree this defensive code.
But i think NFC submodule has to avoid this situation.(calling send function of closed nci_dev)
Could you check this?

This code looks not effective. At this point vdev->send_buff is always
false, so the additional check would not bring any value.

I don't see this fixing anything. Syzbot also does not seem to agree.

Nguyen, please test your patches against syzbot *before* sending them.
If you claim this fixes the report, please provide me the link to syzbot
test results confirming it is fixed.

I looked at syzbot dashboard and do not see this issue fixed with this
patch.

Best regards,
Krzysztof


Hi Krzysztof,

I've submitted it to syzbot, it is the test request that created at
[2023/11/20 09:39] in dashboard link
https://syzkaller.appspot.com/bug?extid=6eb09d75211863f15e3e

...and I see there two errors.

These are because I sent email wrongly and syzbot truncates the patch and can not compile

I don't know, maybe I miss something obvious (our brains like to do it
sometimes), but please explain me how this could fix anything?

Best regards,
Krzysztof


The issue arises when an skb is added to the send_buff after invoking ndev->ops->close() but before unregistering the device. In such cases, the virtual device will generate a copy of skb, but with no consumer thereafter. Consequently, this object persists indefinitely.

This problem seems to stem from the existence of time gaps between ops->close() and the destruction of the workqueue. During this interval, incoming requests continue to trigger the send function.

best regards,
Phi