Re: [PATCH] nfc: virtual_ncidev: Add variable to check if ndev is running

From: Krzysztof Kozlowski
Date: Mon Nov 20 2023 - 13:29:54 EST


On 20/11/2023 19:23, Phi Nguyen wrote:
> On 11/20/2023 6:45 PM, Krzysztof Kozlowski wrote:
>> On 20/11/2023 11:39, Nguyen Dinh Phi wrote:
>>>>>> mutex_lock(&vdev->mtx);
>>>>>> kfree_skb(vdev->send_buff);
>>>>>> vdev->send_buff = NULL;
>>>>>> + vdev->running = false;
>>>>>> mutex_unlock(&vdev->mtx);
>>>>>>
>>>>>> return 0;
>>>>>> @@ -50,7 +55,7 @@ static int virtual_nci_send(struct nci_dev *ndev, struct sk_buff *skb)
>>>>>> struct virtual_nci_dev *vdev = nci_get_drvdata(ndev);
>>>>>>
>>>>>> mutex_lock(&vdev->mtx);
>>>>>> - if (vdev->send_buff) {
>>>>>> + if (vdev->send_buff || !vdev->running) {
>>>>>
>>>>> Dear Krzysztof,
>>>>>
>>>>> I agree this defensive code.
>>>>> But i think NFC submodule has to avoid this situation.(calling send function of closed nci_dev)
>>>>> Could you check this?
>>>>
>>>> This code looks not effective. At this point vdev->send_buff is always
>>>> false, so the additional check would not bring any value.
>>>>
>>>> I don't see this fixing anything. Syzbot also does not seem to agree.
>>>>
>>>> Nguyen, please test your patches against syzbot *before* sending them.
>>>> If you claim this fixes the report, please provide me the link to syzbot
>>>> test results confirming it is fixed.
>>>>
>>>> I looked at syzbot dashboard and do not see this issue fixed with this
>>>> patch.
>>>>
>>>> Best regards,
>>>> Krzysztof
>>>>
>>>
>>> Hi Krzysztof,
>>>
>>> I've submitted it to syzbot, it is the test request that created at
>>> [2023/11/20 09:39] in dashboard link
>>> https://syzkaller.appspot.com/bug?extid=6eb09d75211863f15e3e
>>
>> ...and I see there two errors.
>>
> These are because I sent email wrongly and syzbot truncates the patch
> and can not compile
>
>> I don't know, maybe I miss something obvious (our brains like to do it
>> sometimes), but please explain me how this could fix anything?
>>
>> Best regards,
>> Krzysztof
>>
>
> The issue arises when an skb is added to the send_buff after invoking
> ndev->ops->close() but before unregistering the device. In such cases,
> the virtual device will generate a copy of skb, but with no consumer
> thereafter. Consequently, this object persists indefinitely.
>
> This problem seems to stem from the existence of time gaps between
> ops->close() and the destruction of the workqueue. During this interval,
> incoming requests continue to trigger the send function.

I asked how this could fix anything. Can you respond to my original comment?

Look:

>>>> This code looks not effective. At this point vdev->send_buff is always
>>>> false, so the additional check would not bring any value.

Best regards,
Krzysztof