Re: [RFC] vhost: Clear the pending messages on vhost_init_device_iotlb()

From: Eric Auger
Date: Tue Nov 08 2022 - 05:18:59 EST


Hi Michael, Jason,

On 11/8/22 10:31, Michael S. Tsirkin wrote:
> On Tue, Nov 08, 2022 at 05:13:50PM +0800, Jason Wang wrote:
>> On Tue, Nov 8, 2022 at 4:56 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>>> On Tue, Nov 08, 2022 at 11:09:36AM +0800, Jason Wang wrote:
>>>> On Tue, Nov 8, 2022 at 7:06 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>>>>> On Mon, Nov 07, 2022 at 10:10:06PM +0100, Eric Auger wrote:
>>>>>> Hi Michael,
>>>>>> On 11/7/22 21:42, Michael S. Tsirkin wrote:
>>>>>>> On Mon, Nov 07, 2022 at 09:34:31PM +0100, Eric Auger wrote:
>>>>>>>> When the vhost iotlb is used along with a guest virtual iommu
>>>>>>>> and the guest gets rebooted, some MISS messages may have been
>>>>>>>> recorded just before the reboot and spuriously executed by
>>>>>>>> the virtual iommu after the reboot. Despite the device iotlb gets
>>>>>>>> re-initialized, the messages are not cleared. Fix that by calling
>>>>>>>> vhost_clear_msg() at the end of vhost_init_device_iotlb().
>>>>>>>>
>>>>>>>> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> drivers/vhost/vhost.c | 1 +
>>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>>>>>>> index 40097826cff0..422a1fdee0ca 100644
>>>>>>>> --- a/drivers/vhost/vhost.c
>>>>>>>> +++ b/drivers/vhost/vhost.c
>>>>>>>> @@ -1751,6 +1751,7 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled)
>>>>>>>> }
>>>>>>>>
>>>>>>>> vhost_iotlb_free(oiotlb);
>>>>>>>> + vhost_clear_msg(d);
>>>>>>>>
>>>>>>>> return 0;
>>>>>>>> }
>>>>>>> Hmm. Can't messages meanwhile get processes and affect the
>>>>>>> new iotlb?
>>>>>> Isn't the msg processing stopped at the moment this function is called
>>>>>> (VHOST_SET_FEATURES)?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>> It's pretty late here I'm not sure. You tell me what prevents it.
>>>> So the proposed code assumes that Qemu doesn't process device IOTLB
>>>> before VHOST_SET_FEAETURES. Consider there's no reset in the general
>>>> vhost uAPI, I wonder if it's better to move the clear to device code
>>>> like VHOST_NET_SET_BACKEND. So we can clear it per vq?
>>> Hmm this makes no sense to me. iommu sits between backend
>>> and frontend. Tying one to another is going to backfire.
>> I think we need to emulate what real devices are doing. Device should
>> clear the page fault message during reset, so the driver won't read
>> anything after reset. But we don't have a per device stop or reset
>> message for vhost-net. That's why the VHOST_NET_SET_BACKEND came into
>> my mind.
> That's not a reset message. Userspace can switch backends at will.
> I guess we could check when backend is set to -1.
> It's a hack but might work.
>
>>> I'm thinking more along the lines of doing everything
>>> under iotlb_lock.
>> I think the problem is we need to find a proper place to clear the
>> message. So I don't get how iotlb_lock can help: the message could be
>> still read from user space after the backend is set to NULL.
>>
>> Thanks
> Well I think the real problem is this.
>
> vhost_net_set_features does:
>
> if ((features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) {
> if (vhost_init_device_iotlb(&n->dev, true))
> goto out_unlock;
> }
>
>
> so we get a new iotlb each time features are set.
>
> But features can be changes while device is running.
> E.g.
> VHOST_F_LOG_ALL
>
>
> Let's just say this hack of reusing feature bits for backend
> was not my brightest idea :(
>

Isn't vhost_init_device_iotlb() racy then, as d->iotlb is first updated with niotlb and later d->vqs[i]->iotlb is updated with niotlb. What does garantee this is done atomically?

Shouldn't we hold the dev->mutex to make all the sequence atomic and
include vhost_clear_msg()?  Can't the vhost_clear_msg() take the dev lock?

Thanks

Eric

>
>
>
>>>
>>>
>>>>> BTW vhost_init_device_iotlb gets enabled parameter but ignores
>>>>> it, we really should drop that.
>>>> Yes.
>>>>
>>>>> Also, it looks like if features are set with VIRTIO_F_ACCESS_PLATFORM
>>>>> and then cleared, iotlb is not properly cleared - bug?
>>>> Not sure, old IOTLB may still work. But for safety, we need to disable
>>>> device IOTLB in this case.
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>>>>
>>>>>>>> --
>>>>>>>> 2.37.3